Table 6 Data quality profile levels

From: Big data quality framework: a holistic approach to continuous quality management

| # | DQP Operation | Description | DQP Level | Related DQP Data |
|---|---|---|---|---|
| BDQP | Create | New big data quality project | 0 | Metadata, Quality Requirements, … |
|  | Re-use | An existing BDQP | All |  |
| 1 | Add | Data sampling strategy | 0 | Sampling parameters |
| 1 | Add/update | Data profiling | 1 | Data profile (schema, statistical metric ratios scores) |
| 2 | Add/update | EQP (Predefined quality scenarios actions) | 2 | EQP parameters; QR Proposals List |
|  | Add | Qualitative QE (PCA, Feature Selection, etc.) |  | QLQE parameters (Attributes Sets) |
|  | Update | QLQE attributes sets combination |  | (Combined Set) |
| 3 | Add/update | Mapping attributes and DQD’s evaluation settings parameters | 3 | (DQES) |
| 4 | Update | Samples quantitative QE of DQD | 4 | QTQE results (DQES + Scores) |
|  | Re-use/update | DQES Reused for QTQE of Pre-processed Samples (S’) | 7 | (S’ DQES + Scores) |
| 5 | Control | S DQD Scores vs Requirements; S’ DQD Scores vs Requirements | 5; 7 | (Valid and Invalid Scores) |
| 6 | Add | Quality rules discovery from S DQES + Scores | 6 | (Quality Rules List) |
| 7 | Apply | Quality rules application by pre-processing Samples | 7 | Pre-processed Samples set S’ |
| 7 | Validate | Analyze and check valid rules | 7 | (Valid and Invalid Quality Rules) |
| 8 | Optimize | Valid quality rules optimization | 8 | (QR optimized) |
| 9 | Apply | Big data pre-processing using optimized quality rules list | NA | New pre-processed Dataset DS’ |
| 10 | Re-use/control/update | QTQE using DQES for DS’ Samples, Score control | 10 | Quality report |
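
Each row of Table 6 pairs an operation with the DQP level it touches, so the table can be read as a lookup from level to operations. The following is a minimal sketch of that reading, not the framework's implementation: the `DqpOperation` record, the `operations_by_level` helper, and the choice of rows are assumptions made for illustration, and only a subset of rows with an explicit numeric level is included.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DqpOperation:
    """One row of Table 6 (hypothetical record type, for illustration only)."""
    operation: str
    description: str
    level: int
    related_data: str


# Subset of Table 6 rows that carry an explicit numeric DQP level.
OPERATIONS = [
    DqpOperation("Create", "New big data quality project", 0,
                 "Metadata, Quality Requirements, …"),
    DqpOperation("Add", "Data sampling strategy", 0, "Sampling parameters"),
    DqpOperation("Add/update", "Data profiling", 1, "Data profile"),
    DqpOperation("Add", "Quality rules discovery from S DQES + Scores", 6,
                 "Quality Rules List"),
    DqpOperation("Optimize", "Valid quality rules optimization", 8,
                 "QR optimized"),
]


def operations_by_level(ops):
    """Group operation names by the DQP level they act on, keeping row order."""
    grouped = defaultdict(list)
    for op in ops:
        grouped[op.level].append(op.operation)
    return dict(grouped)


if __name__ == "__main__":
    for level, names in sorted(operations_by_level(OPERATIONS).items()):
        print(level, names)
```

Grouping by level makes it easy to see which operations contribute to the same stage of the profile, for example that both project creation and the sampling strategy are recorded at level 0.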