From: Big data quality framework: a holistic approach to continuous quality management
# | DQP Operation | Description | DQP Level | Related DQP Data |
---|---|---|---|---|
BDQP | Create | New big data quality project | 0 | Metadata, Quality Requirements, … |
BDQP | Re-use | An existing BDQP | All | |
1 | Add | Data sampling strategy | 0 | Sampling parameters |
1 | Add/update | Data profiling | 1 | Data profile (schema, statistical metrics, ratios, scores) |
2 | Add/update | EQP (Predefined quality scenarios actions) | 2 | EQP parameters; QR Proposals List |
2 | Add | Qualitative QE (PCA, Feature Selection, etc.) | 2 | QLQE parameters (Attributes Sets) |
2 | Update | QLQE attributes sets combination | 2 | (Combined Set) |
3 | Add/update | Mapping attributes and DQD’s evaluation settings parameters | 3 | (DQES) |
4 | Update | Samples quantitative QE of DQD | 4 | QTQE results (DQES + Scores) |
4 | Re-use/update | DQES reused for QTQE of pre-processed samples (S’) | 7 | (S’ DQES + Scores) |
5 | Control | S DQD Scores vs Requirements; S’ DQD Scores vs Requirements | 5, 7 | (Valid and Invalid Scores) |
6 | Add | Quality rules discovery from S DQES + Scores | 6 | (Quality Rules List) |
7 | Apply | Quality rules application by pre-processing Samples | 7 | Pre-processed Samples set S’ |
7 | Validate | Analyze and check valid rules | 7 | (Valid and Invalid Quality Rules) |
8 | Optimize | Valid quality rules optimization | 8 | (QR optimized) |
9 | Apply | Big data pre-processing using optimized quality rules list | NA | New pre-processed Dataset DS’ |
10 | Re-use/control/update | QTQE using DQES for DS’ Samples, Score control | 10 | Quality report |
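
Steps 1, 4 and 5 in the table (sampling, quantitative evaluation of a DQD on the sample, and control of the scores against the requirements) can be illustrated with a minimal Python sketch. It is not the framework's implementation. The function names, the use of completeness as the DQD, simple random sampling, and the 0.7 thresholds are all assumptions made for this example.

```python
import random


def sample_rows(rows, n, seed=0):
    """Step 1 (assumed strategy): draw a simple random sample S of n rows."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


def completeness(rows, attribute):
    """Step 4 (assumed DQD): share of rows whose attribute is not missing."""
    if not rows:
        return 0.0
    present = sum(1 for row in rows if row.get(attribute) is not None)
    return present / len(rows)


def control_scores(scores, requirements):
    """Step 5: split scores into valid and invalid against required minimums."""
    valid, invalid = {}, {}
    for attribute, score in scores.items():
        # An attribute with no stated requirement is treated as always valid.
        target = valid if score >= requirements.get(attribute, 0.0) else invalid
        target[attribute] = score
    return valid, invalid


# Hypothetical toy dataset; None marks a missing value.
rows = [
    {"age": 34, "city": "Paris"},
    {"age": None, "city": "Lyon"},
    {"age": 51, "city": None},
    {"age": 28, "city": None},
]

sample = sample_rows(rows, n=4)
scores = {a: completeness(sample, a) for a in ("age", "city")}
valid, invalid = control_scores(scores, {"age": 0.7, "city": 0.7})
print("valid:", valid)
print("invalid:", invalid)
```

In this toy run the sample covers all four rows, so `age` scores 0.75 and passes while `city` scores 0.5 and fails. In the table's terms, the invalid scores are what the later quality-rule steps (6 to 8) would act on.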