From: Theory-driven or process-driven prediction? Epistemological challenges of big data analytics
BDA step | Critical questions | Epistemological challenge | Possible lightweight theory-driven guidance |
---|---|---|---|
Acquisition | What data do I need? What kinds of data [sets] are available/to be selected? | Data ‘sampling’ | Apply data summarization, graphical representation, dimension reduction (e.g. PCA), and outlier detection; ensure multi-expert and multi-disciplinary participation in data reduction and selection; trace and examine all stages of extract, transform, load, and merge for completeness, correctness, and consistency |
Pre-processing | How can data [sets] be represented and processed without falsification or insight loss? | Data validity and reliability | |
Analytics | Which method[s] to use? What rules govern conclusions from these data [sets]? | Knowledge discovery | Map the constructs of analytics to known theoretical concepts; ensure multi-expert and multi-disciplinary participation in parameter selection and in mapping analytical constructs to theoretical concepts; develop/apply a theoretical framework for the choice of techniques (mining, machine learning, statistics) or models |
Interpretation | How to interpret such conclusions? | Non-/interpretability; reliability of prediction | Develop/apply a theoretical framework for result interpretation |
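
As an illustration of the acquisition-stage guidance in the table (data summarization and outlier detection), the following is a minimal sketch using only the Python standard library. The table does not prescribe any particular method; the function names `summarize` and `mad_outliers`, the sample readings, and the choice of a modified z-score based on the median absolute deviation (with the commonly used 3.5 cut-off) are all assumptions made for this example. Dimension reduction such as PCA is left out because it would normally rely on a numerical library.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for one numeric column."""
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


def mad_outliers(values, threshold=3.5):
    """Flag values whose modified z-score exceeds `threshold`.

    Assumption: the modified z-score 0.6745 * |x - median| / MAD is used
    here as one possible outlier rule; the source names no specific rule.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # All deviations are (nearly) identical; nothing can be ranked.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical sample data, invented purely for illustration.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 42.0]
summary = summarize(readings)
outliers = mad_outliers(readings)  # → [42.0]
```

A flagged value is only a candidate for review: in line with the table's guidance, whether to keep, correct, or drop it would still be a decision for the participating domain experts.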