Table 1 Lightweight theory-driven guidance for the BDA process

From: Theory-driven or process-driven prediction? Epistemological challenges of big data analytics

BDA step: Acquisition
Critical questions:
  What data do I need?
  What kinds of data [sets] are available/to be selected?
Epistemological challenge: Data ‘sampling’
Possible lightweight theory-driven guidance:
  Apply data summarization, graphical representation, dimension reduction (e.g. PCA), and outlier detection
  Ensure multi-expert and multidisciplinary participation in data reduction and selection
  Trace and examine all stages of extract, transform, load, and merge for completeness, correctness, and consistency

BDA step: Pre-processing
Critical questions:
  How can data [sets] be represented and processed without falsification or insight loss?
Epistemological challenge: Data validity and reliability

BDA step: Analytics
Critical questions:
  Which method[s] to use?
  What rules govern conclusions from these data [sets]?
Epistemological challenge: Knowledge discovery
Possible lightweight theory-driven guidance:
  Map the constructs of analytics to known theoretical concepts
  Ensure multi-expert and multidisciplinary participation in parameter selection and in mapping analytical constructs to theoretical concepts
  Develop/apply a theoretical framework for the choice of techniques (mining, machine learning, statistics) or models

BDA step: Interpretation
Critical questions:
  How to interpret such conclusions?
Epistemological challenge: Non-/interpretability; reliability of prediction
Possible lightweight theory-driven guidance:
  Develop/apply a theoretical framework for result interpretation
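
Some of the acquisition-stage guidance in the table (data summarization, outlier detection, and tracing a merge for completeness and consistency) can be illustrated with a minimal Python sketch. This is not taken from the article: the function names `summarize`, `iqr_outliers`, and `audit_merge`, the sample values, and the choice of Tukey's interquartile-range rule for flagging outliers are all assumptions made for illustration. Dimension reduction (e.g. PCA) and graphical representation are left out to keep the sketch dependency-free.

```python
from collections import Counter
import statistics


def summarize(values):
    """Data summarization: basic descriptive statistics for one numeric column."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def iqr_outliers(values, k=1.5):
    """Outlier detection with Tukey's rule (an assumed choice, not the article's).

    Flags values outside [Q1 - k*IQR, Q3 + k*IQR].
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def audit_merge(left_keys, right_keys):
    """Merge tracing: report keys that would be lost or duplicated in a key-based merge."""
    left, right = set(left_keys), set(right_keys)
    counts = Counter(left_keys)
    return {
        "only_in_left": sorted(left - right),    # completeness: rows dropped by an inner join
        "only_in_right": sorted(right - left),
        "duplicate_left": sorted(k for k, c in counts.items() if c > 1),  # consistency
    }


# Hypothetical sample data, invented for this example.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 42.0]
print(summarize(readings))
print(iqr_outliers(readings))  # prints [42.0]
print(audit_merge(["a", "b", "c", "c"], ["b", "c", "d"]))
```

Checks like these only surface candidates for review. Whether a flagged value or an unmatched key is an error or a meaningful observation remains a judgment for the multi-expert participation that the table calls for.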