From: Understanding big data themes from scientific biomedical literature through topic modeling
| | Theme name | Theme description | Definition sources |
|---|---|---|---|
| I | Volume, size, voluminous, cardinality | Large quantities of data in number of bytes; size of available data (e.g. all records instead of a sample); beyond conventional storage techniques; number of records at a particular instance | |
| | Velocity, continuity | Flow rate at which data is created, stored, analysed, and visualised; increased through invention of new data streams such as social media; beyond conventional means of processing, needing new techniques such as streaming; growth of data over time | |
| | Variety, complexity | Many different types of data; not bound to a traditional data format; format changes over time; heterogeneous and unstructured data | |
| | Veracity | Trustworthiness of data; reliability of data quality and gathering environment | |
| | Value | Worth/relevancy of data (e.g. economic, individual/privacy, societal, humanity value) | |
| | Variability | Consistency of data over time; influences which systematically change data measures over time | |
| II | Information | Where signals are turned into data (e.g. book digitalisation, or gathering from personal device measurements) | [14] |
| | Technology | Tools, systems, and software (e.g. scalable processing and transmission systems such as Hadoop) | |
| | Methods | Procedures and their application (e.g. clustering, natural language processing, machine learning, neural networks, visualisation) | |
| | Impact | Ethical, business, societal | [14] |
| III | Beyond conventional | Data whose size calls for methods beyond the tried-and-true; necessity of scalable systems for storage, processing, manipulation, analysis, visualisation | |
| IV | Application | About the application domain treated in the papers | – |