Table 4 List of highly cited papers

From: A bibliometric approach to tracking big data research trends

| Title | Authors | Year | NR | TC (Rank) | Refs. |
| --- | --- | --- | --- | --- | --- |
| Trends in big data analytics | Kambatla et al. | 2014 | 75 | 6 (27) | [50] |
| Big data: a survey | Chen et al. | 2014 | 155 | 7 (26) | [6] |
| A comparison of parallel large-scale knowledge acquisition using rough set theory on different MapReduce runtime systems | Zhang et al. | 2014 | 46 | 9 (24) | [51] |
| A scalable two-phase top-down specialization approach for data anonymization using MapReduce on cloud | Zhang et al. | 2014 | 31 | 6 (27) | [52] |
| Data mining with big data | Wu et al. | 2014 | 56 | 12 (23) | [1] |
| Comparative experiments using supervised learning and machine translation for multilingual sentiment analysis | Balahur and Turchi | 2014 | 39 | 9 (24) | [53] |
| Techniques and applications for sentiment analysis | Feldman | 2013 | 39 | 19 (20) | [54] |
| New avenues in opinion mining and sentiment analysis | Cambria et al. | 2013 | 33 | 41 (18) | [55] |
| Review of performance metrics for green data centers: a taxonomy study | Wang and Khan | 2013 | 43 | 18 (21) | [56] |
| G-Hadoop: MapReduce across distributed data centers for data-intensive computing | Wang et al. | 2013 | 39 | 27 (19) | [57] |
| Data center network virtualization: a survey | Bari et al. | 2013 | 67 | 17 (22) | [58] |
| Business intelligence and analytics: from big data to big impact | Chen et al. | 2012 | 68 | 53 (15) | [59] |
| Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing | Beloglazov et al. | 2012 | 39 | 88 (12) | [60] |
| A survey on optical interconnects for data centers | Kachris and Tomkos | 2012 | 64 | 49 (16) | [61] |
| Scikit-learn: machine learning in Python | Pedregosa et al. | 2011 | 16 | 299 (2) | [62] |
| Lexicon-based methods for sentiment analysis | Taboada et al. | 2011 | 120 | 64 (14) | [63] |
| MapReduce: a flexible data processing tool | Dean and Ghemawat | 2010 | 14 | 110 (11) | [64] |
| Faster and better: a machine learning approach to corner detection | Rosten et al. | 2010 | 102 | 156 (7) | [65] |
| VL2: a scalable and flexible data center network | Greenberg et al. | 2009 | 23 | 121 (10) | [66] |
| A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability | Garcia et al. | 2009 | 46 | 160 (5) | [67] |
| Improving the performance of predictive process modeling for large datasets | Finley et al. | 2009 | 17 | 47 (17) | [68] |
| CloudBurst: highly sensitive read mapping with MapReduce | Schatz | 2009 | 20 | 146 (9) | [69] |
| A scalable, commodity data center network architecture | Al-Fares et al. | 2008 | 33 | 148 (8) | [70] |
| MapReduce: simplified data processing on large clusters | Dean and Ghemawat | 2008 | 15 | 1249 (1) | [71] |
| Analysis of interpretability-accuracy tradeoff of fuzzy systems by multiobjective fuzzy genetics-based machine learning | Ishibuchi and Nojima | 2007 | 33 | 158 (6) | [72] |
| A machine learning information retrieval approach to protein fold recognition | Cheng and Baldi | 2006 | 83 | 86 (13) | [73] |
| Machine learning for high-speed corner detection | Rosten and Drummond | 2006 | 35 | 251 (3) | [74] |
| Predicting subcellular localization of proteins using machine-learned classifiers | Lu et al. | 2004 | 21 | 193 (4) | [75] |
NR: cited reference count; TC: Web of Science Core Collection times cited count; Refs.: references.