Table 1 Big datasets used for empirical evaluation

From: Selecting a representative decision tree from an ensemble of decision-tree models for fast big data classification

| ID | UCI dataset name | Samples | Attributes | Classes |
|---|---|---|---|---|
| DS1 | Poker Hand: consisting of five playing cards | 1,025,010 | 11 | 9 |
| DS2 | SUSY: Monte Carlo simulations of kinematic properties measured by the particle detectors | 5,000,000 | 18 | 2 |
| DS3 | Record Linkage Comparison Patterns: decide from a comparison pattern whether the underlying records belong to one person | 5,749,132 | 9 | 2 |
| DS4 | KDD Cup 1999: build a network intrusion detector | 4,898,431 | 42 | 23 |
| DS5 | Individual household electric power consumption | 2,075,259 | 9 | Continuous |
| DS6 | HIGGS: distinguish between a signal process, which produces Higgs bosons, and a background process | 11,000,000 | 28 | 2 |