Table 1 Big datasets used for empirical evaluation

From: Selecting a representative decision tree from an ensemble of decision-tree models for fast big data classification

| ID | UCI dataset name | Samples | Attributes | Classes |
|-----|------------------|---------|------------|---------|
| DS1 | Poker Hand: consisting of five playing cards | 1,025,010 | 11 | 9 |
| DS2 | SUSY: Monte Carlo simulations of kinematic properties measured by the particle detectors | 5,000,000 | 18 | 2 |
| DS3 | Record Linkage Comparison Patterns: decide from a comparison pattern whether the underlying records belong to one person | 5,749,132 | 9 | 2 |
| DS4 | KDD Cup 1999: build a network intrusion detector | 4,898,431 | 42 | 23 |
| DS5 | Individual household electric power consumption | 2,075,259 | 9 | Continuous |
| DS6 | HIGGS: distinguish between a signal process, which produces Higgs bosons, and a background process | 11,000,000 | 28 | 2 |