Table 3 Used datasets

From: The non-linear nature of the cost of comprehensibility

| Dataset | # observations | # features | Imbalance |
|---|---|---|---|
| Adult | 48842 | 14 | 0.27 |
| Agaricus_lepiota | 8145 | 22 | 0 |
| Analcatdata_aids | 50 | 4 | 0 |
| Analcatdata_asbestos | 83 | 3 | 0.01 |
| Analcatdata_bankruptcy | 50 | 6 | 0 |
| Analcatdata_boxing1 | 120 | 3 | 0.09 |
| Analcatdata_boxing2 | 132 | 3 | 0.01 |
| Analcatdata_creditscore | 100 | 6 | 0.21 |
| Analcatdata_cyyoung8092 | 97 | 10 | 0.26 |
| Analcatdata_cyyoung9302 | 92 | 10 | 0.34 |
| Analcatdata_fraud | 42 | 11 | 0.15 |
| Analcatdata_japansolvent | 52 | 9 | 0 |
| Analcatdata_lawsuit | 264 | 4 | 0.73 |
| Appendicitis | 106 | 7 | 0.36 |
| Australian | 690 | 14 | 0.01 |
| Backache | 180 | 32 | 0.52 |
| Biomed | 209 | 8 | 0.08 |
| Breast_cancer_wisconsin | 569 | 30 | 0.06 |
| Breast_cancer | 286 | 9 | 0.16 |
| Breast_w | 699 | 9 | 0.1 |
| Breast | 699 | 10 | 0.1 |
| BuggyCrx | 690 | 15 | 0.01 |
| Bupa | 345 | 5 | 0 |
| Chess | 3196 | 36 | 0 |
| Churn | 5000 | 20 | 0.51 |
| Clean1 | 476 | 168 | 0.02 |
| Clean2 | 6598 | 168 | 0.48 |
| Cleve | 303 | 13 | 0.01 |
| Coil2000 | 9822 | 85 | 0.78 |
| Colic | 368 | 22 | 0.07 |
| Corral | 160 | 6 | 0.02 |
| Credit_a | 690 | 15 | 0.01 |
| Credit_g | 1000 | 20 | 0.16 |
| crx | 690 | 15 | 0.01 |
| Diabetes | 768 | 8 | 0.09 |
| Dis | 3772 | 29 | 0.94 |
| Flare | 1066 | 10 | 0.43 |
| GAMETES_Epistasis_2_Way_1000atts_0.4H_EDM_1_EDM_1_1 | 1600 | 1000 | 0 |
| GAMETES_Epistasis_2_Way_20atts_0.1H_EDM_1_1 | 1600 | 20 | 0 |
| GAMETES_Epistasis_2_Way_20atts_0.4H_EDM_1_1 | 1600 | 20 | 0 |
| GAMETES_Epistasis_3_Way_20atts_0.2H_EDM_1_1 | 1600 | 20 | 0 |
| GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_50_EDM_2_001 | 1600 | 20 | 0 |
| GAMETES_Heterogeneity_20atts_1600_Het_0.4_0.2_75_EDM_2_001 | 1600 | 20 | 0 |
| German | 1000 | 20 | 0.16 |
| Glass2 | 163 | 9 | 0 |
| Haberman | 306 | 3 | 0.22 |
| Heart_c | 303 | 13 | 0.01 |
| Heart_h | 294 | 13 | 0.08 |
| Heart_statlog | 270 | 13 | 0.01 |
| Hepatitis | 155 | 19 | 0.34 |
| Hill_Valley_with_noise | 1212 | 100 | 0 |
| Hill_Valley_without_noise | 1212 | 100 | 0 |
| Horse_colic | 368 | 22 | 0.07 |
| House_votes_84 | 435 | 16 | 0.05 |
| Hungarian | 294 | 13 | 0.08 |
| Hypothyroid | 3163 | 25 | 0.82 |
| Ionosphere | 351 | 34 | 0.08 |
| Irish | 500 | 5 | 0.01 |
| kr_vs_kp | 3196 | 36 | 0 |
| Labor | 57 | 16 | 0.09 |
| Lupus | 87 | 3 | 0.04 |
| Magic | 19020 | 10 | 0.09 |
| Mofn_3_7_10 | 1324 | 10 | 0.31 |
| Molecular_biology_promoters | 106 | 57 | 0 |
| Monk1 | 556 | 6 | 0 |
| Monk2 | 601 | 6 | 0.1 |
| Monk3 | 554 | 6 | 0 |
| Mushroom | 8124 | 22 | 0 |
| Mux6 | 128 | 6 | 0 |
| Parity5 | 32 | 5 | 0 |
| Parity5+5 | 1124 | 10 | 0 |
| Phoneme | 5404 | 5 | 0.17 |
| Pima | 768 | 8 | 0.09 |
| Postoperative_patient_data | 88 | 8 | 0.21 |
| Prnn_crabs | 200 | 7 | 0 |
| Prnn_synth | 250 | 2 | 0 |
| Profb | 672 | 9 | 0.11 |
| Ring | 7400 | 20 | 0 |
| Saheart | 462 | 9 | 0.09 |
| Sonar | 208 | 60 | 0 |
| Spambase | 4601 | 57 | 0.04 |
| Spect | 267 | 22 | 0.35 |
| Spectf | 349 | 44 | 0.21 |
| ThreeOf9 | 512 | 9 | 0 |
| Tic_tac_toe | 958 | 9 | 0.09 |
| Tokyo1 | 959 | 44 | 0.08 |
| Twonorm | 7400 | 20 | 0 |
| Vote | 435 | 16 | 0.05 |
| Wdbc | 569 | 30 | 0.06 |
| Xd6 | 973 | 9 | 0.11 |