
Table 2 Data sets used in experiments

From: Improved cost-sensitive representation of data for solving the imbalanced big data classification problem

 

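| No. | Data set | Samples | Features | Classes | Imbalance ratio |
| --- | --- | --- | --- | --- | --- |
| 1 | Cancer | 699 | 9 | 2 | 1.9 |
| 2 | Wine | 178 | 13 | 3 | 1.4 |
| 3 | Ionosphere | 351 | 34 | 2 | 1.7 |
| 4 | Pima Indian diabetes | 768 | 8 | 2 | 1.8 |
| 5 | Iris | 150 | 4 | 3 | 1 |
| 6 | Wdbc | 569 | 30 | 2 | 1.6 |
| 7 | Cleveland | 303 | 13 | 5 | 12.6 |
| 8 | Musk | 476 | 166 | 2 | 1.2 |
| 9 | Dermatology-6 | 366 | 34 | 6 | 5.6 |
| 10 | FuelCons | 1764 | 37 | 4 | 13.08 |
| 11 | Movement_libras | 270 | 90 | 15 | 1 |
| 12 | Sonar | 208 | 60 | 2 | 1.14 |
| 13 | SPECTF | 267 | 44 | 2 | 3.85 |
| 14 | Colon tumor | 62 | 166 | 2 | 1.81 |
| 15 | DLBCL77 | 77 | 5469 | 2 | 3.05 |
| 16 | Mnist | 10,000 | 784 | 10 | 5.99 |
| 17 | Caltech101 | 8671 | 784 | 101 | 25.74 |
| 18 | Kddcup-rootkit-imap-vs-back | 2225 | 41 | 2 | 100.13 |
| 19 | Kddcup-buffer-overflow-vs-back | 2233 | 41 | 2 | 73.44 |
| 20 | Kddcup-guess-passwd-vs-satan | 1642 | 41 | 2 | 29.98 |
| 21 | Kddcup-land-vs-satan | 1610 | 41 | 2 | 75.66 |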
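
The table does not state how the imbalance ratio is computed. A common convention, assumed here and not confirmed by the source, is the size of the largest class divided by the size of the smallest class, so a perfectly balanced data set has a ratio of 1. The sketch below illustrates that convention; the function name `imbalance_ratio` and the toy labels are hypothetical and are not taken from any data set in the table.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return majority-class count divided by minority-class count.

    Assumed definition: the table's source does not spell out its formula.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to compute a ratio")
    return max(counts.values()) / min(counts.values())


# Hypothetical toy labels, not drawn from any data set in the table.
toy_labels = ["neg"] * 90 + ["pos"] * 10
print(imbalance_ratio(toy_labels))  # 9.0
```

Under this convention, values near 1 indicate roughly equal class sizes, while large values indicate that the smallest class is heavily outnumbered.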