Table 3 Empirical results

From: Examining characteristics of predictive models with imbalanced big data

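Columns 3 to 5 are the Case study 1 (ECBDL’14) features subsets; columns 6 to 10 are the Case study 2 (POST) features subsets.

| Learner | Ratio | ECBDL’14: 60 | ECBDL’14: 120 | ECBDL’14: ALL | POST: 5 | POST: 10 | POST: 20 | POST: 36 | POST: ALL |
|---|---|---|---|---|---|---|---|---|---|
| (a) GBT | (40:60) | 0.4884 | 0.4891 | 0.4947 | 0.8188 | 0.8636 | 0.8503 | 0.8540 | 0.7527 |
| (a) GBT | (45:55) | 0.4873 | 0.4883 | 0.4905 | 0.8109 | 0.8570 | 0.7795 | 0.8468 | 0.7541 |
| (a) GBT | (50:50) | 0.4671 | 0.4737 | 0.4777 | 0.8423 | 0.8397 | 0.6933 | 0.8233 | 0.7452 |
| (a) GBT | (65:35) | 0.3423 | 0.3460 | 0.3470 | 0.8240 | 0.8706 | 0.5977 | 0.5935 | 0.5568 |
| (a) GBT | (75:25) | 0.2175 | 0.2285 | 0.2243 | 0.8180 | 0.8651 | 0.3685 | 0.2949 | 0.5133 |
| (a) GBT | (90:10) | 0.0384 | 0.0382 | 0.0353 | 0.8183 | 0.4815 | 0.4046 | 0.3273 | 0.3415 |
| (b) RF | (40:60) | 0.4529 | 0.4460 | 0.4218 | 0.0334 | 0.8789 | 0.8678 | 0.8793 | 0.8393 |
| (b) RF | (45:55) | 0.4784 | 0.4788 | 0.4788 | 0 | 0.8789 | 0.8563 | 0.7515 | 0.8226 |
| (b) RF | (50:50) | 0.4496 | 0.4470 | 0.4412 | 0 | 0.8820 | 0.0140 | 0.0830 | 0.0382 |
| (b) RF | (65:35) | 0.2051 | 0.1799 | 0.1364 | 0.0798 | 0 | 0.0762 | 0.1237 | 0.1940 |
| (b) RF | (75:25) | 0.0750 | 0.0469 | 0.0190 | 0.8547 | 0.8539 | 0.8532 | 0.8497 | 0.8596 |
| (b) RF | (90:10) | 0 | 0 | 0 | 0.8966 | 0.9061 | 0.8196 | 0.7375 | 0 |
| (c) LR | (40:60) | 0.4452 | 0.4503 | 0.4568 | 0.8770 | 0.6210 | 0.8302 | 0.4697 | 0.3113 |
| (c) LR | (45:55) | 0.4663 | 0.4701 | 0.4754 | 0.8773 | 0.6920 | 0.5852 | 0.4620 | 0.5935 |
| (c) LR | (50:50) | 0.4680 | 0.4731 | 0.4758 | 0.8779 | 0.4988 | 0.4487 | 0.5528 | 0.4920 |
| (c) LR | (65:35) | 0.3562 | 0.3712 | 0.3740 | 0.8787 | 0.6106 | 0.5018 | 0.5030 | 0.5667 |
| (c) LR | (75:25) | 0.2131 | 0.2328 | 0.2381 | 0.8789 | 0.4026 | 0.3307 | 0.3217 | 0.4750 |
| (c) LR | (90:10) | 0.0153 | 0.0196 | 0.0230 | 0.7514 | 0.1099 | 0.1532 | 0.2036 | 0.1062 |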

  1. The highest value within each column (features-set) of each sub-table is in italics and the highest value within each row (class distribution ratio) of each sub-table is underlined
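
The highlighting rule in the note can be recomputed from the numbers themselves. The following is a minimal sketch, not the authors' procedure: it uses the Case study 1 (ECBDL’14) values of sub-table (a) GBT, and it assumes that the row-wise comparison is made within a single case study's features subsets, which the note does not state explicitly. The function names `column_maxima` and `row_maxima` are hypothetical helpers introduced only for illustration.

```python
# Values copied from sub-table (a) GBT, Case study 1 (ECBDL'14) of Table 3.
ratios = ["40:60", "45:55", "50:50", "65:35", "75:25", "90:10"]
subsets = ["60", "120", "ALL"]
gbt_ecbdl = [
    [0.4884, 0.4891, 0.4947],
    [0.4873, 0.4883, 0.4905],
    [0.4671, 0.4737, 0.4777],
    [0.3423, 0.3460, 0.3470],
    [0.2175, 0.2285, 0.2243],
    [0.0384, 0.0382, 0.0353],
]


def column_maxima(rows, row_labels, col_labels):
    """Map each column label to (row label, value) of its highest entry.

    Hypothetical helper; on ties the first (topmost) row wins.
    """
    result = {}
    for j, col in enumerate(col_labels):
        best = max(range(len(rows)), key=lambda i: rows[i][j])
        result[col] = (row_labels[best], rows[best][j])
    return result


def row_maxima(rows, row_labels, col_labels):
    """Map each row label to (column label, value) of its highest entry.

    Hypothetical helper; on ties the first (leftmost) column wins.
    """
    result = {}
    for i, row in enumerate(rows):
        best = max(range(len(row)), key=lambda j: row[j])
        result[row_labels[i]] = (col_labels[best], row[best])
    return result


if __name__ == "__main__":
    for col, (ratio, value) in column_maxima(gbt_ecbdl, ratios, subsets).items():
        print(f"column {col}: highest at ({ratio}) = {value:.4f}")
    for ratio, (col, value) in row_maxima(gbt_ecbdl, ratios, subsets).items():
        print(f"row ({ratio}): highest in subset {col} = {value:.4f}")
```

Under these assumptions, every ECBDL’14 column of the GBT sub-table peaks at the (40:60) ratio, and the ALL subset is the row-wise maximum for every ratio except (75:25), where 120 is highest, and (90:10), where 60 is highest. The same two helpers can be applied to any other sub-table or case study by swapping in the corresponding values.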