Table 4 Case 1: Medicare results

From: Severely imbalanced Big Data challenges: investigating data sampling approaches

(a) AUC

| Learner | Method | (All:all) | (99:1) | (90:10) | (75:25) | (65:35) | (50:50) |
|---|---|---|---|---|---|---|---|
| GBT | None | 0.79047 | – | – | – | – | – |
| | RUS | – | 0.80373 | 0.81675 | 0.80405 | 0.79127 | 0.77587 |
| | ROS | – | 0.74328 | 0.62805 | 0.72565 | 0.76417 | 0.80703 |
| | ADASYN | – | 0.71368 | 0.69611 | 0.69586 | 0.69675 | 0.69351 |
| | SMOTE | – | 0.73903 | 0.72194 | 0.72634 | 0.72986 | 0.73439 |
| | SMOTEb1 | – | 0.68831 | 0.67235 | 0.65831 | 0.65448 | 0.66498 |
| | SMOTEb2 | – | 0.68917 | 0.67780 | 0.66209 | 0.66312 | 0.66730 |
| LR | None | 0.81554 | – | – | – | – | – |
| | RUS | – | 0.82011 | 0.81868 | 0.81553 | 0.80998 | 0.79415 |
| | ROS | – | 0.66210 | 0.68306 | 0.75298 | 0.79036 | 0.81547 |
| | ADASYN | – | 0.81205 | 0.81622 | 0.81758 | 0.81384 | 0.81578 |
| | SMOTE | – | 0.81306 | 0.82211 | 0.82685 | 0.82781 | 0.82413 |
| | SMOTEb1 | – | 0.74471 | 0.73845 | 0.73526 | 0.74014 | 0.73484 |
| | SMOTEb2 | – | 0.72167 | 0.71599 | 0.72523 | 0.72752 | 0.72426 |
| RF | None | 0.79383 | – | – | – | – | – |
| | RUS | – | 0.81515 | 0.82793 | 0.81503 | 0.80619 | 0.79546 |
| | ROS | – | 0.77538 | 0.75640 | 0.75728 | 0.76989 | 0.79315 |
| | ADASYN | – | 0.74537 | 0.73496 | 0.72920 | 0.73266 | 0.73577 |
| | SMOTE | – | 0.77417 | 0.76921 | 0.77629 | 0.77443 | 0.76790 |
| | SMOTEb1 | – | 0.76460 | 0.74777 | 0.75695 | 0.75844 | 0.75883 |
| | SMOTEb2 | – | 0.76440 | 0.75071 | 0.75155 | 0.75282 | 0.74967 |

(b) GM

| Learner | Method | (All:all) | (99:1) | (90:10) | (75:25) | (65:35) | (50:50) |
|---|---|---|---|---|---|---|---|
| GBT | None | 0.00907 | – | – | – | – | – |
| | RUS | – | 0.08674 | 0.37061 | 0.60384 | 0.67830 | 0.70412 |
| | ROS | – | 0.01234 | 0.14263 | 0.34824 | 0.50723 | 0.69501 |
| | ADASYN | – | 0.00205 | 0.00413 | 0.05390 | 0.12527 | 0.30430 |
| | SMOTE | – | 0.01027 | 0.06270 | 0.22959 | 0.33785 | 0.47255 |
| | SMOTEb1 | – | 0.03254 | 0.20534 | 0.28603 | 0.33670 | 0.40159 |
| | SMOTEb2 | – | 0.04371 | 0.18432 | 0.26794 | 0.32180 | 0.38951 |
| LR | None | 0 | – | – | – | – | – |
| | RUS | – | 0.13376 | 0.45411 | 0.66222 | 0.72088 | 0.73044 |
| | ROS | – | 0.05917 | 0.36425 | 0.58388 | 0.67673 | 0.75224 |
| | ADASYN | – | 0.06607 | 0.35955 | 0.59097 | 0.69207 | 0.74657 |
| | SMOTE | – | 0.12602 | 0.45052 | 0.64526 | 0.71975 | 0.75345 |
| | SMOTEb1 | – | 0.13877 | 0.37785 | 0.50841 | 0.55796 | 0.59091 |
| | SMOTEb2 | – | 0.10953 | 0.35552 | 0.50170 | 0.54910 | 0.58911 |
| RF | None | 0.00823 | – | – | – | – | – |
| | RUS | – | 0.09315 | 0.26700 | 0.56838 | 0.67842 | 0.72590 |
| | ROS | – | 0.00909 | 0.01027 | 0.03608 | 0.08623 | 0.29951 |
| | ADASYN | – | 0.03665 | 0.10203 | 0.16093 | 0.20660 | 0.24092 |
| | SMOTE | – | 0.04778 | 0.16448 | 0.23132 | 0.26808 | 0.31371 |
| | SMOTEb1 | – | 0.04571 | 0.08312 | 0.12967 | 0.14453 | 0.18056 |
| | SMOTEb2 | – | 0.03546 | 0.07203 | 0.10473 | 0.10693 | 0.14532 |

  1. The highest value within each column (class distribution ratio) of each sub-table is in italic type, and the highest value within each row (sampling method) of each sub-table is underlined
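
The note above describes two highlighting rules: a per-column maximum (best learner and method at each class ratio) and a per-row maximum (best class ratio for each learner and method). As a minimal sketch of how those two maxima could be recomputed from the numbers, the Python below hard-codes the sampled rows of sub-table (a) AUC. The names `RATIOS`, `AUC`, `row_maxima` and `column_maxima` are illustrative assumptions, not part of the original study, and the `None` baseline rows are left out because they only have a value in the (All:all) column.

```python
# Class-ratio columns that contain values for the sampled rows.
RATIOS = ["99:1", "90:10", "75:25", "65:35", "50:50"]

# AUC values copied from sub-table (a); keys are (learner, method).
AUC = {
    ("GBT", "RUS"):     [0.80373, 0.81675, 0.80405, 0.79127, 0.77587],
    ("GBT", "ROS"):     [0.74328, 0.62805, 0.72565, 0.76417, 0.80703],
    ("GBT", "ADASYN"):  [0.71368, 0.69611, 0.69586, 0.69675, 0.69351],
    ("GBT", "SMOTE"):   [0.73903, 0.72194, 0.72634, 0.72986, 0.73439],
    ("GBT", "SMOTEb1"): [0.68831, 0.67235, 0.65831, 0.65448, 0.66498],
    ("GBT", "SMOTEb2"): [0.68917, 0.67780, 0.66209, 0.66312, 0.66730],
    ("LR", "RUS"):      [0.82011, 0.81868, 0.81553, 0.80998, 0.79415],
    ("LR", "ROS"):      [0.66210, 0.68306, 0.75298, 0.79036, 0.81547],
    ("LR", "ADASYN"):   [0.81205, 0.81622, 0.81758, 0.81384, 0.81578],
    ("LR", "SMOTE"):    [0.81306, 0.82211, 0.82685, 0.82781, 0.82413],
    ("LR", "SMOTEb1"):  [0.74471, 0.73845, 0.73526, 0.74014, 0.73484],
    ("LR", "SMOTEb2"):  [0.72167, 0.71599, 0.72523, 0.72752, 0.72426],
    ("RF", "RUS"):      [0.81515, 0.82793, 0.81503, 0.80619, 0.79546],
    ("RF", "ROS"):      [0.77538, 0.75640, 0.75728, 0.76989, 0.79315],
    ("RF", "ADASYN"):   [0.74537, 0.73496, 0.72920, 0.73266, 0.73577],
    ("RF", "SMOTE"):    [0.77417, 0.76921, 0.77629, 0.77443, 0.76790],
    ("RF", "SMOTEb1"):  [0.76460, 0.74777, 0.75695, 0.75844, 0.75883],
    ("RF", "SMOTEb2"):  [0.76440, 0.75071, 0.75155, 0.75282, 0.74967],
}


def row_maxima(table):
    """Map each (learner, method) row to the class ratio with its highest value."""
    best = {}
    for key, values in table.items():
        top_index = max(range(len(values)), key=lambda i: values[i])
        best[key] = RATIOS[top_index]
    return best


def column_maxima(table):
    """Map each class ratio to the (learner, method) row with the highest value."""
    return {
        ratio: max(table, key=lambda key: table[key][i])
        for i, ratio in enumerate(RATIOS)
    }


if __name__ == "__main__":
    for ratio, (learner, method) in column_maxima(AUC).items():
        print(f"{ratio}: {learner} with {method}")
    for (learner, method), ratio in row_maxima(AUC).items():
        print(f"{learner} with {method}: best at {ratio}")
```

The same two functions could be applied to a dictionary built from sub-table (b) GM; ties, if any, resolve to the first entry encountered.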