Table 6 Classification performance evaluated on SKIN

From: Multi-sample \(\zeta \)-mixup: richer, more realistic synthetic samples from a p-series interpolant

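Dataset: ISIC 2016\(^{\dagger }\); #images (#classes): 1,279 (2)

| Method | ResNet-18 ACC\(_{\textrm{bal}}\) | ResNet-18 F1-micro | ResNet-18 F1-macro | ResNet-50 ACC\(_{\textrm{bal}}\) | ResNet-50 F1-micro | ResNet-50 F1-macro |
|---|---|---|---|---|---|---|
| ERM | 70.44% | 0.7836 | 0.6865 | 71.75% | 0.8127 | 0.7121 |
| mixup | 71.77% | 0.7968 | 0.7017 | 72.08% | 0.8179 | 0.7175 |
| \(\zeta \)-mixup (2.4) | 74.53% | 0.8417 | 0.7180 | 71.52% | 0.8654 | 0.7492 |
| \(\zeta \)-mixup (2.8) | 73.03% | 0.8654 | 0.7588 | 72.20% | 0.8602 | 0.7493 |
| \(\zeta \)-mixup (4.0) | 72.27% | 0.7968 | 0.7043 | 72.11% | 0.8391 | 0.7151 |

Dataset: ISIC 2017\(^{\dagger }\); #images (#classes): 2,750 (3)

| Method | ResNet-18 ACC\(_{\textrm{bal}}\) | ResNet-18 F1-micro | ResNet-18 F1-macro | ResNet-50 ACC\(_{\textrm{bal}}\) | ResNet-50 F1-micro | ResNet-50 F1-macro |
|---|---|---|---|---|---|---|
| ERM | 69.31% | 0.7383 | 0.6720 | 68.20% | 0.6867 | 0.6361 |
| mixup | 71.60% | 0.7333 | 0.6756 | 71.51% | 0.7433 | 0.6979 |
| \(\zeta \)-mixup (2.4) | 73.02% | 0.7483 | 0.6965 | 72.91% | 0.7783 | 0.7099 |
| \(\zeta \)-mixup (2.8) | 72.33% | 0.7633 | 0.7068 | 69.99% | 0.7733 | 0.7028 |
| \(\zeta \)-mixup (4.0) | 70.93% | 0.7567 | 0.6815 | 72.39% | 0.7517 | 0.6963 |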

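Dataset: ISIC 2018\(^{\dagger }\); #images (#classes): 10,015 (5)

| Method | ResNet-18 ACC\(_{\textrm{bal}}\) | ResNet-18 F1-micro | ResNet-18 F1-macro | ResNet-50 ACC\(_{\textrm{bal}}\) | ResNet-50 F1-micro | ResNet-50 F1-macro |
|---|---|---|---|---|---|---|
| ERM | 84.31% | 0.8756 | 0.8122 | 81.28% | 0.8653 | 0.7982 |
| mixup | 83.96% | 0.8394 | 0.7767 | 85.65% | 0.8601 | 0.8064 |
| \(\zeta \)-mixup (2.4) | 87.20% | 0.8964 | 0.8441 | 84.75% | 0.8653 | 0.8112 |
| \(\zeta \)-mixup (2.8) | 84.67% | 0.8756 | 0.8066 | 86.59% | 0.9016 | 0.8333 |
| \(\zeta \)-mixup (4.0) | 83.63% | 0.8808 | 0.8062 | 89.18% | 0.9223 | 0.8718 |

Dataset: MSK\(^{\dagger }\); #images (#classes): 3,551 (4)

| Method | ResNet-18 ACC\(_{\textrm{bal}}\) | ResNet-18 F1-micro | ResNet-18 F1-macro | ResNet-50 ACC\(_{\textrm{bal}}\) | ResNet-50 F1-micro | ResNet-50 F1-macro |
|---|---|---|---|---|---|---|
| ERM | 62.35% | 0.6986 | 0.5999 | 63.86% | 0.7873 | 0.6586 |
| mixup | 63.59% | 0.7423 | 0.6404 | 65.62% | 0.7958 | 0.6434 |
| \(\zeta \)-mixup (2.4) | 65.52% | 0.7746 | 0.6475 | 65.23% | 0.8056 | 0.6875 |
| \(\zeta \)-mixup (2.8) | 64.87% | 0.7845 | 0.6883 | 65.94% | 0.7930 | 0.6704 |
| \(\zeta \)-mixup (4.0) | 62.39% | 0.6930 | 0.6006 | 65.33% | 0.7817 | 0.6587 |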

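Dataset: UDA\(^{\dagger }\); #images (#classes): 601 (2)

| Method | ResNet-18 ACC\(_{\textrm{bal}}\) | ResNet-18 F1-micro | ResNet-18 F1-macro | ResNet-50 ACC\(_{\textrm{bal}}\) | ResNet-50 F1-micro | ResNet-50 F1-macro |
|---|---|---|---|---|---|---|
| ERM | 67.46% | 0.7000 | 0.6666 | 66.85% | 0.6917 | 0.6593 |
| mixup | 69.38% | 0.7167 | 0.6851 | 67.27% | 0.7167 | 0.6727 |
| \(\zeta \)-mixup (2.4) | 70.54% | 0.8000 | 0.7272 | 68.39% | 0.7417 | 0.6900 |
| \(\zeta \)-mixup (2.8) | 70.22% | 0.7667 | 0.7127 | 70.92% | 0.7667 | 0.7176 |
| \(\zeta \)-mixup (4.0) | 67.88% | 0.7250 | 0.6800 | 67.59% | 0.7500 | 0.6865 |

Dataset: DermoFit\(^{\ddagger }\); #images (#classes): 1,300 (5)

| Method | ResNet-18 ACC\(_{\textrm{bal}}\) | ResNet-18 F1-micro | ResNet-18 F1-macro | ResNet-50 ACC\(_{\textrm{bal}}\) | ResNet-50 F1-micro | ResNet-50 F1-macro |
|---|---|---|---|---|---|---|
| ERM | 80.43% | 0.8269 | 0.8120 | 83.24% | 0.8500 | 0.8316 |
| mixup | 81.17% | 0.8577 | 0.8302 | 84.37% | 0.8500 | 0.8406 |
| \(\zeta \)-mixup (2.4) | 82.57% | 0.8692 | 0.8419 | 86.26% | 0.8615 | 0.8491 |
| \(\zeta \)-mixup (2.8) | 83.50% | 0.8731 | 0.8459 | 85.91% | 0.8962 | 0.8765 |
| \(\zeta \)-mixup (4.0) | 83.94% | 0.8769 | 0.8514 | 88.16% | 0.9115 | 0.9008 |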

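Dataset: derm7point: Clinical\(^{\ddagger }\); #images (#classes): 1,011 (5)

| Method | ResNet-18 ACC\(_{\textrm{bal}}\) | ResNet-18 F1-micro | ResNet-18 F1-macro | ResNet-50 ACC\(_{\textrm{bal}}\) | ResNet-50 F1-micro | ResNet-50 F1-macro |
|---|---|---|---|---|---|---|
| ERM | 42.08% | 0.5297 | 0.3797 | 42.15% | 0.6485 | 0.4328 |
| mixup | 46.68% | 0.5941 | 0.4392 | 45.57% | 0.6485 | 0.4474 |
| \(\zeta \)-mixup (2.4) | 47.82% | 0.6782 | 0.4833 | 46.63% | 0.6436 | 0.4239 |
| \(\zeta \)-mixup (2.8) | 48.91% | 0.6089 | 0.4496 | 48.36% | 0.6733 | 0.5122 |
| \(\zeta \)-mixup (4.0) | 46.93% | 0.7030 | 0.4902 | 45.95% | 0.6881 | 0.4828 |

Dataset: derm7point: Dermoscopic\(^{\dagger }\); #images (#classes): 1,011 (5)

| Method | ResNet-18 ACC\(_{\textrm{bal}}\) | ResNet-18 F1-micro | ResNet-18 F1-macro | ResNet-50 ACC\(_{\textrm{bal}}\) | ResNet-50 F1-micro | ResNet-50 F1-macro |
|---|---|---|---|---|---|---|
| ERM | 54.79% | 0.7030 | 0.5670 | 55.46% | 0.7574 | 0.5819 |
| mixup | 55.38% | 0.7376 | 0.5683 | 62.08% | 0.7772 | 0.6419 |
| \(\zeta \)-mixup (2.4) | 55.88% | 0.7525 | 0.5914 | 64.59% | 0.7376 | 0.6406 |
| \(\zeta \)-mixup (2.8) | 56.41% | 0.7574 | 0.5700 | 62.98% | 0.7624 | 0.6552 |
| \(\zeta \)-mixup (4.0) | 55.45% | 0.7178 | 0.5618 | 62.58% | 0.7772 | 0.6622 |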

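Dataset: PH2\(^{\dagger }\); #images (#classes): 200 (2)

| Method | ResNet-18 ACC\(_{\textrm{bal}}\) | ResNet-18 F1-micro | ResNet-18 F1-macro | ResNet-50 ACC\(_{\textrm{bal}}\) | ResNet-50 F1-micro | ResNet-50 F1-macro |
|---|---|---|---|---|---|---|
| ERM | 84.38% | 0.8000 | 0.8438 | 84.38% | 0.9000 | 0.8438 |
| mixup | 85.94% | 0.9250 | 0.8769 | 85.94% | 0.8500 | 0.8000 |
| \(\zeta \)-mixup (2.4) | 85.94% | 0.9250 | 0.8769 | 87.50% | 0.9500 | 0.9134 |
| \(\zeta \)-mixup (2.8) | 96.88% | 0.9500 | 0.9283 | 87.50% | 0.9500 | 0.9134 |
| \(\zeta \)-mixup (4.0) | 85.94% | 0.9250 | 0.8769 | 87.50% | 0.9500 | 0.9134 |

Dataset: MED-NODE\(^{\ddagger }\); #images (#classes): 170 (2)

| Method | ResNet-18 ACC\(_{\textrm{bal}}\) | ResNet-18 F1-micro | ResNet-18 F1-macro | ResNet-50 ACC\(_{\textrm{bal}}\) | ResNet-50 F1-micro | ResNet-50 F1-macro |
|---|---|---|---|---|---|---|
| ERM | 75.00% | 0.7941 | 0.7589 | 74.64% | 0.7647 | 0.7509 |
| mixup | 80.36% | 0.7941 | 0.7925 | 81.79% | 0.8235 | 0.8179 |
| \(\zeta \)-mixup (2.4) | 79.29% | 0.7941 | 0.7986 | 80.71% | 0.8235 | 0.8132 |
| \(\zeta \)-mixup (2.8) | 82.86% | 0.8235 | 0.8211 | 81.79% | 0.8235 | 0.8179 |
| \(\zeta \)-mixup (4.0) | 81.79% | 0.8235 | 0.8179 | 80.71% | 0.8235 | 0.8132 |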

\(^{\dagger }\) and \(^{\ddagger }\) denote dermoscopic and clinical skin lesion images, respectively. The evaluation metrics are balanced accuracy (‘ACC\(_{\textrm{bal}}\)’), micro-averaged F1 score (‘F1-micro’), and macro-averaged F1 score (‘F1-macro’). Higher values are better for all metrics. The highest and second-highest values of each metric are formatted in bold and underline, respectively.
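The three metrics named in the table note can be illustrated with a short, dependency-free Python sketch. This is not the authors' evaluation code; the function names (`balanced_accuracy`, `f1_micro`, `f1_macro`) and the toy labels are assumptions made for illustration, using the standard textbook definitions: balanced accuracy as the mean per-class recall, micro-F1 from pooled counts, and macro-F1 as the unweighted mean of per-class F1.

```python
from collections import Counter


def _counts(y_true, y_pred):
    """Per-class true positives, false positives, and false negatives."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    return tp, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    tp, _, fn = _counts(y_true, y_pred)
    classes = sorted(set(y_true))
    return sum(tp[c] / (tp[c] + fn[c]) for c in classes) / len(classes)


def f1_micro(y_true, y_pred):
    """F1 computed from counts pooled over all classes."""
    tp, fp, fn = _counts(y_true, y_pred)
    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    return 2 * tp_sum / (2 * tp_sum + fp_sum + fn_sum)


def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 (a class with no counts scores 0)."""
    tp, fp, fn = _counts(y_true, y_pred)
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up labels for an imbalanced two-class problem (not from the paper).
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]

print(f"ACC_bal  = {balanced_accuracy(y_true, y_pred):.4f}")
print(f"F1-micro = {f1_micro(y_true, y_pred):.4f}")
print(f"F1-macro = {f1_macro(y_true, y_pred):.4f}")
```

For single-label classification, micro-averaged F1 coincides with plain accuracy, so on imbalanced datasets it can sit well above the balanced accuracy and macro-F1 reported in the same row.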