From: Examining characteristics of predictive models with imbalanced big data
Dataset | Ratio | Positives | Negatives | Negatives %a | Total |
---|---|---|---|---|---|
ECBDL’14 | (90:10) | 687,729 | 6,189,561 | 19.772 | 6,877,290 |
ECBDL’14 | (75:25) | 687,729 | 2,063,187 | 6.591 | 2,750,916 |
ECBDL’14 | (65:35) | 687,729 | 1,277,211 | 4.080 | 1,964,940 |
ECBDL’14 | (50:50) | 687,729 | 687,729 | 2.197 | 1,375,458 |
ECBDL’14 | (45:55) | 687,729 | 562,687 | 1.797 | 1,250,416 |
ECBDL’14 | (40:60) | 687,729 | 458,486 | 1.465 | 1,146,215 |
POST | (90:10) | 2391 | 21,519 | 1.270 | 23,910 |
POST | (75:25) | 2391 | 7173 | 0.423 | 9564 |
POST | (65:35) | 2391 | 4440 | 0.262 | 6831 |
POST | (50:50) | 2391 | 2391 | 0.141 | 4782 |
POST | (45:55) | 2391 | 1956 | 0.115 | 4347 |
POST | (40:60) | 2391 | 1594 | 0.094 | 3985 |
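
The counts in the table appear to follow from simple arithmetic: the positives are held fixed and the number of negatives is chosen to hit each ratio. The sketch below reproduces that arithmetic. It assumes the ratio is written as negatives:positives, which is inferred from the counts rather than stated in the table, and that results are rounded to the nearest integer. The function name `negatives_for_ratio` is hypothetical and used only for illustration.

```python
def negatives_for_ratio(positives: int, neg_share: int, pos_share: int) -> int:
    """Return the negative count that gives a neg_share:pos_share class ratio.

    Assumption: the ratio is read as negatives:positives and the result is
    rounded to the nearest integer (half rounds up).
    """
    # Integer arithmetic avoids floating-point error on large counts.
    return (positives * neg_share + pos_share // 2) // pos_share


ratios = [(90, 10), (75, 25), (65, 35), (50, 50), (45, 55), (40, 60)]

# Positive counts taken from the table: ECBDL'14 and POST.
for name, positives in [("ECBDL'14", 687_729), ("POST", 2_391)]:
    for neg_share, pos_share in ratios:
        negatives = negatives_for_ratio(positives, neg_share, pos_share)
        total = positives + negatives
        print(f"{name} ({neg_share}:{pos_share}) negatives={negatives:,} total={total:,}")
```

Under these assumptions the computed negatives and totals match every row of the table, for example 6,189,561 negatives for ECBDL’14 at (90:10) and 1594 negatives for POST at (40:60).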