Data reduction techniques for highly imbalanced Medicare Big Data

Journal of Big Data

Table 12 Mean AUPRC values by classifier and induced class ratio for ten iterations of five-fold cross-validation, for Part B scenario one

Classifier	1:1	1:3	1:9	1:27	1:81	1:2,500
CatBoost	0.4428	0.5320	0.6228	0.6569	0.6812	0.6817
ET	0.0125	0.0135	0.0184	0.0272	0.0336	0.0433
LightGBM	0.4001	0.4859	0.5563	0.5967	0.5766	0.4146
Logistic regression	0.0058	0.0069	0.0076	0.0086	0.0099	0.0103
Random forest	0.0791	0.1210	0.1596	0.1829	0.2017	0.2462
XGBoost	0.4240	0.5104	0.5783	0.6234	0.6536	0.6886
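
As an illustration of how a class ratio such as those in the column headers might be induced, the following is a minimal sketch, assuming the ratios are minority:majority and are obtained by randomly undersampling the majority class. The function name `undersample_to_ratio`, the label encoding (1 for the minority class, 0 for the majority class), and the toy data are hypothetical and are not taken from the study.

```python
import random


def undersample_to_ratio(labels, majority_per_minority, seed=0):
    """Return sorted indices that keep every minority instance (label 1)
    and a random subset of majority instances (label 0), so that the kept
    data has a minority:majority ratio of 1:majority_per_minority.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    # Never request more majority instances than are available.
    target = min(len(majority), len(minority) * majority_per_minority)
    kept_majority = rng.sample(majority, target)
    return sorted(minority + kept_majority)


# Toy data (assumed): 4 minority and 400 majority instances.
labels = [1] * 4 + [0] * 400
for k in (1, 3, 9, 27, 81):
    idx = undersample_to_ratio(labels, k)
    n_pos = sum(labels[i] for i in idx)
    n_neg = len(idx) - n_pos
    print(f"1:{k} -> {n_pos} minority, {n_neg} majority")
```

The minority class is kept intact and only the majority class is reduced, so smaller ratios such as 1:1 discard the most data.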