Data reduction techniques for highly imbalanced Medicare Big Data

Journal of Big Data

Table 7 Mean AUPRC values by classifier and induced class ratio over ten iterations of five-fold cross-validation, for Part D scenario one

Classifier	1:1	1:3	1:9	1:27	1:81	1:1,429
CatBoost	0.6304	0.7047	0.7498	0.7662	0.7798	0.7793
ET	0.0937	0.1199	0.1635	0.2029	0.2401	0.3254
LightGBM	0.5786	0.6588	0.7025	0.7183	0.6783	0.5132
Logistic regression	0.1189	0.1701	0.2173	0.2455	0.2700	0.3060
Random forest	0.1784	0.2208	0.2212	0.2240	0.2199	0.2469
XGBoost	0.6065	0.6768	0.7164	0.7377	0.7351	0.7372
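
As a rough illustration of the two ingredients behind a table like this one, the sketch below shows how a 1:N class ratio might be induced and how a single AUPRC value might be scored. It assumes the ratios are produced by randomly undersampling the majority class, which this excerpt does not state, and it uses average precision as a step-wise estimate of the area under the precision-recall curve. The function names `undersample_to_ratio` and `average_precision` are hypothetical and are not taken from the paper.

```python
import random


def undersample_to_ratio(labels, majority_per_minority, seed=0):
    """Return sorted indices that keep every positive and a random subset of negatives.

    labels: sequence of 0/1 values, where 1 marks the minority (positive) class.
    majority_per_minority: N in the target minority-to-majority ratio 1:N.
    Assumption: the ratio is induced by random undersampling of the majority class.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Never ask for more negatives than are available.
    n_keep = min(len(neg), len(pos) * majority_per_minority)
    return sorted(pos + rng.sample(neg, n_keep))


def average_precision(labels, scores):
    """Average precision, a step-wise estimate of the area under the PR curve.

    Simplification: tied scores are ranked in input order rather than
    being grouped into a single threshold.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank instances from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos
```

Under these assumptions, a table cell would correspond to the mean of `average_precision` over the 50 test folds (ten iterations of five folds), with `undersample_to_ratio` presumably applied to the training portion of each fold only.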