Data reduction techniques for highly imbalanced Medicare Big Data

Journal of Big Data

Table 15 Mean AUPRC values by classifier and number of features, over ten iterations of five-fold cross-validation, for Part B scenario four

Classifier / Features	10	15	20	25	30	80
CatBoost	0.6787	0.6725	0.6994	0.6978	0.6975	0.6812
ET	0.0289	0.0352	0.0495	0.0512	0.0462	0.0336
LightGBM	0.5968	0.5803	0.6063	0.5935	0.5938	0.5766
Logistic regression	0.0078	0.0065	0.0067	0.0069	0.0090	0.0099
Random forest	0.3313	0.3036	0.3161	0.3120	0.2892	0.2017
XGBoost	0.6560	0.6406	0.6644	0.6630	0.6662	0.6536
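
As a reading aid, the following minimal sketch transcribes the values of Table 15 and reports, for each classifier, the number of features at which the mean AUPRC is highest. The names `FEATURE_COUNTS`, `MEAN_AUPRC`, and `best_feature_count` are illustrative and not taken from the study; the sketch only compares the tabulated means and makes no statement about statistical significance.

```python
# Mean AUPRC values transcribed from Table 15 (Part B, scenario four).
# Column order follows the table header: number of features.
FEATURE_COUNTS = [10, 15, 20, 25, 30, 80]

MEAN_AUPRC = {
    "CatBoost":            [0.6787, 0.6725, 0.6994, 0.6978, 0.6975, 0.6812],
    "ET":                  [0.0289, 0.0352, 0.0495, 0.0512, 0.0462, 0.0336],
    "LightGBM":            [0.5968, 0.5803, 0.6063, 0.5935, 0.5938, 0.5766],
    "Logistic regression": [0.0078, 0.0065, 0.0067, 0.0069, 0.0090, 0.0099],
    "Random forest":       [0.3313, 0.3036, 0.3161, 0.3120, 0.2892, 0.2017],
    "XGBoost":             [0.6560, 0.6406, 0.6644, 0.6630, 0.6662, 0.6536],
}


def best_feature_count(scores, feature_counts=FEATURE_COUNTS):
    """Return (feature_count, score) for the highest mean AUPRC in one row."""
    idx = max(range(len(scores)), key=scores.__getitem__)
    return feature_counts[idx], scores[idx]


# Hypothetical helper output: classifier -> (best feature count, mean AUPRC).
best = {name: best_feature_count(row) for name, row in MEAN_AUPRC.items()}

# Print classifiers from highest to lowest peak mean AUPRC.
for name, (k, score) in sorted(best.items(), key=lambda kv: kv[1][1], reverse=True):
    print(f"{name:20s} best at {k:2d} features (mean AUPRC {score:.4f})")
```

Read this way, CatBoost has the highest tabulated value (0.6994 at 20 features), followed by XGBoost (0.6662 at 30 features) and LightGBM (0.6063 at 20 features), while Random forest peaks at 10 features (0.3313) and falls to 0.2017 when all 80 features are used.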