
Table 8 The results of hyper-parameter optimization of machine learning models

From: Advanced machine learning techniques for cardiovascular disease early detection and diagnosis

| Model | Parameter search space | Best parameters | Accuracy | AUC |
| --- | --- | --- | --- | --- |
| Extra Trees | n_estimators: [100, 105, ..., 500]; criterion: ('gini', 'entropy'); max_depth: [5, 10, 15, 20]; min_samples_split: [2, 4, 6]; min_samples_leaf: [4, 5, 6] | criterion='entropy', max_depth=15, min_samples_leaf=4, n_estimators=300 | 84.54% | 0.920 |
| Random Forest | n_estimators: [100, 105, ..., 500]; criterion: ('gini', 'entropy'); max_depth: [3, 7, 14, 21]; min_samples_split: [2, 5, 10]; min_samples_leaf: [3, 5, 7]; max_features: [None, 'sqrt']; max_leaf_nodes: [None, 5, 10, 15, 20]; min_impurity_decrease: [0.001, 0.01, 0.05, 0.1]; bootstrap: [True, False] | max_depth=14, max_features='sqrt', max_leaf_nodes=15, min_impurity_decrease=0.001, min_samples_leaf=3, min_samples_split=10, n_estimators=200 | 85.52% | 0.924 |
| AdaBoost | n_estimators: [100, 105, ..., 500]; learning_rate: [0.25, 0.5, 0.75, 0.9] | learning_rate=0.25, n_estimators=100 | 84.06% | 0.897 |
| Gradient Boosting | boosting_type: ['gbdt', 'dart']; num_leaves: [20, 27, 34, ..., 50]; max_depth: [-1, 3, 7, 14, 21]; learning_rate: [0.0001, 0.001, 0.01, 0.1, 0.5, 1]; n_estimators: [100, 105, ..., 500]; min_split_gain: [0.00001, 0.0001, 0.001, 0.01, 0.1]; min_child_samples: [3, 5, 7]; subsample: [0.5, 0.8, 0.95]; colsample_bytree: [0.6, 0.75, 1] | boosting_type='dart', colsample_bytree=1, learning_rate=0.5, max_depth=3, min_child_samples=7, min_split_gain=1e-05, num_leaves=30, subsample=0.5 | 88.9% | 0.925 |
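
The Extra Trees, Random Forest, and AdaBoost rows use scikit-learn parameter names, so their search spaces can be explored with a standard grid search. The sketch below illustrates this for the Extra Trees row only; the synthetic data, 5-fold cross-validation, ROC-AUC scoring, and random seed are placeholder assumptions for illustration, not details taken from the table.

```python
# Minimal sketch of tuning the Extra Trees search space from Table 8 with
# scikit-learn's GridSearchCV. Data, CV scheme, scoring, and seed are
# placeholder assumptions, not taken from the paper.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV

# Placeholder data standing in for the preprocessed cardiovascular dataset.
X, y = make_classification(n_samples=1000, n_features=13, random_state=42)

param_grid = {
    "n_estimators": list(range(100, 501, 5)),   # [100, 105, ..., 500]
    "criterion": ["gini", "entropy"],
    "max_depth": [5, 10, 15, 20],
    "min_samples_split": [2, 4, 6],
    "min_samples_leaf": [4, 5, 6],
}

# Note: the full grid is large (several thousand candidates), so an exhaustive
# search over it is expensive.
search = GridSearchCV(
    estimator=ExtraTreesClassifier(random_state=42),
    param_grid=param_grid,
    scoring="roc_auc",   # AUC is one of the metrics reported in the table
    cv=5,
    n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_)   # e.g. criterion='entropy', max_depth=15, ...
print(search.best_score_)
```

The Random Forest and AdaBoost rows follow the same pattern, swapping only the estimator and the `param_grid` dictionary.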
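The Gradient Boosting row uses LightGBM-style parameter names (boosting_type, num_leaves, min_split_gain, min_child_samples, colsample_bytree), so the sketch below assumes `lightgbm.LGBMClassifier`. A randomized search is used here only to keep the example tractable; the actual search strategy, search budget, cross-validation scheme, and the step size behind "num_leaves: [20, 27, 34, ..., 50]" are not stated in the table and are assumptions.

```python
# Minimal sketch of the Gradient Boosting (LightGBM-style) search space from
# Table 8, assuming lightgbm.LGBMClassifier and a randomized search. Data,
# CV folds, scoring, n_iter, and seed are placeholder assumptions.
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=1000, n_features=13, random_state=42)

param_distributions = {
    "boosting_type": ["gbdt", "dart"],
    "num_leaves": list(range(20, 51, 7)),      # table: [20, 27, 34, ..., 50]; step of 7 assumed
    "max_depth": [-1, 3, 7, 14, 21],
    "learning_rate": [0.0001, 0.001, 0.01, 0.1, 0.5, 1],
    "n_estimators": list(range(100, 501, 5)),  # [100, 105, ..., 500]
    "min_split_gain": [1e-5, 1e-4, 1e-3, 1e-2, 0.1],
    "min_child_samples": [3, 5, 7],
    "subsample": [0.5, 0.8, 0.95],
    "colsample_bytree": [0.6, 0.75, 1],
}

search = RandomizedSearchCV(
    estimator=LGBMClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=100,          # assumption: the table does not state the search budget
    scoring="roc_auc",
    cv=5,
    random_state=42,
    n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_)   # e.g. boosting_type='dart', learning_rate=0.5, ...
```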