Table 4 Performances’ summary of algorithm-level methods

From: A survey on addressing high-class imbalance in big data

Technique GM^a TP*TN^b AUC^c A^d AG^e F^f FAE^g W^h TP^i GD^j BDF^k
Cost-sensitive
 Lopez et al. [9] Apache Hadoop
  Chi-FRBCS-Big DataCS 0.99
 Wang et al. [11]
  CS-SSOL 0.99
Hybrid/ensemble
 Marchant and Rubinstein [58]
  OASIS 10^−5
 Maurya [13]
  IBO 0.87
 Veeramachaneni et al. [60]
  AI2 0.85
 Galpert et al. [14] Apache Hadoop and Apache Spark
  ROS + SVM-BD 0.88 0.89
 Wei et al. [64]
  i-Alertor 0.66
 D’Addabbo and Maglietta [67]
  PSS-SVM 0.99
 Triguero et al. [3] Apache Hadoop
  ROSEFW-RF 0.53
 Zhai et al. [70] Apache Hadoop
  ELM ensemble 0.97
 Hebert [72]
  RF 0.15
  XGBoost 0.05
 Rio et al. [46] Apache Hadoop
  ROS 0.99
  RUS 0.98
  SMOTE 0.91
  RF 0.97
 Baughman et al. [74]
  DeepQA 0.28
  a. Geometric mean
  b. True positive rate * true negative rate
  c. Area under the ROC curve
  d. Accuracy
  e. Accuracy gain
  f. F-measure
  g. F-measure absolute error
  h. Positive datapoints weight
  i. True positive rate
  j. Gini index mean decrease
  k. Big Data framework
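
Several of the metrics named in the footnotes can be derived from a binary confusion matrix. The following is a minimal sketch, assuming the usual binary definitions with the minority class as the positive class: the geometric mean is taken as the square root of the true positive rate times the true negative rate, and the F-measure as the balanced F1 score. The surveyed papers may use variants of these definitions. The function name `imbalance_metrics` and the example counts are hypothetical and are not taken from the table.

```python
import math


def imbalance_metrics(tp, fn, tn, fp):
    """Compute several footnoted metrics from binary confusion-matrix counts.

    Assumes the positive class is the minority class and that every
    denominator below is non-zero (no zero-division handling in this sketch).
    """
    tpr = tp / (tp + fn)          # true positive rate (recall)
    tnr = tn / (tn + fp)          # true negative rate (specificity)
    precision = tp / (tp + fp)
    return {
        "TP": tpr,
        "TP*TN": tpr * tnr,
        "GM": math.sqrt(tpr * tnr),                     # assumed definition of GM
        "A": (tp + tn) / (tp + fn + tn + fp),           # accuracy
        "F": 2 * precision * tpr / (precision + tpr),   # F1, assuming beta = 1
    }


# Hypothetical counts for an imbalanced test set: 100 positives, 9,900 negatives.
metrics = imbalance_metrics(tp=80, fn=20, tn=9000, fp=900)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these hypothetical counts, accuracy is about 0.91 while the F-measure is only about 0.15, which illustrates why results reported under different metrics in the table are not directly comparable.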