From: Survey of transformers and towards ensemble learning using transformers for natural language processing
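| Model             | Accuracy | F1    | Precision | Recall | AUC   |
|-------------------|----------|-------|-----------|--------|-------|
| BERT              | 0.764    | 0.774 | 0.779     | 0.773  | 0.942 |
| GPT2              | 0.647    | 0.660 | 0.676     | 0.652  | 0.859 |
| XLNet             | 0.688    | 0.703 | 0.704     | 0.706  | 0.918 |
| RoBERTa           | 0.717    | 0.729 | 0.733     | 0.728  | 0.927 |
| ALBERT            | 0.766    | 0.775 | 0.788     | 0.767  | 0.935 |
| Ensemble learning | 0.787    | 0.795 | 0.811     | 0.786  | 0.953 |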
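
As a reading aid, the margin between the ensemble and the strongest single model on each metric can be computed directly from the scores listed above. The following is a minimal sketch, not part of the original article: the names `METRICS`, `SCORES`, and `ensemble_margins` are hypothetical, and the numbers are transcribed from the reported results rather than recomputed from any data.

```python
# Scores transcribed from the results above; tuple order: Acc, F1, P, R, AUC.
# All names here are illustrative assumptions, not taken from the source article.
METRICS = ("Acc", "F1", "P", "R", "AUC")
SCORES = {
    "BERT": (0.764, 0.774, 0.779, 0.773, 0.942),
    "GPT2": (0.647, 0.660, 0.676, 0.652, 0.859),
    "XLNet": (0.688, 0.703, 0.704, 0.706, 0.918),
    "RoBERTa": (0.717, 0.729, 0.733, 0.728, 0.927),
    "ALBERT": (0.766, 0.775, 0.788, 0.767, 0.935),
    "Ensemble learning": (0.787, 0.795, 0.811, 0.786, 0.953),
}


def ensemble_margins(scores, ensemble="Ensemble learning"):
    """Per metric, return (best single model, ensemble score minus its score)."""
    singles = {name: vals for name, vals in scores.items() if name != ensemble}
    margins = {}
    for i, metric in enumerate(METRICS):
        # Strongest individual model on this metric.
        best = max(singles, key=lambda name: singles[name][i])
        # Round to three decimals to match the precision of the reported scores.
        margins[metric] = (best, round(scores[ensemble][i] - singles[best][i], 3))
    return margins


if __name__ == "__main__":
    for metric, (model, delta) in ensemble_margins(SCORES).items():
        print(f"{metric}: ensemble exceeds {model} by {delta:.3f}")
```

On these numbers the ensemble is ahead of the best individual model on every metric. The strongest single model is ALBERT for accuracy, F1, and precision, and BERT for recall and AUC.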