From: Survey of transformers and towards ensemble learning using transformers for natural language processing
Model              Accuracy
BERT               0.938
XLNet              0.951
RoBERTa            0.886
ALBERT             0.894
Ensemble learning  0.964