Scenario | System baseline | Batch size | Learning rate |
---|---|---|---|
1 | BERT | 16 | 1.00E−05 |
2 | BERT | 16 | 3.00E−05 |
3 | BERT | 16 | 1.00E−05 |
4 | BERT | 16 | 3.00E−05 |
5 | BERT | 32 | 1.00E−05 |
6 | BERT | 32 | 3.00E−05 |
7 | BERT | 32 | 1.00E−05 |
8 | BERT | 32 | 3.00E−05 |
9 | RoBERTa | 16 | 1.00E−05 |
10 | RoBERTa | 16 | 3.00E−05 |
11 | RoBERTa | 16 | 1.00E−05 |
12 | RoBERTa | 16 | 3.00E−05 |
13 | RoBERTa | 32 | 1.00E−05 |
14 | RoBERTa | 32 | 3.00E−05 |
15 | RoBERTa | 32 | 1.00E−05 |
16 | RoBERTa | 32 | 3.00E−05 |
17 | XLNet | 16 | 1.00E−05 |
18 | XLNet | 16 | 3.00E−05 |
19 | XLNet | 16 | 1.00E−05 |
20 | XLNet | 16 | 3.00E−05 |
21 | XLNet | 32 | 1.00E−05 |
22 | XLNet | 32 | 3.00E−05 |
23 | XLNet | 32 | 1.00E−05 |
24 | XLNet | 32 | 3.00E−05 |
25 | BERT + NLP Statistical Features | 16 | 1.00E−05 |
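26 | BERT + NLP Statistical Features | 16 | 3.00E−05 |
27 | BERT + NLP Statistical Features | 16 | 1.00E−05 |
28 | BERT + NLP Statistical Features | 16 | 3.00E−05 |
29 | BERT + NLP Statistical Features | 32 | 1.00E−05 |
30 | BERT + NLP Statistical Features | 32 | 3.00E−05 |
31 | BERT + NLP Statistical Features | 32 | 1.00E−05 |
32 | BERT + NLP Statistical Features | 32 | 3.00E−05 |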
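33 | RoBERTa + NLP Statistical Features | 16 | 1.00E−05 |
34 | RoBERTa + NLP Statistical Features | 16 | 3.00E−05 |
35 | RoBERTa + NLP Statistical Features | 16 | 1.00E−05 |
36 | RoBERTa + NLP Statistical Features | 16 | 3.00E−05 |
37 | RoBERTa + NLP Statistical Features | 32 | 1.00E−05 |
38 | RoBERTa + NLP Statistical Features | 32 | 3.00E−05 |
39 | RoBERTa + NLP Statistical Features | 32 | 1.00E−05 |
40 | RoBERTa + NLP Statistical Features | 32 | 3.00E−05 |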
41 | XLNet + NLP Statistical Features | 16 | 1.00E−05 |
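42 | XLNet + NLP Statistical Features | 16 | 3.00E−05 |
43 | XLNet + NLP Statistical Features | 16 | 1.00E−05 |
44 | XLNet + NLP Statistical Features | 16 | 3.00E−05 |
45 | XLNet + NLP Statistical Features | 32 | 1.00E−05 |
46 | XLNet + NLP Statistical Features | 32 | 3.00E−05 |
47 | XLNet + NLP Statistical Features | 32 | 1.00E−05 |
48 | XLNet + NLP Statistical Features | 32 | 3.00E−05 |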
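49 | Proposed Method (Model averaging (BERT + RoBERTa + XLNet)) + NLP Statistical Features | 16 | 1.00E−05 |
50 | Proposed Method (Model averaging (BERT + RoBERTa + XLNet)) + NLP Statistical Features | 16 | 3.00E−05 |
51 | Proposed Method (Model averaging (BERT + RoBERTa + XLNet)) + NLP Statistical Features | 16 | 1.00E−05 |
52 | Proposed Method (Model averaging (BERT + RoBERTa + XLNet)) + NLP Statistical Features | 16 | 3.00E−05 |
53 | Proposed Method (Model averaging (BERT + RoBERTa + XLNet)) + NLP Statistical Features | 32 | 1.00E−05 |
54 | Proposed Method (Model averaging (BERT + RoBERTa + XLNet)) + NLP Statistical Features | 32 | 3.00E−05 |
55 | Proposed Method (Model averaging (BERT + RoBERTa + XLNet)) + NLP Statistical Features | 32 | 1.00E−05 |
56 | Proposed Method (Model averaging (BERT + RoBERTa + XLNet)) + NLP Statistical Features | 32 | 3.00E−05 |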
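
Scenarios 49 to 56 combine BERT, RoBERTa, and XLNet by model averaging. The table does not say what is averaged (class probabilities, logits, or weights) or how the NLP statistical features enter the model. The sketch below assumes the simplest case, an unweighted mean of per-class probabilities. The function names and the example probabilities are hypothetical and are not taken from the experiments.

```python
from typing import List, Sequence


def average_probabilities(per_model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Return the unweighted mean of class-probability vectors, one vector per model.

    Assumption: every model scores the same classes in the same order.
    """
    if not per_model_probs:
        raise ValueError("need at least one model's probabilities")
    n_classes = len(per_model_probs[0])
    if any(len(p) != n_classes for p in per_model_probs):
        raise ValueError("all models must score the same number of classes")
    n_models = len(per_model_probs)
    return [sum(p[c] for p in per_model_probs) / n_models for c in range(n_classes)]


def predict(per_model_probs: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    avg = average_probabilities(per_model_probs)
    return max(range(len(avg)), key=avg.__getitem__)


# Hypothetical softmax outputs for one input over two classes.
bert_probs = [0.6, 0.4]
roberta_probs = [0.3, 0.7]
xlnet_probs = [0.2, 0.8]

ensemble = [bert_probs, roberta_probs, xlnet_probs]
averaged = average_probabilities(ensemble)
label = predict(ensemble)
```

In this example BERT alone would pick class 0, but the averaged vector favors class 1 because the other two models agree on it with higher confidence.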