Table 3 Results of experiments for different methods, representations, and datasets with continuous vector encodings

From: Evaluation of different machine learning approaches and input text representations for multilingual classification of tweets for disease surveillance in the social web

| Methods | Measure | English (validation dataset) | French | German | Spanish | Arabic | Japanese | Unweighted average performance | Performance variance | Overall performance |
|---|---|---|---|---|---|---|---|---|---|---|
| CNN + word2vec | Precision | 0.76 | 0.55 | 0.53 | 0.78 | 0.40 | 0.48 | 0.55 | 1.60E-02 | 0.69 |
| | Recall | 0.73 | 0.84 | 0.77 | 0.90 | 0.60 | 0.51 | 0.72 | 2.10E-02 | 0.81 |
| | F1 Score | 0.78 | 0.67 | 0.63 | 0.84 | 0.51 | 0.50 | 0.63 | 1.50E-02 | 0.75 |
| CNF LSTM stack | Precision | 0.73 | 0.55 | 0.51 | 0.76 | 0.49 | 0.33 | 0.53 | 0.02 | 0.67 |
| | Recall | 0.71 | 0.76 | 0.62 | 0.87 | 0.67 | 0.46 | 0.68 | 0.02 | 0.77 |
| | F1 Score | 0.89 | 0.64 | 0.56 | 0.81 | 0.56 | 0.38 | 0.59 | 0.02 | 0.71 |
| CNN CNF | Precision | 0.72 | 0.48 | 0.46 | 0.73 | 0.36 | 0.29 | 0.46 | 0.03 | 0.61 |
| | Recall | 0.57 | 0.86 | 0.87 | 0.94 | 0.76 | 0.70 | 0.83 | 0.01 | 0.89 |
| | F1 Score | 0.61 | 0.61 | 0.60 | 0.82 | 0.49 | 0.41 | 0.59 | 0.02 | 0.71 |
| Bi-LSTM CNF | Precision | 0.81 | 0.57 | 0.54 | 0.76 | 0.50 | 0.33 | 0.54 | 0.02 | 0.68 |
| | Recall | 0.74 | 0.72 | 0.67 | 0.77 | 0.63 | 0.51 | 0.66 | 0.01 | 0.78 |
| | F1 Score | 0.89 | 0.64 | 0.60 | 0.76 | 0.56 | 0.40 | 0.59 | 0.02 | 0.72 |
| CNN-LSTM CNF | Precision | 0.75 | 0.55 | 0.47 | 0.79 | 0.37 | 0.37 | 0.51 | 0.03 | 0.64 |
| | Recall | 0.75 | 0.83 | 0.65 | 0.80 | 0.61 | 0.61 | 0.70 | 0.01 | 0.81 |
| | F1 Score | 0.75 | 0.67 | 0.55 | 0.79 | 0.46 | 0.46 | 0.59 | 0.02 | 0.72 |
| m-BERT-uncased + CNN | Precision | 0.65 | 0.34 | 0.34 | 0.66 | 0.37 | 0.29 | 0.44 | 0.02 | 0.59 |
| | Recall | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.00 | 1.00 |
| | F1 Score | 0.79 | 0.51 | 0.51 | 0.79 | 0.54 | 0.45 | 0.60 | 0.02 | 0.73 |
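A minimal sketch of how the two aggregate columns appear to be derived: the reported values are reproduced when the unweighted average is taken as the mean, and the performance variance as the population variance, of the scores on the five non-English datasets (i.e., the English validation set is excluded from the aggregates). This derivation is an inference from the numbers in the table, not a quoted formula; the overall performance column is the paper's own composite score and is not rederived here. The example below uses the CNN + word2vec precision row.

```python
import statistics

# Precision of CNN + word2vec on the five non-English datasets
# (French, German, Spanish, Arabic, Japanese), copied from Table 3.
scores = [0.55, 0.53, 0.78, 0.40, 0.48]

# Unweighted average performance: plain mean across datasets.
avg = statistics.mean(scores)

# Performance variance: population variance across the same datasets.
var = statistics.pvariance(scores)

print(f"average  = {avg:.2f}")   # 0.55, matching the table
print(f"variance = {var:.2e}")   # ~1.6e-02, consistent with 1.60E-02
```

The same computation reproduces the averages and variances of the other rows, e.g. 0.72 and 2.10E-02 for the CNN + word2vec recall row.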