Table 3 Results of experiments for different methods, representations and data sets with continuous vector encodings

From: Evaluation of different machine learning approaches and input text representations for multilingual classification of tweets for disease surveillance in the social web

The first six score columns are per-dataset results (English is the validation dataset; French, German, Spanish, Arabic, and Japanese are the test sets); the last three columns are the aggregated scores (un-weighted average performance, performance variance, overall performance).

| Methods | Measure | English (validation) | French | German | Spanish | Arabic | Japanese | Un-weighted average | Performance variance | Overall performance |
|---|---|---|---|---|---|---|---|---|---|---|
| CNN + word2vec | Precision | 0.76 | 0.55 | 0.53 | 0.78 | 0.40 | 0.48 | 0.55 | 1.60E−02 | 0.69 |
| | Recall | 0.73 | 0.84 | 0.77 | 0.90 | 0.60 | 0.51 | 0.72 | 2.10E−02 | 0.81 |
| | F1 score | 0.78 | 0.67 | 0.63 | 0.84 | 0.51 | 0.50 | 0.63 | 1.50E−02 | 0.75 |
| CNF LSTM stack | Precision | 0.73 | 0.55 | 0.51 | 0.76 | 0.49 | 0.33 | 0.53 | 0.02 | 0.67 |
| | Recall | 0.71 | 0.76 | 0.62 | 0.87 | 0.67 | 0.46 | 0.68 | 0.02 | 0.77 |
| | F1 score | 0.89 | 0.64 | 0.56 | 0.81 | 0.56 | 0.38 | 0.59 | 0.02 | 0.71 |
| CNN CNF | Precision | 0.72 | 0.48 | 0.46 | 0.73 | 0.36 | 0.29 | 0.46 | 0.03 | 0.61 |
| | Recall | 0.57 | 0.86 | 0.87 | 0.94 | 0.76 | 0.70 | 0.83 | 0.01 | 0.89 |
| | F1 score | 0.61 | 0.61 | 0.60 | 0.82 | 0.49 | 0.41 | 0.59 | 0.02 | 0.71 |
| Bi-LSTM CNF | Precision | 0.81 | 0.57 | 0.54 | 0.76 | 0.50 | 0.33 | 0.54 | 0.02 | 0.68 |
| | Recall | 0.74 | 0.72 | 0.67 | 0.77 | 0.63 | 0.51 | 0.66 | 0.01 | 0.78 |
| | F1 score | 0.89 | 0.64 | 0.60 | 0.76 | 0.56 | 0.40 | 0.59 | 0.02 | 0.72 |
| CNN-LSTM CNF | Precision | 0.75 | 0.55 | 0.47 | 0.79 | 0.37 | 0.37 | 0.51 | 0.03 | 0.64 |
| | Recall | 0.75 | 0.83 | 0.65 | 0.80 | 0.61 | 0.61 | 0.70 | 0.01 | 0.81 |
| | F1 score | 0.75 | 0.67 | 0.55 | 0.79 | 0.46 | 0.46 | 0.59 | 0.02 | 0.72 |
| m-BERT-uncased-CNN | Precision | 0.65 | 0.34 | 0.34 | 0.66 | 0.37 | 0.29 | 0.44 | 0.02 | 0.59 |
| | Recall | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.00 | 1.00 |
| | F1 score | 0.79 | 0.51 | 0.51 | 0.79 | 0.54 | 0.45 | 0.60 | 0.02 | 0.73 |
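The aggregate columns are consistent with statistics taken over the five non-English test sets only, excluding the English validation set. For example, in the CNN + word2vec precision row, the mean of the five non-English scores is (0.55 + 0.53 + 0.78 + 0.40 + 0.48) / 5 ≈ 0.55 and their population variance is ≈ 1.6E−02, matching the un-weighted average and performance variance columns (small discrepancies come from the rounded inputs). The sketch below is not from the paper; it simply verifies this reading under the assumption of a population variance. The formula behind "Overall performance" is not given on this page, so it is not reproduced.

```python
# Sketch (not from the paper): check that the aggregate columns are consistent
# with the mean and population variance over the five non-English test sets.
# "Overall performance" is not reproduced because its formula is not stated here.

# Per-language precision for CNN + word2vec, copied from the table above.
precision = {"French": 0.55, "German": 0.53, "Spanish": 0.78,
             "Arabic": 0.40, "Japanese": 0.48}

scores = list(precision.values())
mean = sum(scores) / len(scores)                               # un-weighted average
variance = sum((s - mean) ** 2 for s in scores) / len(scores)  # population variance

print(f"un-weighted average: {mean:.2f}")  # -> 0.55, as in the table
print(f"variance: {variance:.2e}")         # -> 1.61e-02, ~1.60E-02 in the table
```

The same check reproduces the other rows, e.g. the CNN + word2vec recall aggregates (0.72 and ≈ 2.1E−02).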