Table 3 Evaluation of classification model on Jakarta Traffic Tweet Corpus

From: Traffic and road conditions monitoring system using extracted information from Twitter

Features                    NB       SVM      LR       RF

Relevance prediction
Count vector, unigram       87.44%   89.20%   90.55%   88.88%
TF-IDF, unigram             86.54%   90.00%   89.73%   88.96%
TF-IDF, bigram              79.32%   88.34%   88.17%   87.45%
TF-IDF, trigram             62.63%   82.52%   82.73%   82.21%
TF-IDF, char {1,2} gram     82.46%   90.40%   90.00%   86.88%
TF-IDF, bigram + trigram    85.81%   90.23%   90.04%   89.30%
Unigram + char gram         87.35%   89.65%   90.72%   87.49%

Traffic event type prediction
Count vector, unigram       82.07%   93.79%   94.04%   90.10%
TF-IDF, unigram             81.51%   93.93%   92.35%   89.93%
TF-IDF, bigram              84.72%   89.60%   89.10%   87.64%
TF-IDF, trigram             77.46%   79.17%   79.26%   77.80%
TF-IDF, char {1,2} gram     76.30%   92.66%   91.41%   90.06%
TF-IDF, bigram + trigram    83.51%   93.93%   92.68%   91.81%
Unigram + char gram         81.72%   93.85%   94.14%   89.99%
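
The feature rows in the table refer to word n-grams and character n-grams. The following is a minimal sketch of how such features could be extracted, not the authors' implementation. The function names (word_ngrams, char_ngrams, count_vector) are hypothetical, whitespace tokenization and lowercasing are assumptions, the sample tweet is invented for illustration, and reading "bigram + trigram" as n-grams of length 2 to 3 is also an assumption. TF-IDF weighting is not shown.

```python
from collections import Counter


def word_ngrams(text, n_min, n_max):
    """Return word n-grams of length n_min..n_max (whitespace tokens, lowercased)."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def char_ngrams(text, n_min, n_max):
    """Return character n-grams of length n_min..n_max (spaces included)."""
    s = text.lower()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(s) - n + 1):
            grams.append(s[i:i + n])
    return grams


def count_vector(grams):
    """Map each n-gram to its raw count (a sparse count vector)."""
    return Counter(grams)


# Invented sample text, used only to illustrate the feature types.
tweet = "macet parah di tol"

unigram_counts = count_vector(word_ngrams(tweet, 1, 1))         # "Count vector, unigram"
bigram_trigram_counts = count_vector(word_ngrams(tweet, 2, 3))  # assumed meaning of "bigram + trigram"
char_counts = count_vector(char_ngrams(tweet, 1, 2))            # "char {1,2} gram"

# "Unigram + char gram" is assumed here to mean combining both feature sets.
combined_counts = unigram_counts + char_counts
```

In practice a library vectorizer would typically replace these helpers. The sketch is only meant to show what each feature row counts before any weighting or classification is applied.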