Table 10 Summary of feature extraction and representation

From: Real-time event detection in social media streams through semantic analysis of noisy terms

| S/N | Dataset | Unigram | Bigram |
| --- | --- | --- | --- |
| 1 | Twitter sentiment analysis training corpus | 76,522; Top-K word (50,000) | 501,026; Top-K (150,000) |
| 2 | Naija-Tweets | 3,296; Top-K (3,000) | 10,187; Top-K (8,000) |
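
The table pairs each n-gram count with a Top-K value, which suggests that only the K most frequent unigrams or bigrams are kept as features. The following is a minimal sketch of that kind of selection, assuming simple whitespace tokenization and frequency-based ranking. The function names (`ngrams`, `top_k_vocabulary`) and the toy corpus are hypothetical and do not come from the paper, whose exact procedure may differ.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_k_vocabulary(tokenized_docs, n, k):
    """Count n-grams over all documents and keep the k most frequent.

    Returns (number of distinct n-grams, list of the k kept n-grams).
    Ranking by raw frequency is an assumption made for this sketch.
    """
    counts = Counter()
    for tokens in tokenized_docs:
        counts.update(ngrams(tokens, n))
    return len(counts), [gram for gram, _ in counts.most_common(k)]


# Toy corpus invented for illustration; not taken from either dataset.
docs = [
    "traffic jam on third mainland bridge".split(),
    "heavy traffic jam this morning".split(),
    "third mainland bridge closed this morning".split(),
]

total_unigrams, unigram_vocab = top_k_vocabulary(docs, n=1, k=5)
total_bigrams, bigram_vocab = top_k_vocabulary(docs, n=2, k=3)
```

In this sketch, `total_unigrams` and `total_bigrams` play the role of the raw counts in the table (for example 76,522 and 501,026), while the lengths of `unigram_vocab` and `bigram_vocab` correspond to the Top-K values.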