Table 9 Comparison of previous work and results for review spam detection, along with the relative complexity of each approach (including feature extraction and learning methodology)

From: Survey of review spam detection using machine learning techniques

| Paper | Dataset | Features used | Learner | Performance metric | Score | Method complexity |
|---|---|---|---|---|---|---|
| [20] | 5.8 million reviews written by 2.14 million reviewers, crawled from Amazon.com | Review and reviewer features | LR | AUC | 78 % | Low |
| [21] | 5.8 million reviews written by 2.14 million reviewers, crawled from Amazon.com | Review, reviewer, and product features | LR | AUC | 78 % | Medium |
| [21] | 5.8 million reviews written by 2.14 million reviewers, crawled from Amazon.com | Text features | LR | AUC | 63 % | Low |
| [9] | 6000 reviews from Epinions | Review and reviewer features | NB with co-training | F-score | 0.631 | High |
| [3] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | Bigrams | SVM | Accuracy | 89.6 % | Low |
| [3] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | LIWC + bigrams | SVM | Accuracy | 89.8 % | Medium |
| [25] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al., plus 400 deceptive hotel and doctor reviews gathered from domain experts | LIWC + POS + unigrams | SAGE | Accuracy | 65 % | High |
| [23] | Yelp's real-life data | Behavioral features combined with bigram features | SVM | Accuracy | 86.1 % | Medium |
| [11] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | Stylometric features | SVM | F-measure | 84 % | Low |
| [12] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | n-gram features | SVM | Accuracy | 86 % | Low |
| [1] | Dataset collected from Amazon.com | Syntactic, lexical, and stylistic features | SLM | AUC | 0.9986 | High |
| [24] | Arabic reviews crawled by the authors from tripadvisor.com, booking.com, and agoda.ae | Review and reviewer features | NB | F-measure | 0.9959 | Low |