Table 9 Comparison of previous work on review spam detection, with results and the relative complexity of each approach (covering both feature extraction and learning methodology)

From: Survey of review spam detection using machine learning techniques

| Paper | Dataset | Features used | Learner | Performance metric | Score | Method complexity |
|---|---|---|---|---|---|---|
| [20] | 5.8 million reviews written by 2.14 million reviewers, crawled from the Amazon website | Review and reviewer features | LR | AUC | 78 % | Low |
| [21] | 5.8 million reviews written by 2.14 million reviewers, crawled from the Amazon website | Review, reviewer, and product features | LR | AUC | 78 % | Medium |
| [21] | 5.8 million reviews written by 2.14 million reviewers, crawled from the Amazon website | Text features | LR | AUC | 63 % | Low |
| [9] | 6,000 reviews from Epinions | Review and reviewer features | NB with co-training | F-score | 0.631 | High |
| [3] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | Bigrams | SVM | Accuracy | 89.6 % | Low |
| [3] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | LIWC + bigrams | SVM | Accuracy | 89.8 % | Medium |
| [25] | AMT hotel reviews by Ott et al., plus 400 deceptive hotel and doctor reviews gathered from domain experts | LIWC + POS + unigrams | SAGE | Accuracy | 65 % | High |
| [23] | Yelp's real-life data | Behavioral features combined with bigram features | SVM | Accuracy | 86.1 % | Medium |
| [11] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | Stylometric features | SVM | F-measure | 84 % | Low |
| [12] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | n-gram features | SVM | Accuracy | 86 % | Low |
| [1] | Dataset collected from amazon.com | Syntactic, lexical, and stylistic features | SLM | AUC | 0.9986 | High |
| [24] | Arabic reviews crawled by the authors from tripadvisor.com, booking.com, and agoda.ae | Review and reviewer features | NB | F-measure | 0.9959 | Low |
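Several rows above ([3], [12], [23]) share the same basic recipe: represent each review as n-gram counts and train a linear SVM. The sketch below illustrates that generic pipeline with scikit-learn; the sample reviews, labels, and parameter choices are hypothetical placeholders, not the exact setup or data of any cited paper.

```python
# Minimal sketch of a bigram + SVM review spam classifier, in the style of
# the approaches summarized in rows [3] and [12] above. The texts and labels
# here are invented placeholders, not data from the cited datasets.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

reviews = [
    "great hotel clean rooms friendly staff",
    "amazing stay best hotel ever five stars",
    "unbelievable experience perfect in every way must book",
    "terrible room dirty bathroom rude staff",
    "the location was convenient and breakfast was fine",
    "best deal guaranteed book now you will not regret it",
]
labels = [0, 1, 1, 0, 0, 1]  # 1 = deceptive/spam, 0 = truthful (hypothetical)

model = make_pipeline(
    CountVectorizer(ngram_range=(2, 2)),  # bigram count features
    LinearSVC(),                          # linear SVM learner
)

# Papers in this area typically report cross-validated accuracy or F-score;
# with a real corpus this would use more folds and far more reviews.
scores = cross_val_score(model, reviews, labels, cv=2)
print("mean accuracy:", scores.mean())
```

Approaches such as [20], [21], and [23] differ mainly in the feature extraction step, replacing or augmenting the text n-grams with review, reviewer, or product features (e.g., rating deviation or reviewer activity) before fitting the learner.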