From: Survey of review spam detection using machine learning techniques
Paper | Dataset | Features used | Learner | Performance metric | Score | Method complexity |
---|---|---|---|---|---|---|
[20] | 5.8 million reviews written by 2.14 million reviewers, crawled from the Amazon website | Review and reviewer features | LR | AUC | 78 % | Low |
[21] | 5.8 million reviews written by 2.14 million reviewers, crawled from the Amazon website | Review, reviewer, and product features | LR | AUC | 78 % | Medium |
[21] | 5.8 million reviews written by 2.14 million reviewers, crawled from the Amazon website | Text features | LR | AUC | 63 % | Low |
[9] | 6000 reviews from Epinions | Review and reviewer features | NB with Co-training | F-Score | 0.631 | High |
[3] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | Bigrams | SVM | Accuracy | 89.6 % | Low |
[3] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | LIWC + Bigrams | SVM | Accuracy | 89.8 % | Medium |
[25] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al., plus 400 deceptive hotel and doctor reviews gathered from domain experts | LIWC + POS + Unigram | SAGE | Accuracy | 65 % | High |
[23] | Yelp’s real-life data | Behavioral features combined with bigram features | SVM | Accuracy | 86.1 % | Medium |
[11] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | Stylometric features | SVM | F-measure | 84 % | Low |
[12] | Hotel reviews collected through Amazon Mechanical Turk (AMT) by Ott et al. | n-gram features | SVM | Accuracy | 86 % | Low |
[1] | Dataset collected from amazon.com | Syntactical, lexical, and stylistic features | SLM | AUC | 0.9986 | High |
[24] | Their own crawled Arabic reviews from tripadvisor.com, booking.com, and agoda.ae | Review and reviewer features | NB | F-measure | 0.9959 | Low |
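Several of the surveyed approaches pair simple n-gram text features with a standard classifier (e.g. bigrams with an SVM in [3] and [12], or review features with Naive Bayes in [9] and [24]). The sketch below illustrates that general pipeline only, not any specific paper's method: word-bigram features fed to a multinomial Naive Bayes with Laplace smoothing. All reviews and labels in it are invented toy examples.

```python
import math
from collections import Counter

def bigrams(text):
    """Lowercase word bigrams, the simplest n-gram feature set used in the table."""
    words = text.lower().split()
    return [f"{a} {b}" for a, b in zip(words, words[1:])]

class BigramNB:
    """Multinomial Naive Bayes over bigram counts with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.priors = {c: labels.count(c) / len(labels) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(bigrams(text))
        self.vocab = {b for c in self.classes for b in self.counts[c]}
        return self

    def predict(self, text):
        best_class, best_logp = None, float("-inf")
        vocab_size = len(self.vocab)
        for c in self.classes:
            total = sum(self.counts[c].values())
            logp = math.log(self.priors[c])
            for b in bigrams(text):
                # Add-one smoothing keeps unseen bigrams from zeroing the product.
                logp += math.log((self.counts[c][b] + 1) / (total + vocab_size))
            if logp > best_logp:
                best_class, best_logp = c, logp
        return best_class

# Toy training data (invented for illustration, not from any surveyed dataset).
train_texts = [
    "this hotel is amazing amazing best hotel ever",
    "best hotel ever truly amazing stay",
    "the room was clean but the street noise was loud",
    "breakfast was fine the room was small",
]
train_labels = ["deceptive", "deceptive", "truthful", "truthful"]
model = BigramNB().fit(train_texts, train_labels)
```

In practice the surveyed systems train on thousands of labeled reviews and often combine such text features with reviewer-behavior features, which is what pushes methods like [23] and [24] to their reported scores.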