DCWord: A Novel Deep Learning Approach to Deceptive Review Identification by Word Vectors

Wen Zhang, Qiang Wang, Xiangjun Li, Taketoshi Yoshida, Jian Li

Journal of Systems Science and Systems Engineering, 2019, Vol. 28, Issue 6: 731-746. DOI: 10.1007/s11518-019-5438-4

Abstract

Because online forums are anonymous and open to anyone, it is very hard for human readers to tell deceptive reviews from truthful ones. This paper proposes a deep learning approach to text representation, called DCWord (Deep Context representation by Word vectors), for deceptive review identification. The basic idea is that deceptive reviews are written by people without real experience of the goods or services purchased online, whereas truthful reviews are written by people with such experience, so the contextual information of words should differ between the two. Unlike state-of-the-art techniques that seek the best linguistic features for representation, we use word vectors to characterize the contextual information of words in deceptive and truthful reviews automatically. Two pooling strategies, average-pooling (DCWord-A) and max-pooling (DCWord-M), are used to produce review vectors from word vectors. Experimental results on the Spam dataset and the Deception dataset show that the DCWord-M representation combined with LR (Logistic Regression) achieves the best performance and outperforms state-of-the-art techniques on deceptive review identification. Moreover, DCWord-M outperforms DCWord-A as a review representation for this task. The outcome of this study has potential implications for online review management and for business intelligence applications of deceptive review identification.
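
The two pooling strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the pooling details, so this example assumes pooling is applied dimension by dimension across the word vectors of a review. The function name `pool_review`, the toy three-dimensional vectors, and the choice to skip out-of-vocabulary words are hypothetical and are not taken from the paper; in the paper the word vectors come from a trained skip-gram model and the review vectors are passed to a classifier such as LR, both of which are omitted here.

```python
# Toy 3-dimensional word vectors with made-up values (assumption: in practice
# these would come from a trained skip-gram model).
TOY_VECTORS = {
    "great": [0.9, 0.1, -0.2],
    "hotel": [0.2, 0.8, 0.4],
    "room": [0.1, 0.7, 0.5],
}


def pool_review(tokens, vectors, strategy="max"):
    """Combine the word vectors of one review into a single review vector.

    strategy="avg" mirrors the average-pooling idea (DCWord-A);
    strategy="max" mirrors the max-pooling idea (DCWord-M).
    """
    # Assumption: words without a vector are simply skipped.
    rows = [vectors[t] for t in tokens if t in vectors]
    if not rows:
        raise ValueError("review contains no words with a known vector")
    # Transpose so that each tuple holds one dimension across all words.
    columns = list(zip(*rows))
    if strategy == "avg":
        return [sum(col) / len(col) for col in columns]
    if strategy == "max":
        return [max(col) for col in columns]
    raise ValueError("strategy must be 'avg' or 'max'")


review = ["great", "hotel", "room", "unknownword"]
avg_vec = pool_review(review, TOY_VECTORS, "avg")
max_vec = pool_review(review, TOY_VECTORS, "max")
```

Either pooled vector has the same length as a single word vector, regardless of review length, which is what allows reviews of different lengths to be fed to one classifier.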

Keywords

Online business intelligence / skip-gram model / DCWord representation / deceptive review identification / deep learning

Cite this article

Wen Zhang, Qiang Wang, Xiangjun Li, Taketoshi Yoshida, Jian Li. DCWord: A Novel Deep Learning Approach to Deceptive Review Identification by Word Vectors. Journal of Systems Science and Systems Engineering, 2019, 28(6): 731-746. DOI: 10.1007/s11518-019-5438-4
