Research on Chinese negation and speculation: corpus annotation and identification

Bowei ZOU; Guodong ZHOU; Qiaoming ZHU

doi:10.1007/s11704-015-5101-2

Front. Comput. Sci., 2016, Vol. 10, Issue 6: 1039-1051. DOI: 10.1007/s11704-015-5101-2

RESEARCH ARTICLE

Abstract

Distinguishing negative or speculative narrative fragments from facts is crucial for deep understanding in natural language processing (NLP). In this paper, we first construct a Chinese corpus consisting of three sub-corpora drawn from different sources. We then present a general framework for Chinese negation and speculation identification. First, we propose a feature-based sequence labeling model to detect negative or speculative cues, together with a cross-lingual cue expansion strategy that increases coverage in cue detection. On this basis, we present a new syntactic structure-based framework, in place of the traditional chunking-based framework, to identify the linguistic scope of a negative or speculative cue. Experimental results justify the usefulness of our Chinese corpus and the appropriateness of our syntactic structure-based framework, which shows significant improvement over the state of the art on Chinese negation and speculation identification.
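To make the idea of cue detection as sequence labeling concrete, the following is a minimal sketch, not the authors' implementation. It assumes a BIO tagging scheme with hypothetical label names (`B-NEG`, `B-SPEC`, and so on), a hypothetical three-feature context window, and an invented, pre-segmented example sentence; the actual feature set and label inventory are not given in the abstract.

```python
from typing import Dict, List, Tuple

# A cue annotation: token span [start, end) plus a cue type.
# The type names "NEG" and "SPEC" are assumptions for illustration.
CueSpan = Tuple[int, int, str]


def bio_encode(tokens: List[str], cues: List[CueSpan]) -> List[str]:
    """Convert cue spans into per-token BIO labels for a sequence labeler."""
    labels = ["O"] * len(tokens)
    for start, end, kind in cues:
        labels[start] = f"B-{kind}"
        for i in range(start + 1, end):
            labels[i] = f"I-{kind}"
    return labels


def token_features(tokens: List[str], i: int) -> Dict[str, str]:
    """Illustrative context-window features for token i (hypothetical set)."""
    return {
        "word": tokens[i],
        "prev": tokens[i - 1] if i > 0 else "<BOS>",
        "next": tokens[i + 1] if i + 1 < len(tokens) else "<EOS>",
    }


# Invented example, already word-segmented: "He probably will not come."
tokens = ["他", "可能", "不", "会", "来"]
cues = [(1, 2, "SPEC"), (2, 3, "NEG")]  # 可能 = speculative cue, 不 = negative cue

labels = bio_encode(tokens, cues)
features = [token_features(tokens, i) for i in range(len(tokens))]
print(list(zip(tokens, labels)))
```

The per-token feature dictionaries and labels produced this way are the kind of input a feature-based sequence labeler (for example, a conditional random field) would be trained on; scope resolution would then be a separate step that takes each detected cue as its starting point.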

Keywords

negation / speculation / cue detection / scope resolution / Chinese corpus

Cite this article

Bowei ZOU, Guodong ZHOU, Qiaoming ZHU. Research on Chinese negation and speculation: corpus annotation and identification. Front. Comput. Sci., 2016, 10(6): 1039-1051. DOI: 10.1007/s11704-015-5101-2
