Research on Chinese negation and speculation: corpus annotation and identification

Bowei ZOU, Guodong ZHOU, Qiaoming ZHU

PDF(547 KB)
PDF(547 KB)
Front. Comput. Sci. ›› 2016, Vol. 10 ›› Issue (6) : 1039-1051. DOI: 10.1007/s11704-015-5101-2
RESEARCH ARTICLE

Research on Chinese negation and speculation: corpus annotation and identification

Author information +
History +

Abstract

Identifying negative or speculative narrative fragments from facts is crucial for deep understanding on natural language processing (NLP). In this paper, we firstly construct a Chinese corpus which consists of three sub-corpora from different resources. We also present a general framework for Chinese negation and speculation identification. In our method, first, we propose a feature-based sequence labeling model to detect the negative or speculative cues. In addition, a cross-lingual cue expansion strategy is proposed to increase the coverage in cue detection. On this basis, this paper presents a new syntactic structure-based framework to identify the linguistic scope of a negative or speculative cue, instead of the traditional chunking-based framework. Experimental results justify the usefulness of our Chinese corpus and the appropriateness of our syntactic structure-based framework which has showed significant improvement over the state-of-the-art on Chinese negation and speculation identification.

Keywords

negation / speculation / cue detection / scope resolution / Chinese corpus

Cite this article

Download citation ▾
Bowei ZOU, Guodong ZHOU, Qiaoming ZHU. Research on Chinese negation and speculation: corpus annotation and identification. Front. Comput. Sci., 2016, 10(6): 1039‒1051 https://doi.org/10.1007/s11704-015-5101-2

References

[1]
Morante R, Sporleder C. Modality and negation: an introduction to the special issue. Computational Linguistics, 2012, 38(2): 223–260
CrossRef Google scholar
[2]
Friedman C, Alderson P O, Austin J H, Cimino J J, Johnson S B. A general natural–language text processor for clinical radiology. American Medical Informatics Association, 1994, 1(2): 161–174
CrossRef Google scholar
[3]
Di Marco C,Kroon F W, Mercer R E. Using hedges to classify citations in scientific articles. The Information Retrieval Series, 2006, 20: 247–263
CrossRef Google scholar
[4]
Morante R, Liekens A, Daelemans W. Learning the scope of negation in biomedical texts. In: Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing. 2008, 715–724
CrossRef Google scholar
[5]
Chowdhury M F M, Lavelli A. Exploiting the scope of negations and heterogeneous features for relation extraction: a case study for drugdrug interaction extraction. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2013, 765–771
[6]
Averbuch M, Karson T, Ben-Ami B, Maimon O, Rokach L. Contextsensitive medical information retrieval. In: Proceedings of the 11th World Congress on Medical Informatics. 2004, 1–8
[7]
Wilson T A. Fine-grained subjectivity and sentiment analysis: recognizing the intensity, polarity, and attitudes of private states. ProQuest, 2008
[8]
Councill I G, McDonald R, Velikovich L. What’s great and what’s not: learning to classify the scope of negation for improved sentiment analysis. In: Proceedings of the Workshop on Negation and Speculation in Natural Language Processing. 2010, 51–59
[9]
Zhu X, Guo H, Mohammad S, Kiritchenko S. An empirical study on the effect of negation words on sentiment. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. 2014, 304–313
CrossRef Google scholar
[10]
Snow R, Vanderwende L, Menezes A. Effectively using syntax for recognizing false entailment. In: Proceedings of the Main Conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics. 2006, 33–40
CrossRef Google scholar
[11]
Baker K, Bloodgood M, Dorr B J, Filardo N W, Levin L, Piatko C. A modality lexicon and its use in automatic tagging. In: Proceedings of the 7th Conference on International Language Resources and Evaluation. 2010, 1402–1407
[12]
Wetzel D, Bond F. Enriching parallel corpora for statistical machine translation with semantic negation rephrasing. In: Proceedings of the 6th Workshop on Syntax, Semantics and Structure in Statistical Translation. 2012, 20–29
[13]
Özgür A, Radev D R. Detecting speculations and their scopes in scientific text. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2009, 1398–1407
CrossRef Google scholar
[14]
Øvrelid L, Velldal E, Oepen S. Syntactic scope resolution in uncertainty analysis. In: Proceedings of the 23rd International Conference on Computational Linguistics. 2010, 1379–1387
[15]
Apostolova E, Tomuro N, Demner-Fushman D. Automatic extraction of lexico-syntactic patterns for detection of negation and speculation scopes. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Short Papers. 2011, 283–287
[16]
Morante R, Daelemans W. A metalearning approach to processing the scope of negation. In: Proceedings of the 13th Conference on Computational Natural Language Learning. 2009, 28–36
CrossRef Google scholar
[17]
Agarwal S, Yu H. Detecting hedge cues and their scope in biomedical text with conditional random fields. Biomedical Informatics, 2010, 43(6): 953–961
CrossRef Google scholar
[18]
Sánchez L M, Li B, Vogel C. Exploiting CCG structures with tree kernels for speculation detection. In: Proceedings of the 14th Conference on Computational Natural Language Learning: Shared Task. 2007, 126–131
[19]
Zou B, Zhou G, Zhu Q. Tree kernel-based negation and speculation scope detection with structured syntactic parse features. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2013, 968–976
[20]
Vincze V, Szarvas G, Farkas R, Móra G, Csirik J. The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes. BMC Bioinformatics, 2008, 9(11): 279–282
CrossRef Google scholar
[21]
Ji F, Qiu X, Huang X. Exploring uncertainty sentences in Chinese. In: Proceedings of the 16th China Conference on Information Retrieval. 2010, 594–601
[22]
Chen Z, Zou B, Zhu Q, Li P. Chinese negation and speculation detection with conditional random fields. Communications in Computer and Information Science, 2013, 400: 30–40
CrossRef Google scholar
[23]
Qian Z, Zou B, Li P, Zhu Q. The prediction method of rise or fall in stock markets based on the discrimination of information credibility. In: Proceedings of the 20th China Conference on Information Retrieval. 2014
[24]
Pang B, Lee L. A sentimental education: sentiment analysis using subjectivity. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics. 2004, 271–278
CrossRef Google scholar
[25]
Cohen J. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 1960, 20: 37–46
CrossRef Google scholar
[26]
Liu L, Hong Y, Liu H, Wang X, Yao J. Effective selection of translation model training data. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Short Papers. 2014, 569–573
CrossRef Google scholar
[27]
Och F J, Ney H. A systematic comparison of various statistical alignment models. Computational Linguistics, 2003, 29(1), 19–51
CrossRef Google scholar
[28]
Jiang Z, Ng T. Semantic role labeling of NomBank: A maximum entropy approach. In: Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing. 2006, 138–145
CrossRef Google scholar
[29]
Zhu Q, Li J, Wang H, Zhou G. A unified framework for scope learning via simplified shallow semantic parsing. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2010, 714–724
[30]
Farkas R, Vincze V, Móra G, Csirik J, Szarvas G. The CoNLL-2010 shared task: learning to detect hedges and their scope in natural language text. Proceedings of the 14th Conference on Computational Natural Language Learning. 2010
[31]
Che W, Li Z, Liu T. LTP: a Chinese language technology platform. In: Proceedings of the 23rd International Conference on Computational Linguistics: Demonstrations. 2010, 13–16

RIGHTS & PERMISSIONS

2016 Higher Education Press and Springer-Verlag Berlin Heidelberg
AI Summary AI Mindmap
PDF(547 KB)

Accesses

Citations

Detail

Sections
Recommended

/