Semantic separator learning and its applications in unsupervised Chinese text parsing

Yuming WU; Xiaodong LUO; Zhen YANG

doi:10.1007/s11704-013-2072-z

Front. Comput. Sci., 2013, Vol. 7, Issue 1: 55-68. DOI: 10.1007/s11704-013-2072-z

RESEARCH ARTICLE

Abstract

Grammar learning has long been a bottleneck problem. In this paper, we propose a method of semantic separator learning, a special case of grammar learning. The method is based on the hypothesis that some classes of words, called semantic separators, split a sentence into several constituents. The semantic separators are represented by words together with their part-of-speech tags and other information, so that rich semantic information can be captured. In the method, we first identify the semantic separators with the help of noun phrase boundaries, called subseparators. Next, the argument classes of the separators are learned from a corpus by generalizing argument instances in a hypernym space. Finally, to evaluate the learned semantic separators, we use them in unsupervised Chinese text parsing. Experiments on a manually labeled test set show that the proposed method outperforms previous methods of unsupervised text parsing.
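The two ideas named in the abstract, splitting a sentence at separators and generalizing argument instances in a hypernym space, can be illustrated with a toy sketch. This is not the authors' algorithm: the function names, the tag set, the separator set, and the small hypernym table below are all hypothetical, and the generalization step is simplified to finding the lowest common hypernym in a tree-shaped hierarchy.

```python
from typing import Dict, List, Optional, Set, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); the tag set is an assumption


def split_on_separators(tokens: List[Token], separators: Set[Token]) -> List[List[Token]]:
    """Split a tagged sentence into constituents at every separator token."""
    constituents: List[List[Token]] = []
    current: List[Token] = []
    for tok in tokens:
        if tok in separators:
            # A separator closes the constituent collected so far.
            if current:
                constituents.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        constituents.append(current)
    return constituents


def ancestors(word: str, hypernym_of: Dict[str, str]) -> List[str]:
    """Return the word followed by its chain of hypernyms (assumes an acyclic tree)."""
    chain = [word]
    while chain[-1] in hypernym_of:
        chain.append(hypernym_of[chain[-1]])
    return chain


def generalize(instances: List[str], hypernym_of: Dict[str, str]) -> Optional[str]:
    """Return the lowest hypernym shared by all instances, or None if there is none."""
    if not instances:
        return None
    common = ancestors(instances[0], hypernym_of)
    for word in instances[1:]:
        seen = set(ancestors(word, hypernym_of))
        # Keep only the ancestors shared so far, preserving most-specific-first order.
        common = [c for c in common if c in seen]
    return common[0] if common else None


# Hypothetical toy data, invented for illustration only.
tokens = [("Mozart", "NR"), ("wrote", "VV"), ("symphony", "NN"), ("in", "P"), ("Vienna", "NR")]
separators = {("wrote", "VV"), ("in", "P")}
hypernym_of = {
    "symphony": "musical_work",
    "opera": "musical_work",
    "musical_work": "work",
    "Vienna": "city",
    "city": "place",
}

if __name__ == "__main__":
    print(split_on_separators(tokens, separators))
    print(generalize(["symphony", "opera"], hypernym_of))
```

In this sketch a separator is matched on the word and its tag together, mirroring the abstract's statement that separators are represented by words with their part-of-speech tags; any further information attached to a separator is left out.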

Keywords

semantic separator / separator learning / unsupervised text parsing

Cite this article

Yuming WU, Xiaodong LUO, Zhen YANG. Semantic separator learning and its applications in unsupervised Chinese text parsing. Front. Comput. Sci., 2013, 7(1): 55-68. DOI: 10.1007/s11704-013-2072-z

RIGHTS & PERMISSIONS

Higher Education Press and Springer-Verlag Berlin Heidelberg
