BUEES: a bottom-up event extraction system

Xiao DING, Bing QIN, Ting LIU

Front. Inform. Technol. Electron. Eng ›› 2015, Vol. 16 ›› Issue (7) : 541-552. DOI: 10.1631/FITEE.1400405

Abstract

Traditional event extraction systems focus mainly on event type identification and event participant extraction based on pre-specified event type paradigms and manually annotated corpora. However, different domains have different event type paradigms. When transferring to a new domain, we have to build a new event type paradigm and annotate a new corpus from scratch. This kind of conventional event extraction system requires massive human effort, and hence prevents event extraction from being widely applicable. In this paper, we present BUEES, a bottom-up event extraction system, which extracts events from the web in a completely unsupervised way. The system automatically builds an event type paradigm in the input corpus, and then proceeds to extract a large number of instance patterns of these events. Subsequently, the system extracts event arguments according to these patterns. By conducting a series of experiments, we demonstrate the good performance of BUEES and compare it to a state-of-the-art Chinese event extraction system, i.e., a supervised event extraction system. Experimental results show that BUEES performs comparably (5% higher F-measure in event type identification and 3% higher F-measure in event argument extraction), but without any human effort.
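
The F-measure figures quoted above are, under the usual definition, the harmonic mean of precision and recall. The following is a minimal Python sketch of that computation, assuming the balanced F1 variant and exact set-based matching of predicted items against gold items. The function names and the toy labels are illustrative only and are not taken from the paper, whose exact evaluation protocol is not stated in this abstract.

```python
from typing import Hashable, Set, Tuple


def precision_recall(predicted: Set[Hashable], gold: Set[Hashable]) -> Tuple[float, float]:
    """Return (precision, recall) for exact matches between two sets."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero, so the score is defined as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


if __name__ == "__main__":
    # Hypothetical toy data: event type labels predicted by a system vs. gold labels.
    predicted_types = {"attack", "merge", "elect"}
    gold_types = {"attack", "elect", "marry", "die"}
    p, r = precision_recall(predicted_types, gold_types)
    print(f"precision={p:.3f} recall={r:.3f} F1={f_measure(p, r):.3f}")
```

Because the score is a harmonic mean, it is pulled toward the lower of the two values, so a system cannot reach a high F-measure by maximizing precision or recall alone.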

Keywords

Event extraction / Unsupervised learning / Bottom-up

Cite this article

Xiao DING, Bing QIN, Ting LIU. BUEES: a bottom-up event extraction system. Front. Inform. Technol. Electron. Eng, 2015, 16(7): 541‒552 https://doi.org/10.1631/FITEE.1400405
