An Easy-to-Use Evaluation Framework for Entity Recognition and Disambiguation Systems
Hui CHEN, Bao-gang WEI, Yi-ming LI, Yong-huai LIU, Wen-hao ZHU
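Entity recognition and disambiguation (ERD) is one of the key techniques for knowledge base population and information extraction. In recent years the field has produced many research results, and numerous ERD systems have been proposed. However, because these systems have not been thoroughly evaluated and compared, the field remains a mix of good and bad systems that are hard to tell apart. A framework that evaluates the systems in a uniform way is therefore needed. This paper proposes a unified evaluation framework for ERD systems that allows their performance to be compared fairly. The framework's code is open source, and it can be extended with new systems, datasets, and evaluation mechanisms. Evaluating an ERD system with the framework reveals the strengths and weaknesses of its individual modules. This paper analyzes and compares several publicly available ERD systems and summarizes some useful conclusions.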