Attention-based encoder-decoder model for answer selection in question answering
Yuan-ping NIE, Yi HAN, Jiu-ming HUANG, Bo JIAO, Ai-ping LI
One of the key challenges in question answering is bridging the lexical gap between questions and answers, since a question and its correct answer may share no matching words at all. Machine translation models have been shown to help close this lexical gap between question-answer pairs. In this paper, we introduce an attention-based deep learning model for the answer selection task in question answering. The proposed model employs a bidirectional long short-term memory (LSTM) encoder-decoder, an architecture that has proved effective in machine translation, to bridge the lexical gap between questions and answers. Our model also uses a step attention mechanism that allows the question to focus on a certain part of the candidate answer. Finally, we evaluate the model on a benchmark dataset, and the results show that our approach outperforms existing approaches. Integrating the model also significantly improves the performance of our question answering system in the TREC 2015 LiveQA task.
Question answering / Answer selection / Attention / Deep learning
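For illustration, the following is a minimal sketch of how an attention-based answer-selection scorer of the kind described in the abstract could look. It is not the authors' implementation: PyTorch, the class name, the hyperparameters (embedding and hidden sizes), the dot-product attention, and the cosine-similarity scoring are assumptions made for this example; the paper's actual encoder-decoder training objective and step attention mechanism are described in the full text.

```python
# Minimal sketch of an attention-based QA answer-selection scorer.
# Assumptions (not from the paper): PyTorch, a single shared BiLSTM encoder,
# dot-product attention, and cosine-similarity scoring.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentiveQAScorer(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Shared bidirectional LSTM encoder for questions and answers.
        self.encoder = nn.LSTM(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)

    def forward(self, question_ids, answer_ids):
        # Encode both sequences: (batch, seq_len, 2 * hidden_dim)
        q_states, _ = self.encoder(self.embed(question_ids))
        a_states, _ = self.encoder(self.embed(answer_ids))

        # Question vector: mean over time steps.
        q_vec = q_states.mean(dim=1)                      # (batch, 2h)

        # Attention: let the question focus on parts of the candidate answer.
        scores = torch.bmm(a_states, q_vec.unsqueeze(2))  # (batch, a_len, 1)
        alpha = F.softmax(scores, dim=1)
        a_vec = (alpha * a_states).sum(dim=1)             # (batch, 2h)

        # Relevance score used to rank candidate answers for a question.
        return F.cosine_similarity(q_vec, a_vec, dim=1)


if __name__ == "__main__":
    model = AttentiveQAScorer(vocab_size=10000)
    q = torch.randint(0, 10000, (2, 12))   # two toy questions
    a = torch.randint(0, 10000, (2, 40))   # two toy candidate answers
    print(model(q, a))                     # one score per question-answer pair
```

In such a setup, candidate answers would be ranked by the returned score for each question, with the top-scoring candidate selected as the answer.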