Incorporating contextual evidence to improve implicit discourse relation recognition in Chinese
Sheng XU, Peifeng LI, Qiaoming ZHU
Discourse analysis, which focuses on understanding the semantics of long text spans, has received increasing attention in recent years. As a critical component of discourse analysis, discourse relation recognition aims to identify the rhetorical relations between adjacent discourse units (e.g., clauses, sentences, and sentence groups), called arguments, in a document. Previous work has focused on capturing the semantic interactions between arguments to recognize their discourse relations, ignoring important textual information in the surrounding context. However, in many cases the semantic interactions captured from the text of the two arguments alone are not enough to identify their rhetorical relation, and additional contextual clues must be mined. In this paper, we propose a method that converts the RST-style discourse trees in the training set into dependency-based trees and trains a contextual evidence selector on these transformed structures. In this way, the selector learns to automatically pick critical textual information from the context (i.e., evidence) for the arguments to help discriminate their relations. We then encode the arguments concatenated with their corresponding evidence to obtain enhanced argument representations. Finally, we combine the original and enhanced argument representations to recognize the relations. In addition, we introduce auxiliary tasks that guide the training of the evidence selector and strengthen its selection ability. Experimental results on the Chinese CDTB dataset show that our method outperforms several state-of-the-art baselines in both micro and macro F1 scores.
discourse parsing / discourse relation recognition / contextual evidence selection
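The abstract does not spell out how RST-style trees are converted into dependency-based trees, so the following is only a minimal sketch under a stated assumption: it uses the common nuclearity-based head-promotion scheme, in which the head of a span is the head of its first nucleus child and every other child attaches to that head. The `Node` class, its fields, and the function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A node of a simplified RST-style discourse tree (hypothetical structure)."""
    edu: Optional[int] = None                         # leaf only: index of the discourse unit
    children: List["Node"] = field(default_factory=list)
    nuclei: List[int] = field(default_factory=list)   # internal only: positions of nucleus children


def head(node: Node) -> int:
    """Return the head unit of a span by following the first nucleus down to a leaf."""
    while node.edu is None:
        node = node.children[node.nuclei[0]]
    return node.edu


def to_dependencies(root: Node) -> Dict[int, int]:
    """Map every discourse unit to its governing unit; the root unit maps to -1."""
    deps = {head(root): -1}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.edu is not None:
            continue  # leaves have no children to attach
        governor = head(node)
        for child in node.children:
            dependent = head(child)
            if dependent != governor:
                deps[dependent] = governor  # non-head child hangs off the span's head
            stack.append(child)
    return deps


# Toy tree: ((unit 0 as satellite, unit 1 as nucleus) as nucleus, unit 2 as satellite)
inner = Node(children=[Node(edu=0), Node(edu=1)], nuclei=[1])
tree = Node(children=[inner, Node(edu=2)], nuclei=[0])
print(to_dependencies(tree))
```

In the toy tree, unit 1 becomes the root and units 0 and 2 both depend on it. Whether the evidence selector in the paper draws its training signal from exactly these head-dependent links is not stated in the abstract.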
Sheng Xu received his MS degree from Soochow University, China in 2019. He is now a PhD student in the School of Computer Science and Technology at Soochow University, China. His research interests include discourse relation recognition and event relation extraction.
Peifeng Li received his PhD degree in Computer Science from Soochow University, China in 2006. He has been a Professor in the School of Computer Science and Technology at Soochow University, China since 2015. His research interests include Chinese computing and information extraction.
Qiaoming Zhu received his PhD degree in Computer Science from Soochow University, China in 2006. He is now a Professor in the School of Computer Science and Technology at Soochow University, China. His research interests include Chinese computing and discourse analysis.