CausalBridgeQA: a causal inference-based approach for robust enhancement of multi-hop question answering

Xu JIANG, Yu-Rong CHENG, Bao-Quan MA, Jia-Xin LI, Yun-Feng LI

Front. Comput. Sci. ›› 2026, Vol. 20 ›› Issue (3): 2003605 DOI: 10.1007/s11704-025-41328-x
Information Systems
RESEARCH ARTICLE


Abstract

Multi-Hop Question Answering (MHQA) tasks require retrieving and reasoning over multiple relevant supporting facts to answer a question. However, existing MHQA models often rely on a single entity or fact to produce an answer rather than performing true multi-hop reasoning. Moreover, during reasoning, models may be influenced by multiple irrelevant factors, leading to broken reasoning chains and even incorrect answers. In recent years, causal inference-based methods have attracted widespread attention in bias-removal research, but existing models still perform poorly when dealing with the complex causal biases hidden in multi-hop evidence. To address these challenges, we propose CausalBridgeQA, a novel method that integrates multi-hop question answering with causal relationships, effectively mitigating spurious feature correlations and the problem of broken reasoning chains. Specifically, we first extract causal relationships from the input text context, then compile these relationships into causal questions carrying higher-level semantic information and feed them into the MHQA reasoning system. We also design a knowledge compensation mechanism in the reading comprehension module of the MHQA system that specifically targets questions that are difficult to answer or frequently answered incorrectly, significantly improving performance on MHQA tasks. Finally, a series of experiments on three real-world QA datasets verifies the effectiveness of the proposed method.
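The three-stage pipeline in the abstract (extract causal relations, compile them into causal questions, answer with a knowledge-compensation fallback) can be illustrated with a minimal, hypothetical sketch. The pattern-based extractor, function names, and dictionary fallback below are illustrative assumptions only; the authors' actual system presumably uses learned models at each stage.

```python
import re

# Toy surface patterns standing in for a learned causal-relation extractor.
CAUSAL_PATTERNS = [
    re.compile(r"(?P<cause>[^.]+?)\s+(?:causes|leads to|results in)\s+(?P<effect>[^.]+)"),
    re.compile(r"(?P<effect>[^.]+?)\s+(?:because of|due to)\s+(?P<cause>[^.]+)"),
]

def extract_causal_pairs(context):
    """Step 1: extract (cause, effect) relations from the input context."""
    pairs = []
    for sentence in context.split("."):
        for pat in CAUSAL_PATTERNS:
            m = pat.search(sentence)
            if m:
                pairs.append((m.group("cause").strip(), m.group("effect").strip()))
    return pairs

def compile_causal_questions(pairs):
    """Step 2: compile relations into higher-level causal questions
    to be fed into the MHQA reasoning system."""
    return [f"What does {cause} lead to?" for cause, _ in pairs]

def answer_with_compensation(question, reader, knowledge_base):
    """Step 3: answer with the reading-comprehension module; when it
    abstains (a hard or frequently-missed question), fall back to a
    knowledge-compensation lookup."""
    answer = reader(question)
    if answer is None:
        answer = knowledge_base.get(question, "unanswered")
    return answer
```

For example, from the context "Heavy rain leads to flooding.", step 1 yields the pair ("Heavy rain", "flooding") and step 2 compiles it into "What does Heavy rain lead to?", which step 3 answers either directly or via the compensation lookup.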


Keywords

multi-hop question answering / causal inference / explainable artificial intelligence

Cite this article

Xu JIANG, Yu-Rong CHENG, Bao-Quan MA, Jia-Xin LI, Yun-Feng LI. CausalBridgeQA: a causal inference-based approach for robust enhancement of multi-hop question answering. Front. Comput. Sci., 2026, 20(3): 2003605. DOI: 10.1007/s11704-025-41328-x



RIGHTS & PERMISSIONS

Higher Education Press
