Learning from shortcut: a shortcut-guided approach for explainable graph learning

Linan YUE, Qi LIU, Ye LIU, Weibo GAO, Fangzhou YAO

Front. Comput. Sci., 2025, Vol. 19, Issue (8): 198338
DOI: 10.1007/s11704-024-40452-4

Artificial Intelligence | RESEARCH ARTICLE

Abstract

The remarkable success of graph neural networks (GNNs) has promoted the development of explainable graph learning methods. Among them, graph rationalization methods have drawn significant attention; they aim to provide explanations that support the prediction results by identifying a small subset of the original graph (i.e., the rationale). Although existing methods have achieved promising results, recent studies have shown that they still suffer from exploiting shortcuts in the data to yield task results and compose rationales. Different from previous methods plagued by shortcuts, in this paper we propose a Shortcut-guided Graph Rationalization (SGR) method, which identifies rationales by learning from shortcuts. Specifically, SGR consists of two training stages. In the first stage, we train a shortcut guider with an early stop strategy to obtain shortcut information. During the second stage, SGR separates the graph into rationale and non-rationale subgraphs. SGR then lets them learn from the shortcut information generated by the frozen shortcut guider to identify which information belongs to shortcuts and which does not. Finally, we employ the non-rationale subgraphs as environments and identify the invariant rationales that filter out the shortcuts under environment shifts. Extensive experiments conducted on synthetic and real-world datasets provide clear validation of the effectiveness of the proposed SGR method, underscoring its ability to provide faithful explanations.


Keywords

explainable graph learning / graph rationalization / shortcut learning


1 Introduction

Graph Neural Networks (GNNs) have gained widespread adoption in various applications and have demonstrated high performance [1-4]. One prominent category of applications is graph classification tasks, such as predicting molecular graph properties [5-7]. Despite their success, GNNs in graph classification tasks often lack explainability and reliability in their prediction results. This limitation has motivated researchers [8,9] to explore explainable graph learning methods for providing explanations for GNNs. Among these methods, graph rationalization techniques [10,11] have gained increasing attention. These methods aim to produce task results while identifying a small subset of the original graph, known as the rationale. This rationale typically consists of significant nodes or edges in the graph. By extracting such a rationale, these methods provide an explanation for the prediction results obtained by GNNs. The extracted rationale serves as a concise representation that highlights the key elements or components influencing the prediction outcome.

Despite the appeal of graph rationalization methods, recent studies [12,13] have indicated that these approaches are susceptible to exploiting shortcuts (also known as spurious correlations) in the data to yield task results and compose rationales. This exploitation of shortcuts can potentially lead to the derivation of invalid or erroneous conclusions, thereby undermining the trustworthiness and reliability of the outputs produced by the model.

Consider Fig.1, where we predict the motif type based on a graph that consists of a motif subgraph and a base subgraph. In the training dataset, Cycle motifs frequently co-occur with Tree bases and House motifs are predominantly accompanied by Wheel bases, which may mislead GNNs into over-relying on these associations to achieve high accuracy, rather than discerning the true relationships between critical subgraphs (i.e., rationales) and the predicted labels. For example, GNNs may predict the motif type as Cycle when identifying the Tree bases or classify the motif type as House when recognizing the Wheel bases. However, this dependency on biases can result in inaccuracies when faced with out-of-distribution (OOD) data (e.g., the test dataset in Fig.1), such as incorrectly predicting a Cycle motif with Wheel bases as a House or misclassifying House motifs with Tree bases as Cycles.

In order to address this issue, a range of methods [14-16] have emerged in recent times, aiming to construct genuine rationales by capturing the invariant relationship that exists between rationales and their corresponding labels. These methods contend that the underlying rationale behind the labels remains consistent even when subjected to different environments. As a result, they utilize environment inference techniques to derive diverse latent environments and subsequently identify the invariant rationales that remain unchanged under shifts in the environment. By leveraging this approach, these methods aim to mitigate the impact of shortcuts and enhance the robustness and reliability of the rationales extracted from the data.

These methods are all based on the assumption that shortcuts are unknown. However, a direct approach is to explicitly identify which nodes in the graph are shortcuts, enabling us to use these shortcut nodes to train a de-biased model. Unfortunately, annotating shortcut nodes in each graph can be a labor-intensive task. An interesting alternative is to obtain latent shortcut representations, even without explicit knowledge of the shortcut nodes. Previous research [14,17-19] has shown that shortcut features are easier for models to learn than rationale features. Therefore, we can make an important assumption:

Assumption 1 During the initial stages of training, the features learned by the model are more likely to capture the features of shortcuts [20].

Based on this assumption, we can employ an early stopping strategy to obtain the shortcut representations. By doing so, we can capture the features learned by the model in the early stages, which are more likely to align with the shortcuts present in the data.

Along this line, in this paper, we propose a Shortcut-guided Graph Rationalization (SGR) method, which identifies significant nodes as rationales by learning from shortcuts. Specifically, our method involves two stages. In the first stage, we train a shortcut guider which is designed to intentionally capture the shortcuts in the data with the early stop strategy. In the second stage, we first freeze the trained shortcut guider and adopt it to generate the shortcut representation. Then, we separate the original input graph into rationale and non-rationale subgraphs, which are respectively encoded into representations. Next, we employ the shortcut guider to eliminate the shortcut information from the rationale subgraphs by minimizing the Mutual Information (MI) [21-23] between the shortcut and rationale representations. Meanwhile, we also let the shortcut guider encourage the non-rationale subgraphs and shortcut representations to encode the same information by maximizing MI [24]. Based on these MI estimation methods, the rationale and non-rationale subgraph representations can fully learn which information belongs to shortcuts and which does not. Finally, to further identify the invariant rationales under environment shifts, we consider the non-rationale representations, which sufficiently capture the shortcut information, as the environment. We then combine each rationale representation with various non-rationale representations, and encourage these combinations to maintain stable predictions and yield rationales. Experiments over ten datasets, including various synthetic [8,13] and OGBG [5] benchmark datasets, validate the effectiveness of our proposed SGR.

2 Related work

Graph rationalization The application of Graph Neural Networks (GNNs) to graph classification tasks has demonstrated significant achievements [6,25,26]. However, the interpretability of the prediction results remains a challenge, rendering many GNN models unreliable. To address this issue, recent studies [8,9,27] have proposed post-hoc methods to explain the prediction results of GNNs. These methods aim to provide explanations for the predictions made by GNNs after the models have been trained. By adopting post-hoc approaches, researchers seek to enhance the interpretability and transparency of GNN predictions, improving the trustworthiness and reliability of GNNs.

In contrast to the post-hoc methods mentioned, recent research has focused on inherently explainable methods [28-32] specifically designed for GNNs in graph classification tasks. Among them, graph rationalization methods have received considerable attention. However, recent studies [12] have demonstrated that rationalization methods tend to exploit shortcuts within the data to generate predictions and construct rationales, leading to potential inaccuracies. In response to this concern, a pioneering study [13] introduced the concept of discovering invariant rationales by creating multiple environments. The researchers initially divided the graph into rationale and non-rationale subgraphs, and then utilized the non-rationale subgraphs as distinct environments. By analyzing how rationales remain consistent across different environments, they identified invariant rationales that are unaffected by shifts in the environment. This approach aims to mitigate the influence of shortcuts and enhance the reliability of the extracted rationales in graph classification tasks. Various recent works [14-16,33] have followed this framework; the difference is that they consider non-rationale subgraph representations as latent environments rather than explicit non-rationale subgraph structures. Along another line of research, information bottleneck theory [34-38] was introduced into rationalization. Among them, GSAT [39] constrained the information flow from the input graph to the prediction and learned stochasticity-reduced attention to yield rationales.

Although most methods are effective in removing shortcuts and discovering rationales, few consider incorporating shortcut information into the model, enabling it to learn which information belongs to shortcuts and which does not.

Shortcut learning Shortcut learning [40,41] refers to the phenomenon where deep neural networks heavily rely on spurious correlations in the data as shortcuts to make predictions. While methods employing shortcuts can achieve high performance on identically distributed datasets, they often fail to capture the true underlying correlations between the input and the label. Therefore, when faced with out-of-distribution (OOD) data, their performance tends to degrade.

To address this issue, researchers have proposed various approaches [42-44]. Among them, [44,45] leveraged adversarial training to learn debiased representations, aiming to reduce the reliance on shortcuts and uncover the genuine correlations in the data. [33,46] partitioned the data into different environments and formulated predictions that are robust to shifts in the environment. [47] introduced the product-of-experts method, which involves training a bias-only model to obtain a debiased model that mitigates the influence of shortcuts. These techniques aim to enhance the generalization capability of models, improve performance on OOD data, and uncover the true causal relationships between inputs and labels by combating the issue of shortcut learning in deep neural networks.

3 Shortcut-guided graph rationalization

In this section, after describing the problem definition of graph rationalization, we present the detailed architecture of SGR, including the shortcut guider, selector, predictor and the strategy of learning from shortcuts. Finally, we provide a comprehensive overview of the training and inference procedures adopted within the SGR framework.

3.1 Problem definition

Considering graph classification tasks, we are given an input graph instance $g=(\mathcal{V},\mathcal{E})$ with $N$ nodes and $Z$ edges and its graph-level ground truth $y$, where $(g,y)\in\mathcal{D}_G$, $\mathcal{D}_G$ is the dataset, $\mathcal{V}$ is the set of nodes, $\mathcal{E}$ is the set of edges, and the adjacency matrix is $A\in\{0,1\}^{|\mathcal{V}|\times|\mathcal{V}|}$. Our goal is first to yield a rationale mask vector $M\in\mathbb{R}^{N}$ that represents the probability of each node being selected as the rationale. Then, the rationale subgraph representation is calculated as $h_r=\mathrm{READOUT}(M\odot \mathrm{GNN}_g(g))$, where $\mathrm{GNN}_g(\cdot)$ can be any GNN encoder (e.g., GIN [48]). Finally, the rationale representation $h_r$ is employed to yield task results. Take the case in Fig.1 for example: our goal is to predict the motif type while identifying the Cycle or House structure as the rationale to support the prediction results.

3.2 Architecture of SGR

To explicitly utilize the shortcut information to compose unbiased rationales, we propose the SGR method consisting of two stages. In the first stage, based on Assumption 1, we employ an early stopping strategy to obtain a shortcut guider that can fully learn the shortcut information. In the second stage, as shown in Fig.2, SGR involves a shortcut guider, a selector, and a predictor. Initially, we freeze the shortcut guider and use it to obtain the shortcut representation. We then adopt the selector to separate the original graph into rationale and non-rationale representations. Next, we use the MI estimation method to transfer the generated shortcut information to the rationale and non-rationale representations, ensuring that these representations can learn from shortcuts. Finally, the predictor yields prediction results based on the above rationale and non-rationale representations cooperatively.

3.2.1 Shortcut guider

While precisely identifying shortcut nodes poses a challenge, we make the assumption that representations of shortcuts are accessible. Specifically, previous research [14,18,19] suggests that shortcut features are easier to learn than rationale features, indicating that the features learned in the initial training stages are more inclined toward shortcuts [20]. Consequently, in the first stage, we deliberately train the shortcut guider to capture the pertinent shortcut information, employing an early stop strategy. Initially, we train the shortcut guider on the dataset $\mathcal{D}_G$ to predict the graph label:

$$H_s=\mathrm{GNN}_s(g),\qquad h_s=\mathrm{READOUT}(H_s),\qquad \hat{y}_s=\Phi_s(h_s).$$

Among them, $\mathrm{GNN}_s(\cdot)$ can be any GNN encoder such as GCN [1]. $H_s\in\mathbb{R}^{N\times d}$ denotes the node representations, and $h_s\in\mathbb{R}^{d}$ is the graph-level representation generated by a readout operator (mean pooling in this paper). $\Phi_s(\cdot)$ is a classifier which is applied to project $h_s$ to the graph label. Then, the prediction loss can be defined as:

$$\mathcal{L}_s=\mathbb{E}_{(g,y)\sim\mathcal{D}_G}\left[\ell(\hat{y}_s,y)\right],$$

where $\ell(\cdot)$ is the cross entropy loss. We then train the shortcut guider for only a few epochs (e.g., 3 epochs) to ensure that it captures more shortcut information than rationale information. Finally, we freeze the parameters of the shortcut guider and apply it to the second stage.
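To make the first stage concrete, the following is a minimal PyTorch-style sketch of a shortcut guider trained with the early stop strategy. The `GNNEncoder` interface, the data-loader format, and the `ShortcutGuider`/`train_shortcut_guider` names are illustrative assumptions rather than the authors' released implementation.

```python
import torch
import torch.nn.functional as F

class ShortcutGuider(torch.nn.Module):
    """Stage-one model: GNN_s(.), mean-pooling READOUT, and classifier Phi_s(.)."""
    def __init__(self, gnn_encoder, hidden_dim, num_classes):
        super().__init__()
        self.gnn = gnn_encoder                                      # any GNN encoder (e.g., GCN)
        self.classifier = torch.nn.Linear(hidden_dim, num_classes)  # Phi_s(.)

    def forward(self, graph):
        H_s = self.gnn(graph)              # node representations, shape [N, d]
        h_s = H_s.mean(dim=0)              # READOUT: mean pooling -> shape [d]
        y_hat_s = self.classifier(h_s)     # graph-level logits
        return h_s, y_hat_s

def train_shortcut_guider(guider, loader, early_stop_epochs=3, lr=1e-2):
    # Train for only a few epochs so the guider mostly captures shortcut features,
    # then freeze its parameters for use in the second stage.
    optimizer = torch.optim.Adam(guider.parameters(), lr=lr)
    for _ in range(early_stop_epochs):
        for graph, y in loader:            # one graph and its label per step (assumed loader format)
            _, y_hat_s = guider(graph)
            loss = F.cross_entropy(y_hat_s.unsqueeze(0), y.view(1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    for p in guider.parameters():
        p.requires_grad_(False)            # frozen shortcut guider used in stage two
    return guider
```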

3.2.2 Selector

To separate the original input into rationale and non-rationale subgraphs, the selector first generates $M\in\mathbb{R}^{N}$, which represents the probability of each node being selected as the rationale [33,49]:

$$M=\sigma(\Phi_m(\mathrm{GNN}_m(g))),$$

where $\Phi_m(\cdot)$ encodes each node into a score for selecting that node as the rationale, and $\sigma(\cdot)$ denotes the sigmoid function, which converts the scores into the probabilities of nodes being the rationale.

Then, the selector employs another GNN encoder to obtain the node representations $H_g=\mathrm{GNN}_g(g)$. Next, the rationale node representations can be defined as $M\odot H_g$, while the non-rationale node representations are formulated as $(1-M)\odot H_g$. Finally, the rationale subgraph representation $h_r$ and the non-rationale one $h_e$ can be obtained by a READOUT operation:

$$h_r=\mathrm{READOUT}(M\odot H_g),\qquad h_e=\mathrm{READOUT}((1-M)\odot H_g).$$
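The selector described above can be sketched as follows, assuming two generic GNN encoders that return node embeddings of shape [N, d]; the module and attribute names are illustrative, not the released code.

```python
import torch

class Selector(torch.nn.Module):
    """Splits a graph into rationale and non-rationale representations."""
    def __init__(self, gnn_m, gnn_g, hidden_dim):
        super().__init__()
        self.gnn_m = gnn_m                              # GNN_m(.): scores nodes for the mask M
        self.gnn_g = gnn_g                              # GNN_g(.): encodes nodes into H_g
        self.phi_m = torch.nn.Linear(hidden_dim, 1)     # Phi_m(.): per-node selection score

    def forward(self, graph):
        scores = self.phi_m(self.gnn_m(graph))          # [N, 1]
        M = torch.sigmoid(scores)                       # rationale probability for each node
        H_g = self.gnn_g(graph)                         # node representations, [N, d]
        h_r = (M * H_g).mean(dim=0)                     # READOUT(M ⊙ H_g)
        h_e = ((1.0 - M) * H_g).mean(dim=0)             # READOUT((1 - M) ⊙ H_g)
        return M, h_r, h_e
```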

3.2.3 Learning from shortcut by MI estimation

To eliminate the shortcut information in the rationale and alleviate the problem of employing shortcuts in the data for prediction, we adopt the shortcut guider to reduce the mutual information (MI) between the rationale subgraph representations and the shortcut representations. Here, MI measures the mutual dependence of two random variables (e.g., $X$ and $Y$) in probability theory and information theory:

$$I(X;Y)=\mathbb{E}_{p(x,y)}\left[\log\frac{p(x,y)}{p(x)p(y)}\right].$$

To achieve the goal of the shortcut guider, we first input the original graph $g$ into the selector to obtain the subgraph representations $h_r$ and $h_e$, respectively, as described in Section 3.2.2. We then keep the shortcut guider frozen and employ it to generate the shortcut representation $h_s$. Next, we employ the MI minimization method to ensure that the shortcut information can be removed from the rationale (i.e., $\min I(h_r;h_s)$), where $I(\cdot;\cdot)$ denotes the MI.

Meanwhile, we employ the MI maximization method to facilitate the matching of non-rationale representations with shortcut representations (i.e., $\max I(h_e;h_s)$), with the goal of enabling the full learning of shortcut information. Then, we consider the matched non-rationale representations as the environment and apply them to the predictor. Finally, the objective of learning from shortcuts is

$$\mathcal{L}_{shortcut}=I(h_r;h_s)-I(h_e;h_s).$$

In the implementation phase, calculating the MI values becomes challenging when dealing with high-dimensional random variables. To overcome this issue, we resort to estimating the lower and upper bounds of MI in order to optimize MI maximization and minimization.

For MI maximization tasks, we utilize the InfoNCE method proposed by van den Oord et al. [24]:

$$I_{nce}=\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{f(x_i,y_i)}}{\frac{1}{N}\sum_{j=1}^{N}e^{f(x_i,y_j)}}=\frac{1}{N}\sum_{i=1}^{N}f(x_i,y_i)-\frac{1}{N}\sum_{i=1}^{N}\left[\log\frac{1}{N}\sum_{j=1}^{N}e^{f(x_i,y_j)}\right],$$

where $\{(x_i,y_i)\}_{i=1}^{N}$ represents a batch of sample pairs of $(X,Y)$ and $f(\cdot,\cdot)$ is a learnable score (critic) function. This method enables us to estimate a lower-bound approximation of the mutual information and serves as our approach for achieving MI maximization.
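The sketch below shows one way such an InfoNCE lower bound could be estimated over a batch of representation pairs (e.g., non-rationale/shortcut pairs); the bilinear critic is an assumed instantiation of $f(x,y)$, not necessarily the one used in the paper.

```python
import math
import torch

class InfoNCECritic(torch.nn.Module):
    """Bilinear critic f(x, y) = x^T W y and the batch InfoNCE lower bound."""
    def __init__(self, dim_x, dim_y):
        super().__init__()
        self.W = torch.nn.Parameter(0.01 * torch.randn(dim_x, dim_y))

    def scores(self, x, y):
        # x: [B, dim_x], y: [B, dim_y]; scores[i, j] = f(x_i, y_j)
        return x @ self.W @ y.t()

    def lower_bound(self, x, y):
        s = self.scores(x, y)                                    # [B, B]
        positive = s.diag()                                      # f(x_i, y_i)
        log_mean_exp = torch.logsumexp(s, dim=1) - math.log(x.size(0))
        return (positive - log_mean_exp).mean()                  # lower-bound estimate of I(X; Y)

# Maximizing I(h_e; h_s) can then be done by minimizing -critic.lower_bound(h_e, h_s).
```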

For MI minimization tasks, we adopt CLUB_NCE [23] to achieve MI minimization, where CLUB_NCE is a variant of CLUB [22] that is designed to estimate the upper bound of MI. To be specific, the criterion of CLUB is

$$I_{club}=\frac{1}{N}\sum_{i=1}^{N}\log p(y_i|x_i)-\frac{1}{N^{2}}\sum_{i=1}^{N}\sum_{j=1}^{N}\log p(y_j|x_i),$$

where $p(y|x)$ is a conditional distribution. Further, [23] develops a new MI minimization method, CLUB_NCE, which combines InfoNCE and CLUB. CLUB_NCE first adopts the critic $f(x,y)$ trained by InfoNCE to replace $\log p(y|x)$ in CLUB. Then, it calculates the value of $I_{club}$ based on the trained $f(x,y)$ and minimizes $I_{club}$ to achieve MI minimization. A detailed description of CLUB_NCE can be found in [23].
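Given a critic $f(x,y)$ trained with InfoNCE as above, the CLUB-style upper bound used for MI minimization can be sketched as follows; this follows the textual description of CLUB_NCE and is only an illustrative approximation of [23].

```python
import torch

def club_nce_upper_bound(critic, x, y):
    """Upper-bound estimate of I(X; Y): the critic f replaces log p(y|x) in the CLUB criterion."""
    s = critic.scores(x, y)            # pairwise scores f(x_i, y_j), shape [B, B]
    positive = s.diag().mean()         # (1/N) * sum_i f(x_i, y_i)
    negative = s.mean()                # (1/N^2) * sum_i sum_j f(x_i, y_j)
    return positive - negative

# Minimizing I(h_r; h_s) then amounts to adding club_nce_upper_bound(critic, h_r, h_s)
# to the loss, pushing the rationale representation away from the shortcut representation.
```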

3.2.4 Predictor

Within the predictor module, we first adopt the rationale representation to predict the graph label, employing the cross entropy loss $\mathcal{L}_r=\mathbb{E}_{(g,y)\sim\mathcal{D}_G}[\ell(\hat{y}_r,y)]$, where $\hat{y}_r=\Phi_p(h_r)$ and $\Phi_p(\cdot)$ is a shared classifier. To ensure the generation of invariant rationales amidst environmental shifts, we consider the non-rationale representations as distinct environments, drawing inspiration from previous works [14,33]. Specifically, we map a batch of sample pairs $\{(g_i,y_i)\}_{i=1}^{K}$ to their respective representations $\{(h_r^i,h_e^i,y_i)\}_{i=1}^{K}$. Subsequently, as the non-rationale (i.e., environment) components should not influence the task prediction, we combine each rationale representation $h_r^i$ with all non-rationale representations $h_e^j$ ($j\neq i$) within the same batch. This process induces environment shifts:

$$h_{i,j}=h_r^i+h_e^j.$$

It is important to note that the corresponding labels remain unchanged, as the rationale information in the combined representations remains unaltered. Subsequently, we input the merged graph representations into the shared classifier $\Phi_p(\cdot)$ to obtain the task-related outcomes. The loss is then computed using the cross entropy function:

$$\hat{y}_{i,j}=\Phi_p(h_{i,j}),\qquad \mathcal{L}_e=\mathbb{E}_i\left[\mathbb{E}_j\left[\ell(\hat{y}_{i,j},y_i)\right]\right].$$

Finally, in order to ensure consistent predictions across different environments and address potential instability resulting from environmental changes between augmented and original data, we commence by quantifying the differences between $\hat{y}_r^i$ and $\hat{y}_{i,j}$ (i.e., $D_f(\hat{y}_r^i;\hat{y}_{i,j})$), where $D_f(\cdot)$ represents any distance function, such as the squared Euclidean distance. Subsequently, to align the predicted distributions across environments with those predicted using rationale representations, we minimize the mean and variance of these differences:

$$\mathcal{L}_{diff}=\mathbb{E}_i\left[\mathbb{E}_j\left[D_f(\hat{y}_r^i;\hat{y}_{i,j})\right]+\mathrm{Var}_j\left[D_f(\hat{y}_r^i;\hat{y}_{i,j})\right]\right].$$
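The predictor losses above can be sketched as follows for a batch of $K$ rationale and non-rationale representations, using the shared classifier $\Phi_p(\cdot)$ and the squared Euclidean distance as $D_f$; the function name and batching details are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def predictor_losses(phi_p, h_r, h_e, y):
    """Compute L_r, L_e, and L_diff for batched representations h_r, h_e of shape [K, d]."""
    K = h_r.size(0)
    y_hat_r = phi_p(h_r)                               # predictions from rationales, [K, C]
    loss_r = F.cross_entropy(y_hat_r, y)               # L_r

    loss_e_terms, diff_terms = [], []
    for i in range(K):
        mask = torch.arange(K, device=h_r.device) != i
        h_ij = h_r[i].unsqueeze(0) + h_e[mask]         # h_{i,j} = h_r^i + h_e^j, [K-1, d]
        y_hat_ij = phi_p(h_ij)                         # labels stay y_i under environment shifts
        loss_e_terms.append(F.cross_entropy(y_hat_ij, y[i].repeat(K - 1)))
        d = ((y_hat_ij - y_hat_r[i].unsqueeze(0)) ** 2).sum(dim=1)    # squared Euclidean D_f
        diff_terms.append(d.mean() + d.var(unbiased=False))           # mean + variance over j
    loss_e = torch.stack(loss_e_terms).mean()          # L_e
    loss_diff = torch.stack(diff_terms).mean()         # L_diff
    return loss_r, loss_e, loss_diff
```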

3.3 Training and inference

During the training process, in order to encourage the model to regulate the anticipated size of the rationale subgraphs, we adopt a sparsity constraint on the rationale selection probabilities $M$, inspired by the approach outlined in [33]:

$$\mathcal{L}_{sp}=\left|\frac{1}{N}\sum_{i=1}^{N}M_i-\alpha\right|,$$

where $\alpha\in[0,1]$ is the predefined sparsity level. The objective of SGR in the second stage is

$$\mathcal{L}_{sgr}=\mathcal{L}_r+\mathcal{L}_e+\lambda_{diff}\mathcal{L}_{diff}+\lambda_{shortcut}\mathcal{L}_{shortcut}+\lambda_{sp}\mathcal{L}_{sp},$$

where $\lambda_{diff}$, $\lambda_{shortcut}$, and $\lambda_{sp}$ are hyperparameters. The detailed training process is presented in Algorithm 1. At inference time, only $h_r$ is employed to yield the task results.
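Putting the pieces together, the second-stage objective can be assembled as in the following sketch, with default hyperparameter values taken from Section 4.3; the helper outputs are the illustrative ones sketched above, not the authors' exact training code.

```python
def sgr_second_stage_loss(loss_r, loss_e, loss_diff, loss_shortcut, M,
                          alpha=0.4, lam_diff=0.1, lam_shortcut=0.01, lam_sp=1.0):
    # Sparsity constraint L_sp = | mean(M) - alpha | on the rationale probabilities.
    loss_sp = (M.mean() - alpha).abs()
    # L_sgr = L_r + L_e + lambda_diff * L_diff + lambda_shortcut * L_shortcut + lambda_sp * L_sp
    return loss_r + loss_e + lam_diff * loss_diff + lam_shortcut * loss_shortcut + lam_sp * loss_sp
```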

4 Experiments

In this section, to verify the reasonableness and effectiveness of SGR, we design experiments to address the following research questions:

RQ1: Do GNNs learn shortcuts during the initial training?

RQ2: How does SGR perform in terms of improving task prediction and rationale extraction?

RQ3: What are the roles and impacts of different components within SGR on its overall performance?

RQ4: Can the “learning from shortcuts” framework enhance the performance of existing rationale-based methods?

RQ5: Does SGR effectively capture significant rationales for predictions within the given dataset?

By addressing these research questions, we aim to provide comprehensive insights into the performance and capabilities of SGR, thereby verifying its viability and effectiveness in the context of graph rationalization.

4.1 Datasets

Here, we conduct experiments on four synthetic datasets and six real-world benchmark datasets to evaluate the performance of our proposed approach for graph rationalization. Details of dataset statistics are summarized in Tab.1 and Tab.2.

Spurious-Motif [8,13] is a synthetic dataset for predicting the motif category of each graph. Each graph consists of two subgraphs, the motif subgraph (Cycle, House, Crane, denoted by $R=0,1,2$, respectively) and the base subgraph (Tree, Ladder, Wheel, denoted by $E=0,1,2$, respectively). Among them, the motif subgraph is regarded as the ground-truth explanation (i.e., rationale) for the graph label, which suggests the graph label is solely determined by the motif subgraph. The base subgraph can be considered as the non-rationale (or environment). To verify the effectiveness of SGR, we manually generate several datasets containing shortcuts.

Specifically, we construct the training dataset by sampling each motif uniformly, while controlling the distribution of the base through

$$P(E)=b\times \mathbb{I}(E=R)+\frac{1-b}{2}\times \mathbb{I}(E\neq R),$$

where the degree of spurious correlation is controlled by $b$. In this paper, we set $b\in\{0.5,0.7,0.9\}$.
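As an illustration of this sampling rule (not the authors' dataset generator), the base type for a graph with motif $R$ could be drawn as follows.

```python
import random

def sample_base(motif, b):
    """Sample a base type E in {0, 1, 2} for a given motif R under P(E)."""
    # With probability b the base matches the motif (E = R); otherwise one of the
    # two remaining base types is chosen uniformly, i.e., with probability (1 - b) / 2 each.
    if random.random() < b:
        return motif
    return random.choice([e for e in (0, 1, 2) if e != motif])
```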

Besides, to verify whether shortcuts are captured in the initial training stages, we first create a balanced dataset (i.e., $b=\frac{1}{3}$), where each motif contains 1,000 training instances, for a total of 3,000 instances. Then, based on this balanced dataset, we intentionally add 1,000 additional instances that are all Cycle motifs with Tree bases, introducing a spurious Cycle-Tree correlation. In the test dataset, we match motifs and bases randomly ($b=\frac{1}{3}$) to construct an unbiased test set.

Graph-SST2 [50] is a text sentiment analysis dataset, where each text instance in SST2 is converted to a graph. Following [13,14], to create distribution shifts, we divide the graphs into different sets according to their average node degrees, where the node degrees in the training set are higher than degrees in the test set. By dividing the graphs based on node degrees, the dataset aims to create variations in the graph characteristics between the training and test sets, allowing for the evaluation of models under distribution shifts.

Open Graph Benchmark (OGBG) [5] is a benchmark dataset for machine learning on graphs, where we consider five OGBG-Mol datasets that are employed for molecular property prediction (i.e., MolHIV, MolToxCast, MolBACE, MolBBBP and MolSIDER). Each dataset focuses on a specific molecular property prediction task. We split datasets into train, validation and test set. The default approach is to divide them based on scaffolds. Scaffolds are molecular substructures that serve as a structural framework for molecules. The goal of using scaffolds is to ensure that the splits contain molecules with different structural features.

4.2 Comparison methods

First, we compare SGR with the classical GNN methods GCN [1] and GIN [48]. We then compare it with several competitive baselines specifically designed for explainable GNNs:

DIR [13] discovers invariant rationales by separating the graph into rationale and non-rationale subgraphs. Different from SGR, DIR explicitly generates multiple environments through the utilization of non-rationale subgraphs.

DisC [14] proposes a comprehensive disentangling framework to learn the causal and bias substructure by partitioning the graph into rationale and non-rationale subgraphs. Then, DisC employs a general approach that incorporates the synthesis of counterfactual training samples to further de-correlate the causal and bias variables.

RGDA [49] proposes a general counterfactual data augmentation method for graph node classification and graph-level classification. In this paper, we employ RGDA for graph-level classification. It employs each non-rationale as the environment to create counterfactual samples. Different from SGR, RGDA ignores the instability of the prediction results between the counterfactual data and the original data due to environmental changes.

CAL [15] introduces the Causal Attention Learning strategy as a novel approach for discovering causal rationales and mitigating the confounding effect of shortcuts by disentangling the graph into rationale and non-rationale subgraphs. Although DisC, RGDA, and CAL all compose rationales by taking non-rationale subgraph representations as environments, there exist several differences among them. Specifically, DisC selects edges as rationales, RGDA identifies nodes as rationales, and CAL considers both edges and nodes.

GSAT [39] injects stochasticity to the attention weights to block the information from non-rationale subgraphs. Meanwhile, it learns stochasticity-reduced attention to select rationale subgraphs for explainability based on the information bottleneck principle [34,35].

DARE [23] proposes a disentanglement-augmented method to extract rationales. Meanwhile, it introduces CLUB_NCE to improve MI minimization. Although DARE is designed for explaining natural language understanding tasks [10,51,52], it can naturally be applied to explain GNNs.

Besides, in our experiments, we implement all of the explainable baselines (DIR [13], DisC [14], RGDA [49], CAL [15], GSAT [39], and DARE [23]) based on their released codes, employing both GCN [1] and GIN [48] as the graph encoder, respectively.

4.3 Experimental setup

Metrics In this paper, following the metric setting of [13,14], we employ ACC to evaluate the task prediction performance for Spurious-Motif and Graph-SST2, and ROC-AUC for OGBG. Furthermore, considering that the Spurious-Motif dataset contains ground-truth rationales, the Precision@5 metric is adopted to assess the correspondence between the predicted rationales and the ground-truth ones. This metric quantifies the precision of the top 5 extracted rationales in comparison to the ground truth rationales.

Optimization and hyperparameters Across all our experiments, we have maintained the following parameter settings: $\lambda_{diff}$, $\lambda_{shortcut}$, and $\lambda_{sp}$ are set to 0.1, 0.01, and 1.0, respectively. The hidden dimensionality $d$ is configured as 32 for the Spurious-Motif dataset, 64 for the Graph-SST2 dataset, and 128 for the OGBG datasets. Regarding the Adam optimizer [53], we initialize the learning rate as 1e-2 for both the Spurious-Motif and Graph-SST2 datasets, while for the OGBG datasets, the learning rate is set to 1e-3. For the predefined sparsity parameter $\alpha$, we set it as 0.1 for the MolHIV dataset, 0.5 for the MolSIDER, MolToxCast, and MolBBBP datasets, and 0.4 for the remaining datasets. The early stop epoch is set to 2 or 3 for the Spurious-Motif dataset and 3 or 4 for both the Graph-SST2 and OGBG datasets.

We employ the squared Euclidean distance as $D_f(\cdot)$. All methods are trained using five different random seeds on a single A100 GPU.

How to decide the epoch of the early stop strategy? In the process of determining the epoch for the early stop strategy, a specific range of epochs, [1,5], is initially defined. Subsequently, for the Spurious-Motif dataset, the selection of the epoch is based on the results depicted in Fig.3; from these results, either epoch 2 or epoch 3 is chosen. To further refine the selection, a grid search is conducted to identify the optimal epoch for the early stop strategy. Through this iterative process, it is determined that epoch 2 yields the best results for the Spurious-Motif dataset. Similarly, following the grid search, either epoch 3 or epoch 4 is selected as the most suitable epoch for the Graph-SST2 and OGBG datasets.

4.4 Shortcuts capturing (RQ1)

Due to the inability to explicitly identify which nodes serve as shortcuts, a hypothesis is proposed that assumes shortcut features are relatively easier to learn compared to rationale features. To validate this assumption, a series of experiments are conducted in this section. These experiments aim to provide empirical evidence and insights regarding the ease of learning shortcut information compared to rationale information within the given context. First, [18,20] have empirically proved:

Theorem 1 If the malignant bias is easier to learn than the real relationship between the input and label, the neural network tends to memorize it first.

Since GNNs are a type of neural network, we argue that they also adhere to the theorem mentioned above. Therefore, in line with Theorem 1, we propose that by demonstrating that shortcuts are malignant biases and that shortcuts are easier to learn than the actual relationships between the input and the label, we can infer that these shortcuts are more likely to be captured during the initial training phase. To validate this hypothesis, we conduct experiments on the Spurious-Motif dataset. In addition to the unbiased test set, we construct biased test sets with varying degrees of bias (b) to match the biases present in the corresponding training sets. Subsequently, we apply GCN and GIN models on these datasets. The results, as shown in Fig.4, indicate that GCN and GIN achieve promising results (almost 100% accuracy) on the biased test sets. However, when evaluating on the unbiased test set, the performance of GCN and GIN exhibits a significant degradation. These findings suggest that the shortcuts present in the Spurious-Motif dataset are indeed malignant biases. Furthermore, as the degree of bias (b) increases (ranging from 0.5 to 0.9), the performance of GNNs on the biased test set improves, while the performance on the unbiased test set declines. This trend illustrates that GNNs are more inclined to make predictions based on shortcuts rather than the actual relationships between the input and the label. In summary, the experimental results on the Spurious-Motif dataset provide evidence that GNNs capture shortcuts as malignant biases, prioritize them over the actual relationships, and are more likely to learn these shortcuts during the initial stages of training.

Besides, we conduct additional experiments to demonstrate that shortcuts are captured during the early epochs of training. Specifically, we carry out experiments on the Spurious-Motif dataset, focusing on the Cycle-Tree scenario. As described in Section 4.1, this dataset comprises balanced data and biased data. The biased data consists of Cycle motifs accompanied by Tree bases. Fig.3(a) illustrates the training loss curve for both the balanced and biased examples. It can be observed that after only 2 epochs of training, the training loss for the biased examples approaches zero, while the loss for the balanced examples does not converge. This indicates that the biased examples are significantly easier to learn compared to the balanced examples during the training process. These findings provide compelling evidence that shortcuts are indeed captured early on during the training phase, as demonstrated by the rapid convergence of the biased examples’ training loss compared to the more persistent loss of the balanced examples.

Furthermore, we also conduct experiments on two additional synthetic datasets. These datasets are structured such that the bias data consists of Cycle motifs accompanied by Ladder bases (referred to as Cycle-Ladder), and Cycle motifs accompanied by Wheel bases (referred to as Cycle-Wheel). The results obtained from these experiments show a similar pattern to the observations in Fig.3(a), as depicted in Fig.3(b) and Fig.3(c). These consistent findings indicate that the bias features are generally easier to learn compared to the rationale features across different dataset variations. This further strengthens the empirical evidence supporting the assumption made in the study regarding the relative ease of learning shortcut information compared to rationale information.

Meanwhile, we examine the variation of accuracy for both balanced and biased examples throughout the training phase. Fig.3(a) illustrates this variation, revealing an interesting observation. It is evident that the accuracy of biased examples reaches a high performance level early on in the training process, while the balanced examples require a greater number of epochs to achieve a comparable level of performance. This discrepancy implies that the model is able to learn and generalize the patterns present in biased examples more quickly and effectively compared to the more complex patterns exhibited by the balanced examples.

Moreover, we also conduct experiments on the real-world dataset (MolBACE). Since we cannot determine which data belongs to the biased category in this dataset, we focus on demonstrating the effectiveness of the shortcut guider. To achieve this, we obtain different shortcut guiders at various training epochs ranging from 1 to 10 during the first stage of SGR. Subsequently, these trained shortcut guiders are incorporated into SGR during the second stage, and the results are presented in Fig.5. From the figure, it can be observed that the model’s performance initially improves as the number of epochs increases. This indicates that the shortcut guider successfully captures shortcut features during the early stages of training. However, after surpassing the third epoch, the effectiveness of SGR gradually diminishes. This suggests that the shortcut guider gradually transitions from capturing shortcut features to rationale features as the training progresses. The above observations confirm that shortcut information is more likely to be learned in the early stages of training.

4.5 Overall performance on both synthetic and real-world datasets (RQ2)

To assess the effectiveness of SGR, we first conduct a comparative analysis with several baseline methods on task prediction. The results of these experiments are presented in Tab.3 and Tab.4. Upon analyzing the results, we observe that SGR outperforms classical GCN and GIN in terms of task prediction performance and generalizability, showcasing the effectiveness of SGR in achieving improved task prediction and generalization capabilities.

Specifically, on the Spurious-Motif data, all methods are trained on the biased dataset and the results are reported on the unbiased test set. From the experimental results, it can be found that SGR outperforms these base models by a large margin. Meanwhile, our model consistently performs better than GIN and GCN on both OGBG and Graph-SST2. Among them, SGR gains a 4.98% improvement over GIN and a 6.94% improvement over GCN on the MolHIV dataset. Since SGR takes GCN and GIN as the backbone respectively, the experimental results suggest that our proposed method can well help existing GNNs mitigate the negative impact of bias.

Then, SGR is also superior to the de-biased baselines and performs well on most tasks, indicating the effectiveness of SGR. Among them, DIR performs poorly on most of the datasets; a possible reason is that it explicitly takes non-rationale subgraphs as environments, which loses some contextual information. In contrast, the DisC, RGDA, and CAL methods utilize latent non-rationale subgraph representations as environments, resulting in significant improvements compared to the DIR approach. This observation highlights the effectiveness of incorporating non-rationale representations. However, SGR still outperforms these approaches, indicating that introducing shortcuts during the training phase and allowing the model to learn from them is beneficial. On the other hand, the GSAT method, which does not consider the non-rationale information in the data, achieves only average performance. This further emphasizes the advantage of incorporating non-rationale representations. DARE separates the graph into rationale and non-rationale subgraphs by minimizing mutual information (CLUB_NCE). However, DARE does not address the shortcut problem, resulting in its effectiveness being lower than that of SGR.

Finally, to further analyze whether SGR captures invariant rationales, we conduct additional experiments by comparing it with baseline methods on the Spurious-Motif dataset, which includes ground-truth rationales. Precision@5 is employed as the evaluation metric to measure the correspondence between the identified rationales and the actual ones. The experimental results are presented in Fig.6. From these observations, we find that SGR outperforms other methods in identifying invariant rationales, irrespective of the varying degrees of shortcuts present in the data. These findings provide empirical evidence that SGR possesses a distinct advantage over alternative approaches when it comes to discovering invariant rationales, regardless of the degree of shortcuts present in the data.

4.6 Ablation studies (RQ3)

To verify the importance of the different components of the model, we construct ablation studies from three aspects:

(i). We remove the shortcut guider (i.e., we ablate $\mathcal{L}_{shortcut}$ in Eq. (13)). We name this variant SGR w/o shortcut.

(ii). We remove $\mathcal{L}_{diff}$ (denoted by SGR w/o diff) to verify whether $\mathcal{L}_{diff}$ can make the predictions stable across different environments.

(iii). We ablate both $\mathcal{L}_e$ and $\mathcal{L}_{diff}$ (SGR w/o env) to demonstrate the effectiveness of non-rationale representations that are considered as environments.

Here, we conduct ablation studies on the OGBG datasets, where SGR is implemented with GIN. As shown in Fig.7, we observe that the performance of SGR w/o shortcut decreases significantly compared to the original SGR. This emphasizes the importance of learning from shortcut information in SGR. Furthermore, SGR w/o shortcut performs at a similar level to certain baselines, such as CAL, when shortcut information is not incorporated, which further highlights the effectiveness of learning from shortcuts. Additionally, when we remove the environment component (SGR w/o env) from SGR, which retains only the $\mathcal{L}_{shortcut}$ auxiliary loss, we still observe improved performance compared to several baselines. This demonstrates the significance of our proposed shortcut guider in guiding the learning process. However, SGR w/o env is still less effective than the original SGR, indicating that incorporating non-rationale representations as environments is beneficial for composing rationales. Furthermore, we find that SGR w/o diff performs worse than the original SGR, with a decrease of 0.99% on the MolHIV dataset. This illustrates the instructive role of the $\mathcal{L}_{diff}$ loss in identifying invariant rationales. These observations collectively demonstrate the importance of each component within SGR: the incorporation of shortcut information, the presence of the environment component, and the utilization of the difference loss all contribute to the improved performance and effectiveness of the SGR model.

4.7 Generalizability of the “learning from shortcuts” framework (RQ4)

In this paper, we introduce a framework known as “learning from shortcuts”, which facilitates the learning process of rationale and non-rationale subgraph representations by effectively discerning information associated with shortcuts and distinguishing it from other information. We contend that the integration of our framework with existing rationale-based methods can enhance their performance, as elucidated in Section 3.2.3. To validate this claim, we conduct experiments on OGBG data, combining our framework with established rationalization baselines including DisC, RGDA, CAL, and DARE.

As shown in Tab.5, the results consistently demonstrate that the incorporation of the “learning from shortcuts” framework leads to performance improvements across all rationale-based baselines. In particular, when GCN is employed as the backbone, CAL augmented with the “learning from shortcuts” framework achieves an average improvement of 2.82% on OGBG data compared to the original CAL. The above observation serves as compelling evidence substantiating the effectiveness of our proposed “learning from shortcuts” framework.

4.8 Visualizations (RQ5)

In order to provide qualitative insights into the identified rationale subgraphs, a comprehensive analysis is conducted on Fig.8 and Fig.9. First, Fig.8 showcases the rationales selected by different methods, including baselines and SGR, on the Cycle-Tree dataset. A testing example with a motif label of House is visualized in this figure. The red lines represent the edges of the rationale subgraph, while the navy blue nodes indicate the rationale nodes.

Among them, DIR and GSAT identify the edges as rationales. SGR and RGDA select the significant nodes as rationales. To make the visualization more intuitive, in SGR and RGDA, we assume that if there is an edge between the two identified nodes, we will visualize this edge as well. From this figure, we can observe that our method can identify more accurate rationales than baselines. More cases can be found in Fig.10-Fig.12, where we present the extracted rationale subgraphs with different types of graphs.

In addition to the previous analyses, several cases of identified rationales for the Graph-SST2 dataset are presented in Fig.9. This dataset encompasses both positive and negative text sentiments. In Fig.9(a) and Fig.9(b), the positive and negative examples from the training set are visualized, respectively. Among them, we find SGR can accurately highlight some positive tokens (“the film was better”) in Fig.9(a) and some negative tokens such as “the opposite of... magical movie” in Fig.9(b). Furthermore, the effectiveness of the extracted rationales is showcased on the out-of-distribution (OOD) test set. This test set comprises nodes with degrees lower than those in the training set. Specifically, SGR selects “quite effective” and “astonishingly witless” in Fig.9(c) and Fig.9(d) to support the prediction results, respectively. These examples indicate that SGR can effectively extract the genuine rationale subgraphs, even in scenarios where the node degrees differ between the training and test sets. Based on these observations, it can be concluded that SGR exhibits effectiveness in extracting real rationale subgraphs, contributing to improved understanding and explainability of the model’s predictions.

5 Discussion

5.1 Limitations

Although in Section 4.4 our experiments validated that graph learning methods are more likely to learn shortcut features in the early stages of training, this assumption does not hold if there are no significant shortcut learning problems in the dataset. In such cases, using SGR will not be effective and might even reduce the model’s performance by treating real explanations as shortcuts. Therefore, before applying SGR, we should first detect whether shortcut learning problems are significant in the current scenario. One feasible detection approach is to test a vanilla rationalization method on an out-of-distribution (OOD) dataset. If the model’s performance drops significantly compared to its performance on the in-distribution dataset, we can infer that shortcut learning problems are severe in the current scenario, and our SGR can be used to address them.

5.2 Computational overhead

SGR has two computational overhead bottlenecks. The first is the use of the early stopping strategy, which requires us to first train a shortcut guider. Although this strategy introduces some computational overhead, according to our experiments the early stopping rounds are generally set to fewer than 5 epochs, so this overhead is manageable. Besides, the mutual information (MI) estimation method requires the introduction of a parameterized discriminator function (e.g., $f(x,y)$ in Eq. (7)), which increases the computational overhead. This might be challenging in some resource-constrained scenarios. To address this problem, one feasible approach is to use other loss functions as alternatives to MI estimation, such as the mean squared error (MSE) function.

6 Conclusion

In this paper, we proposed a shortcut-guided graph rationalization (SGR) method for explainable graph learning, which identified rationale subgraphs by learning from shortcuts. To be specific, SGR involved two stages. In the first stage, a shortcut-only model (shortcut guider) was explicitly trained to capture the shortcut information in data with an early stop strategy. During the second stage, SGR separated the input graph into the rationale subgraph representations and the non-rationale ones. Then, the frozen shortcut guider was employed to transfer the shortcut information to the above subgraph representations, ensuring the rationale representations could be kept away from the shortcut and the non-rationale ones could encode the same information with shortcuts. Finally, we employed the non-rationale subgraphs as the environment and extracted invariant rationales capable of withstanding shifts in the environment. Experimental evaluations conducted on both synthetic and real-world datasets validated the effectiveness of SGR.

References

[1] Kipf T N, Welling M. Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th International Conference on Learning Representations. 2017
[2] Wu Z, Gan Y, Xu T, Wang F. Graph-segmenter: graph transformer with boundary-aware attention for semantic segmentation. Frontiers of Computer Science, 2024, 18(5): 185327
[3] Liang Y, Song Q, Zhao Z, Zhou H, Gong M. BA-GNN: behavior-aware graph neural network for session-based recommendation. Frontiers of Computer Science, 2023, 17(6): 176613
[4] Wu Y, Huang H, Song Y, Jin H. Soft-GNN: towards robust graph neural networks via self-adaptive data utilization. Frontiers of Computer Science, 2025, 19(4): 194311
[5] Hu W, Fey M, Zitnik M, Dong Y, Ren H, Liu B, Catasta M, Leskovec J. Open graph benchmark: datasets for machine learning on graphs. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020, 1855
[6] Guo Z, Zhang C, Yu W, Herr J, Wiest O, Jiang M, Chawla N V. Few-shot graph learning for molecular property prediction. In: Proceedings of the Web Conference 2021. 2021, 2559−2567
[7] Yehudai G, Fetaya E, Meirom E A, Chechik G, Maron H. From local structures to size generalization in graph neural networks. In: Proceedings of the 38th International Conference on Machine Learning. 2021, 11975−11986
[8] Ying R, Bourgeois D, You J, Zitnik M, Leskovec J. GNNExplainer: generating explanations for graph neural networks. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems. 2019, 829
[9] Luo D, Cheng W, Xu D, Yu W, Zong B, Chen H, Zhang X. Parameterized explainer for graph neural network. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020, 1646
[10] Lei T, Barzilay R, Jaakkola T. Rationalizing neural predictions. In: Proceedings of 2016 Conference on Empirical Methods in Natural Language Processing. 2016, 107−117
[11] Wang X, Wu Y X, Zhang A, He X, Chua T S. Towards multi-grained explainability for graph neural networks. In: Proceedings of the 35th International Conference on Neural Information Processing Systems. 2024, 1410
[12] Chang S, Zhang Y, Yu M, Jaakkola T S. Invariant rationalization. In: Proceedings of the 37th International Conference on Machine Learning. 2020, 1448−1458
[13] Wu Y, Wang X, Zhang A, He X, Chua T S. Discovering invariant rationales for graph neural networks. In: Proceedings of the 10th International Conference on Learning Representations. 2022
[14] Fan S, Wang X, Mo Y, Shi C, Tang J. Debiasing graph neural networks via learning disentangled causal substructure. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. 2022, 1808
[15] Sui Y, Wang X, Wu J, Lin M, He X, Chua T S. Causal attention for interpretable and generalizable graph classification. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022, 1696−1705
[16] Li H, Zhang Z, Wang X, Zhu W. Learning invariant graph representations for out-of-distribution generalization. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. 2022, 859
[17] Clark C, Yatskar M, Zettlemoyer L. Don't take the easy way out: ensemble based methods for avoiding known dataset biases. In: Proceedings of 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. 2019, 4067−4080
[18] Nam J, Cha H, Ahn S, Lee J, Shin J. Learning from failure: training debiased classifier from biased classifier. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020, 1736
[19] Li Y, Lyu X, Koren N, Lyu L, Li B, Ma X. Anti-backdoor learning: training clean models on poisoned data. In: Proceedings of the 34th Advances in Neural Information Processing Systems. 2021, 14900−14912
[20] Arpit D, Jastrzębski S, Ballas N, Krueger D, Bengio E, Kanwal M S, Maharaj T, Fischer A, Courville A, Bengio Y, Lacoste-Julien S. A closer look at memorization in deep networks. In: Proceedings of the 34th International Conference on Machine Learning. 2017, 233−242
[21] Poole B, Ozair S, Van Den Oord A, Alemi A A, Tucker G. On variational bounds of mutual information. In: Proceedings of the 36th International Conference on Machine Learning. 2019, 5171−5180
[22] Cheng P, Hao W, Dai S, Liu J, Gan Z, Carin L. CLUB: a contrastive log-ratio upper bound of mutual information. In: Proceedings of the 37th International Conference on Machine Learning. 2020, 166
[23] Yue L, Liu Q, Du Y, An Y, Wang L, Chen E. DARE: disentanglement-augmented rationale extraction. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. 2022, 1929
[24] van den Oord A, Li Y, Vinyals O. Representation learning with contrastive predictive coding. 2018, arXiv preprint arXiv: 1807.03748
[25] Luo J, He M, Pan W, Ming Z. BGNN: behavior-aware graph neural network for heterogeneous session-based recommendation. Frontiers of Computer Science, 2023, 17(5): 175336
[26] Xiao S, Bai T, Cui X, Wu B, Meng X, Wang B. A graph-based contrastive learning framework for medicare insurance fraud detection. Frontiers of Computer Science, 2023, 17(2): 172341
[27] Schlichtkrull M S, De Cao N, Titov I. Interpreting graph neural networks for NLP with differentiable edge masking. In: Proceedings of the 9th International Conference on Learning Representations. 2021
[28] Velickovic P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y. Graph attention networks. In: Proceedings of the 6th International Conference on Learning Representations. 2018
[29] Chen Y, Zhang Y, Bian Y, Yang H, Ma K, Xie B, Liu T, Han B, Cheng J. Learning causally invariant representations for out-of-distribution generalization on graphs. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. 2022, 1608
[30] Li H, Wang X, Zhang Z, Zhu W. Out-of-distribution generalization on graphs: a survey. 2022, arXiv preprint arXiv: 2202.07987
[31] Yang N, Zeng K, Wu Q, Jia X, Yan J. Learning substructure invariance for out-of-distribution molecular representations. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. 2022, 942
[32] Wang F, Liu Q, Chen E, Huang Z, Yin Y, Wang S, Su Y. NeuralCD: a general framework for cognitive diagnosis. IEEE Transactions on Knowledge and Data Engineering, 2023, 35(8): 8312–8327
[33] Liu G, Zhao T, Xu J, Luo T, Jiang M. Graph rationalization with environment-based augmentations. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022, 1069−1078
[34] Tishby N, Pereira F C, Bialek W. The information bottleneck method. 2000, arXiv preprint arXiv: physics/0004057
[35] Alemi A A, Fischer I, Dillon J V, Murphy K. Deep variational information bottleneck. In: Proceedings of the 5th International Conference on Learning Representations. 2017
[36] Paranjape B, Joshi M, Thickstun J, Hajishirzi H, Zettlemoyer L. An information bottleneck approach for controlling conciseness in rationale extraction. In: Proceedings of 2020 Conference on Empirical Methods in Natural Language Processing. 2020, 1938−1952
[37] Wu T, Ren H, Li P, Leskovec J. Graph information bottleneck. In: Proceedings of the 34th Advances in Neural Information Processing Systems. 2020, 20437−20448
[38] Yu J, Xu T, Rong Y, Bian Y, Huang J, He R. Graph information bottleneck for subgraph recognition. In: Proceedings of the 9th International Conference on Learning Representations. 2021
[39] Miao S, Liu M, Li P. Interpretable and generalizable graph learning via stochastic attention mechanism. In: Proceedings of the 39th International Conference on Machine Learning. 2022, 15524−15543
[40] Geirhos R, Jacobsen J H, Michaelis C, Zemel R, Brendel W, Bethge M, Wichmann F A. Shortcut learning in deep neural networks. Nature Machine Intelligence, 2020, 2(11): 665–673
[41] Du M, He F, Zou N, Tao D, Hu X. Shortcut learning of large language models in natural language understanding: a survey. 2022, arXiv preprint arXiv: 2208.11857
[42] Yue L, Liu Q, Wang L, An Y, Du Y, Huang Z. Interventional rationalization. In: Proceedings of 2023 Conference on Empirical Methods in Natural Language Processing. 2023, 11404−11418
[43] Yue L, Liu Q, Du Y, Wang L, Gao W, An Y. Towards faithful explanations: boosting rationalization with shortcuts discovery. In: Proceedings of the 12th International Conference on Learning Representations. 2024
[44] Rashid A, Lioutas V, Rezagholizadeh M. MATE-KD: masked adversarial TExt, a companion to knowledge distillation. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 2021, 1062−1071
[45] Stacey J, Minervini P, Dubossarsky H, Riedel S, Rocktäschel T. Avoiding the hypothesis-only bias in natural language inference via ensemble adversarial training. In: Proceedings of 2020 Conference on Empirical Methods in Natural Language Processing. 2020, 8281−8291
[46] Arjovsky M, Bottou L, Gulrajani I, Lopez-Paz D. Invariant risk minimization. 2019, arXiv preprint arXiv: 1907.02893
[47] Sanh V, Wolf T, Belinkov Y, Rush A M. Learning from others' mistakes: avoiding dataset biases without modeling them. In: Proceedings of the 9th International Conference on Learning Representations. 2021
[48] Xu K, Hu W, Leskovec J, Jegelka S. How powerful are graph neural networks? In: Proceedings of the 7th International Conference on Learning Representations. 2019
[49] Liu G, Inae E, Luo T, Jiang M. Rationalizing graph neural networks with data augmentation. ACM Transactions on Knowledge Discovery from Data, 2024, 18(4): 86
[50] Socher R, Perelygin A, Wu J, Chuang J, Manning C D, Ng A, Potts C. Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of 2013 Conference on Empirical Methods in Natural Language Processing. 2013, 1631−1642
[51] Yu M, Chang S, Zhang Y, Jaakkola T. Rethinking cooperative rationalization: introspective extraction and complement control. In: Proceedings of 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. 2019, 4094−4103
[52] Sun R, Tao H, Chen Y, Liu Q. HACAN: a hierarchical answer-aware and context-aware network for question generation. Frontiers of Computer Science, 2024, 18(5): 185321
[53] Kingma D P, Ba J. Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations. 2015

RIGHTS & PERMISSIONS

Higher Education Press
