MiMu: mitigating multiple shortcut learning behavior of transformers

Lili ZHAO, Qi LIU, Wei CHEN, Liyi CHEN, Ruijun SUN, Min HOU, Yang WANG, Shijin WANG, Pingping REN, Jiafeng ZHOU

Front. Comput. Sci., 2025, 19(12): 1912380
DOI: 10.1007/s11704-025-50448-3

Artificial Intelligence
RESEARCH ARTICLE
Abstract

Empirical Risk Minimization (ERM) models often rely on spurious correlations between features and labels during the learning process, leading to shortcut learning behavior that undermines robustness and generalization performance. Current research mainly targets identifying or mitigating a single shortcut; however, in real-world scenarios, cues within the data are diverse and unknown. In empirical studies, we reveal that models rely more on strong shortcuts than on weak ones, and that their performance under multiple shortcuts typically falls between the levels observed under each individual shortcut. To address these challenges, we propose MiMu, a novel method integrated with Transformer-based ERMs and designed to Mitigate Multiple shortcut learning behavior, incorporating a self-calibration strategy and a self-improvement strategy. In the source model, we first apply the self-calibration strategy to prevent the model from relying on shortcuts and making overconfident predictions. We then design the self-improvement strategy in the target model to further reduce reliance on multiple shortcuts. Its random mask strategy randomly masks partial attention positions to diversify the focus of the target model, preventing fixation on a single region. Meanwhile, its adaptive attention alignment module aligns the target model's attention weights with those of the calibrated source model, without requiring post-hoc attention maps or attention supervision. Finally, extensive experiments on Natural Language Processing (NLP) and Computer Vision (CV) tasks demonstrate the effectiveness of MiMu in improving robustness and generalization.
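
To make the abstract's mechanisms concrete, the following minimal PyTorch sketch illustrates one plausible reading of the three components described above. The function names, the choice of KL divergence as the alignment objective, the label-smoothing form of self-calibration, and the mask_ratio value are our illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def random_attention_mask(attn_scores: torch.Tensor,
                          mask_ratio: float = 0.1) -> torch.Tensor:
    # Randomly mask a fraction of pre-softmax attention positions so the
    # target model cannot fixate on one fixed region (illustrative reading
    # of the random mask strategy).
    drop = torch.rand_like(attn_scores) < mask_ratio
    return attn_scores.masked_fill(drop, float("-inf"))

def attention_alignment_loss(target_attn: torch.Tensor,
                             source_attn: torch.Tensor) -> torch.Tensor:
    # Align the target model's post-softmax attention with that of the
    # calibrated source model; KL divergence is one plausible alignment
    # objective, assumed here for illustration.
    return F.kl_div(target_attn.clamp_min(1e-12).log(),
                    source_attn, reduction="batchmean")

def self_calibrated_loss(logits: torch.Tensor, labels: torch.Tensor,
                         smoothing: float = 0.1) -> torch.Tensor:
    # A label-smoothing-style stand-in for the self-calibration strategy,
    # tempering overconfident predictions in the source model.
    return F.cross_entropy(logits, labels, label_smoothing=smoothing)
```

In a full training loop, a sketch like this would calibrate the source model with self_calibrated_loss, then train the target model with randomly masked attention while penalizing divergence from the source model's attention via attention_alignment_loss.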

Keywords

shortcut learning / robustness / generalizability

Cite this article

Lili ZHAO, Qi LIU, Wei CHEN, Liyi CHEN, Ruijun SUN, Min HOU, Yang WANG, Shijin WANG, Pingping REN, Jiafeng ZHOU. MiMu: mitigating multiple shortcut learning behavior of transformers. Front. Comput. Sci., 2025, 19(12): 1912380. DOI: 10.1007/s11704-025-50448-3



RIGHTS & PERMISSIONS

Higher Education Press
