Robust long-tailed learning under label noise

Tong WEI, Jiang-Xin SHI, Min-Ling ZHANG, Yu-Feng LI

Front. Comput. Sci., 2026, 20(1): 2001321. DOI: 10.1007/s11704-025-40860-0
Excellent Young Computer Scientists Forum
RESEARCH ARTICLE


Abstract

Long-tailed learning aims to enhance the generalization performance of underrepresented tail classes. However, previous methods have largely overlooked the prevalence of noisy labels in training data. In this paper, we address the challenge of noisy labels in long-tailed learning. We identify a critical issue: the commonly used small-loss criterion for noisy-label detection fails to perform effectively under long-tailed class distributions. This failure arises from the inherent bias of deep neural networks, which tend to misclassify tail-class examples as head classes, leading to unreliable loss calculations. To mitigate this, we propose a novel small-distance criterion that leverages the robustness of learned representations, enabling more accurate identification of correctly labeled examples across both head and tail classes. Additionally, to improve training for tail classes, we replace discrete pseudo-labels with label distributions for examples flagged as noisy, resulting in significant performance gains. Based on these contributions, we introduce a robust long-tailed learning framework designed to train models that are resilient to both class imbalance and noisy labels. Extensive experiments on benchmark and real-world datasets demonstrate that our approach substantially outperforms previous methods. Our source code is available at github.com/Stomach-ache/RoLT.
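
As a minimal illustration of the two mechanisms the abstract describes, the NumPy sketch below implements (i) a small-distance clean-sample test against per-class feature prototypes and (ii) soft label distributions for examples flagged as noisy. This is a sketch under stated assumptions, not the authors' released implementation (see github.com/Stomach-ache/RoLT for that); the function names, the per-class quantile threshold, and the softmax temperature are illustrative choices.

import numpy as np

def small_distance_clean_mask(features, labels, num_classes, quantile=0.5):
    # Prototype = mean feature of each observed class; an example counts as
    # "clean" if it lies close to the prototype of its own (possibly noisy)
    # label. Thresholding within each class keeps head classes from
    # dominating the cut-off.
    prototypes = np.stack([features[labels == c].mean(axis=0)
                           for c in range(num_classes)])
    d_own = np.linalg.norm(features - prototypes[labels], axis=1)
    mask = np.zeros(len(labels), dtype=bool)
    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        mask[idx] = d_own[idx] <= np.quantile(d_own[idx], quantile)
    return mask, prototypes

def soft_pseudo_label(feature, prototypes, temperature=0.1):
    # For an example flagged as noisy, return a label *distribution*
    # (softmax over negative prototype distances) rather than a hard
    # pseudo-label, so uncertain tail examples keep probability mass
    # on their true class.
    logits = -np.linalg.norm(prototypes - feature, axis=1) / temperature
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

In a pipeline of this shape, mask would select the presumed-clean subset for supervised training, while soft_pseudo_label would supply the target distribution for each flagged example, avoiding the hard head-class pseudo-labels that bias training against tail classes.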


Keywords

long-tail learning / noisy labels / semi-supervised learning

Cite this article

Tong WEI, Jiang-Xin SHI, Min-Ling ZHANG, Yu-Feng LI. Robust long-tailed learning under label noise. Front. Comput. Sci., 2026, 20(1): 2001321. DOI: 10.1007/s11704-025-40860-0



RIGHTS & PERMISSIONS

© The Author(s) 2025. This article is published with open access at link.springer.com and journal.hep.com.cn.
