Long-tail learning with context-aware re-sampling

Jiang-Xin SHI, Xiao-Chao XIAO, Cong-Zhong ZHU, Wen TAO, Wen-Yu ZHOU, Wei ZHU, Yu-Feng LI

Front. Comput. Sci., 2027, Vol. 21, Issue (1): 2101301  DOI: 10.1007/s11704-025-50835-w

Artificial Intelligence
RESEARCH ARTICLE

Abstract

Real-world data often exhibit a long-tail class distribution, in which a small subset of classes dominates the majority of the training samples while the remaining classes suffer from severe data scarcity. Long-tail learning (LTL) aims to tackle this extreme data imbalance and improve generalization across both head and tail classes. Although re-sampling offers a straightforward way to mitigate class imbalance, prior research has empirically shown its limited effectiveness in modern long-tail learning tasks. To overcome this limitation, we propose Context-Aware RE-sampling (CARE), a novel framework that leverages large pre-trained models to suppress irrelevant contexts and enrich the diversity of the training data. Specifically, CARE introduces multiple practical implementations: CARE-DS, which integrates DINO and SAM to segment and transplant objects across images, generating diverse samples while preserving semantic consistency; and CARE-DM, which utilizes diffusion models to synthesize contextually diverse samples conditioned on the original images and textual prompts. Extensive experiments demonstrate that CARE effectively mitigates performance deterioration for both head and tail classes, achieving significant generalization improvements over conventional re-sampling methods.
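As a rough illustration of the object-transplant idea behind CARE-DS (a minimal sketch, not the authors' implementation; `transplant_object` and its interface are assumptions), given a binary object mask from a segmenter such as SAM, the masked pixels can be composited onto a new context image:

```python
import numpy as np

def transplant_object(src_img, src_mask, ctx_img, top_left=(0, 0)):
    """Paste the masked object from src_img onto a copy of ctx_img.

    src_img, ctx_img: HxWx3 uint8 arrays; src_mask: HxW boolean array
    marking the object pixels; top_left: paste position in ctx_img.
    """
    out = ctx_img.copy()
    ys, xs = np.nonzero(src_mask)
    if ys.size == 0:          # empty mask: nothing to transplant
        return out
    # Tight bounding box around the masked object
    y0, x0 = ys.min(), xs.min()
    h = ys.max() - y0 + 1
    w = xs.max() - x0 + 1
    ty, tx = top_left
    H, W = out.shape[:2]
    h, w = min(h, H - ty), min(w, W - tx)   # clip to the context image
    patch_mask = src_mask[y0:y0 + h, x0:x0 + w]
    # Copy only the object pixels, leaving the new background intact
    out[ty:ty + h, tx:tx + w][patch_mask] = \
        src_img[y0:y0 + h, x0:x0 + w][patch_mask]
    return out
```

In practice the mask would come from Grounding DINO detections refined by SAM, and the paper's pipeline would additionally handle scaling and blending; this sketch only shows the mask-guided copy itself.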


Keywords

long-tail learning / re-sampling / class-imbalanced learning / data augmentation
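The conventional class-balanced re-sampling that CARE is compared against draws each class with equal total probability. A minimal sketch of the per-sample weights (illustrative only; `class_balanced_weights` is an assumed name, not from the paper):

```python
from collections import Counter

def class_balanced_weights(labels):
    """Per-sample sampling weights inversely proportional to class
    frequency, so every class contributes equal total probability."""
    counts = Counter(labels)
    num_classes = len(counts)
    return [1.0 / (num_classes * counts[y]) for y in labels]
```

Such weights can be passed to a weighted sampler (e.g., PyTorch's `WeightedRandomSampler`) to approximate a class-balanced training stream; the limitation CARE targets is that this repeats the same few tail images rather than diversifying their contexts.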

Cite this article

Jiang-Xin SHI, Xiao-Chao XIAO, Cong-Zhong ZHU, Wen TAO, Wen-Yu ZHOU, Wei ZHU, Yu-Feng LI. Long-tail learning with context-aware re-sampling. Front. Comput. Sci., 2027, 21(1): 2101301. DOI: 10.1007/s11704-025-50835-w



RIGHTS & PERMISSIONS

Higher Education Press
