Abstract
Continual learning aims to enable a model to learn new tasks continuously while reducing forgetting, so that previously learnt knowledge is retained. When receiving streaming data that are not constrained by the independent and identically distributed (IID) assumption, a continual learner must efficiently transform and reuse previously learnt knowledge to complete new tasks, improving both generalisation performance and learning efficiency over a sequence of tasks. However, class imbalance in continual learning scenarios critically undermines model performance. In the class-incremental scenario in particular, class imbalance biases the model towards the classes of the new task and degrades performance on previously learnt classes, leading to catastrophic forgetting. In this paper, a novel method based on balanced contrast is proposed for class-incremental continual learning. The method applies gradient balancing to mitigate the impact of class imbalance in the class-incremental scenario, combining contrastive learning with gradient modifications so that data from different classes are processed in a balanced manner. The proposed method surpasses existing baseline approaches in the class-incremental setting on standard image datasets such as CIFAR-100, CIFAR-10 and mini-ImageNet. The results show that it effectively mitigates catastrophic forgetting of previously learnt classes, markedly improving the efficacy of continual learning and offering a powerful means of further advancing continual learning performance.
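To make the balanced processing concrete, below is a minimal PyTorch sketch of a class-balanced supervised contrastive loss in the spirit of the abstract: exponentiated similarities are averaged within each class before they enter the contrastive denominator, so each class contributes one term regardless of how many of its samples appear in an imbalanced batch. The function name and all details here are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (an assumption, not the paper's code) of a class-balanced
# supervised contrastive loss: the denominator averages exp-similarities
# within each class before summing over classes, so majority (new-task)
# classes cannot dominate the gradient over minority (old-task) classes.
import torch
import torch.nn.functional as F

def balanced_supcon_loss(features: torch.Tensor, labels: torch.Tensor,
                         temperature: float = 0.1) -> torch.Tensor:
    """features: (N, D) L2-normalised embeddings; labels: (N,) class ids."""
    n = features.size(0)
    sim = features @ features.T / temperature            # pairwise similarities
    not_self = ~torch.eye(n, dtype=torch.bool, device=features.device)
    classes = labels.unique()

    loss = features.new_zeros(())
    for i in range(n):
        # Class-averaged denominator: one mean term per class present.
        denom = features.new_zeros(())
        for c in classes:
            cls_mask = (labels == c) & not_self[i]
            if cls_mask.any():
                denom = denom + torch.exp(sim[i][cls_mask]).mean()
        pos_mask = (labels == labels[i]) & not_self[i]   # same-class positives
        if pos_mask.any():
            loss = loss - (sim[i][pos_mask] - torch.log(denom)).mean()
    return loss / n

# Toy usage: an imbalanced batch of 6 "new-class" vs. 2 "old-class" samples.
feats = F.normalize(torch.randn(8, 128), dim=1)
labels = torch.tensor([1, 1, 1, 1, 1, 1, 0, 0])
print(balanced_supcon_loss(feats, labels).item())
```

Averaging within each class before summing, rather than summing over all samples, equalises each class's contribution to the denominator's gradient; this is one simple way to realise the kind of gradient balancing the abstract describes.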
Keywords
artificial intelligence / deep learning / deep neural networks / image classification
Cite this article
Shiqi Yu, Luojun Lin, Yuanlong Yu. Balanced Contrast Class-Incremental Learning. CAAI Transactions on Intelligence Technology, 2025, 10(6): 1867-1879. DOI: 10.1049/cit2.70060
Funding
National Natural Science Foundation of China (Grant 62406071)
National Natural Science Foundation of China (Grant U21A20471)
Fujian Provincial Natural Science Foundation (Grant 2022J05135)