Robust domain adaptation with noisy and shifted label distribution
Shao-Yuan LI, Shi-Ji ZHAO, Zheng-Tao CAO, Sheng-Jun HUANG, Songcan CHEN
Unsupervised Domain Adaptation (UDA) aims to transfer knowledge from a labeled source domain to an unlabeled target domain whose data or label distribution differs from the source. Previous UDA methods have achieved great success when the source-domain labels are clean, but acquiring even scarce clean labels in the source domain is itself costly. In the presence of source label noise, traditional UDA methods degrade severely because they do not account for mislabeled samples. In this paper, we propose Robust Self-training with Label Refinement (RSLR) to address this issue. RSLR adopts a self-training framework that maintains a Labeling Network (LNet) on the source domain, which provides confident pseudo-labels to target samples, and a Target-specific Network (TNet) trained on the pseudo-labeled samples. To combat the effect of label noise, LNet progressively distinguishes and refines mislabeled source samples. Combined with class re-balancing to counter label distribution shift, RSLR achieves strong performance on extensive benchmark datasets.
unsupervised domain adaptation / label noise / label distribution shift / self-training / class rebalancing
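To make the pipeline described in the abstract concrete, the following minimal sketch (our illustration, not the authors' released code) shows the three mechanisms RSLR combines: confident pseudo-labeling of target samples by LNet, small-loss separation of likely-clean source samples for label refinement, and inverse-frequency class re-balancing against label distribution shift. The function names, confidence threshold, and keep ratio are hypothetical choices for illustration only.

```python
# Hypothetical sketch of RSLR-style ingredients (PyTorch); all thresholds are assumptions.
import torch
import torch.nn.functional as F

def select_confident_pseudo_labels(logits, threshold=0.95):
    # Keep target samples whose predicted class probability (from LNet)
    # exceeds the confidence threshold; these feed TNet's training set.
    probs = F.softmax(logits, dim=1)
    conf, pseudo = probs.max(dim=1)
    mask = conf >= threshold
    return pseudo[mask], mask

def small_loss_mask(losses, keep_ratio=0.7):
    # Deep networks tend to fit clean labels before noisy ones, so the
    # lowest-loss source samples are treated as likely clean; the rest
    # become candidates for label refinement.
    k = max(1, int(keep_ratio * losses.numel()))
    idx = torch.argsort(losses)[:k]
    mask = torch.zeros(losses.numel(), dtype=torch.bool)
    mask[idx] = True
    return mask

def class_rebalance_weights(labels, num_classes):
    # Inverse-frequency weights so that over-represented pseudo-label
    # classes do not dominate the loss under label distribution shift.
    counts = torch.bincount(labels, minlength=num_classes).float()
    return counts.sum() / (num_classes * counts.clamp(min=1.0))

# Toy usage with random tensors standing in for network outputs.
torch.manual_seed(0)
target_logits = torch.randn(16, 3) * 4           # LNet logits on a target batch
pseudo, mask = select_confident_pseudo_labels(target_logits)
clean = small_loss_mask(torch.rand(10))          # per-sample source losses
weights = class_rebalance_weights(pseudo, num_classes=3)
print(mask.sum().item(), "confident targets;", clean.sum().item(), "sources kept as clean")
print("class weights:", weights)
```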
Shao-Yuan Li is an associate professor in the College of Computer Science and Technology at Nanjing University of Aeronautics and Astronautics, China. She received the BSc and PhD degrees in computer science from Nanjing University, China in 2010 and 2018, respectively. Her research interests include machine learning and data mining. She won the PAKDD'12 Data Mining Challenge, the Best Paper Award of PRICAI'18, 2nd place in the Learning and Mining with Noisy Labels Challenge at IJCAI'22, and 4th place in the Continual Learning Challenge at CVPR'23.
Shi-Ji Zhao received the BSc degree in computer science from Nanjing Agricultural University, China in 2022. He is currently working toward the MS degree in the College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China. His research interests include domain adaptation and test-time adaptation.
Zheng-Tao Cao received the BSc degree in computer science from Shandong University of Technology, China in 2020, and the MS degree from Nanjing University of Aeronautics and Astronautics, China in 2023. His research interests include machine learning and domain adaptation.
Sheng-Jun Huang received the BSc and PhD degrees in computer science from Nanjing University, China in 2008 and 2014, respectively. He is now a professor in the College of Computer Science and Technology at Nanjing University of Aeronautics and Astronautics, China. His main research interests include machine learning and data mining. He was selected for the Young Elite Scientists Sponsorship Program by CAST in 2016, and won the China Computer Federation Outstanding Doctoral Dissertation Award in 2015, the KDD Best Poster Award in 2012, and the Microsoft Fellowship Award in 2011. He is a Junior Associate Editor of Frontiers of Computer Science.
Songcan Chen received the BS degree in mathematics from Hangzhou University (now merged into Zhejiang University), China in 1983, and the MS degree in computer applications from Shanghai Jiao Tong University, China in 1985, and then joined Nanjing University of Aeronautics and Astronautics (NUAA), China in January 1986. He received the PhD degree in communication and information systems from NUAA in 1997. Since 1998, he has been a full-time professor with the College of Computer Science and Technology, NUAA. His research interests include pattern recognition, machine learning, and neural computing. He is an IAPR Fellow.