Teachers cooperation: team-knowledge distillation for multiple cross-domain few-shot learning
Zhong JI, Jingwei NI, Xiyao LIU, Yanwei PANG
Although few-shot learning (FSL) has achieved great progress, it remains a formidable challenge when the source and target sets come from different domains, a setting known as cross-domain few-shot learning (CD-FSL). Utilizing data from multiple source domains is an effective way to improve CD-FSL performance. However, knowledge from different source domains may become entangled and interfere with each other, which hurts performance on the target domain. We therefore propose team-knowledge distillation networks (TKD-Net) to tackle this problem by exploring a strategy that enables multiple teachers to cooperate. Specifically, we distill knowledge from a team of cooperating teacher networks into a single student network within a meta-learning framework. TKD-Net combines task-oriented knowledge distillation with cooperation among multiple teachers to train an efficient student with better generalization ability on unseen tasks. Moreover, it employs both response-based and relation-based knowledge to transfer more comprehensive and effective knowledge. Extensive experiments on four fine-grained datasets demonstrate the effectiveness and superiority of the proposed TKD-Net.
cross-domain few-shot learning / meta-learning / knowledge distillation / multiple teachers
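To make the distillation objective described in the abstract concrete, the following is a minimal sketch, assuming a PyTorch setup, of how response-based knowledge (softened teacher logits) and relation-based knowledge (pairwise similarities among embeddings) from several teachers could be combined into one student loss. All function names, weights (alpha, beta), and the temperature T are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a multi-teacher distillation objective combining
# response-based and relation-based knowledge; not the authors' code.
import torch
import torch.nn.functional as F


def response_kd_loss(student_logits, teacher_logits_list, T=4.0):
    """KL divergence between student soft predictions and the teachers' averaged soft labels."""
    teacher_probs = torch.stack(
        [F.softmax(t / T, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)
    student_log_probs = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * T * T


def relation_kd_loss(student_feats, teacher_feats_list):
    """Match the student's pairwise cosine-similarity matrix to the teachers' average."""
    def relation_matrix(feats):
        feats = F.normalize(feats, dim=-1)
        return feats @ feats.t()

    teacher_rel = torch.stack(
        [relation_matrix(f) for f in teacher_feats_list]
    ).mean(dim=0)
    return F.mse_loss(relation_matrix(student_feats), teacher_rel)


def distillation_loss(student_logits, student_feats,
                      teacher_logits_list, teacher_feats_list,
                      labels, alpha=0.5, beta=0.5):
    """Total loss for one episode: task cross-entropy plus the two distillation terms."""
    ce = F.cross_entropy(student_logits, labels)
    resp = response_kd_loss(student_logits, teacher_logits_list)
    rel = relation_kd_loss(student_feats, teacher_feats_list)
    return ce + alpha * resp + beta * rel
```

In a meta-learning setting, this loss would be computed per episode on the query set, so the student learns task-oriented knowledge from the teacher team rather than from raw source-domain labels alone.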
Zhong Ji received the PhD degree in signal and information processing from Tianjin University, China, in 2008. He is currently a Professor with the School of Electrical and Information Engineering, Tianjin University, China. He has authored over 80 scientific papers. His current research interests include multimedia understanding, zero/few-shot learning, cross-modal analysis, and video summarization.
Jingwei Ni received the BS degree in electronic and information engineering from Dalian University of Technology, China, in 2019. She is currently pursuing the MS degree in the School of Electrical and Information Engineering, Tianjin University, China. Her current research interests include few-shot learning and computer vision.
Xiyao Liu received the BS degree in telecommunication engineering from Tianjin University, China, in 2015. She is currently pursuing a PhD degree in the School of Electrical and Information Engineering, Tianjin University, China. Her research interests include few-shot learning, human-object interaction, and computer vision.
Yanwei Pang received the PhD degree in electronic engineering from the University of Science and Technology of China in 2004. He is currently a Professor with the School of Electrical and Information Engineering, Tianjin University, China. He has authored over 120 scientific papers. His current research interests include object detection and recognition, vision in bad weather, and computer vision.