Teachers cooperation: team-knowledge distillation for multiple cross-domain few-shot learning

Zhong JI , Jingwei NI , Xiyao LIU , Yanwei PANG

Front. Comput. Sci. ›› 2023, Vol. 17 ›› Issue (2) : 172312 DOI: 10.1007/s11704-022-1250-2
Artificial Intelligence
RESEARCH ARTICLE


Abstract

Although few-shot learning (FSL) has achieved great progress, it remains an enormous challenge when the source and target sets come from different domains, a setting known as cross-domain few-shot learning (CD-FSL). Utilizing more source-domain data is an effective way to improve CD-FSL performance. However, knowledge from different source domains may become entangled and interfere with each other, which hurts performance on the target domain. We therefore propose team-knowledge distillation networks (TKD-Net) to tackle this problem, exploring a strategy for coordinating the cooperation of multiple teachers. Specifically, we distill knowledge from a team of teacher networks into a single student network within a meta-learning framework. TKD-Net incorporates task-oriented knowledge distillation and cooperation among multiple teachers to train an efficient student with better generalization on unseen tasks. Moreover, it employs both response-based and relation-based knowledge to transfer more comprehensive and effective knowledge. Extensive experiments on four fine-grained datasets demonstrate the effectiveness and superiority of the proposed TKD-Net.
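The abstract describes distilling both response-based knowledge (teacher soft predictions) and relation-based knowledge (pairwise relations among samples) from a team of teachers into one student. As an illustrative sketch only — the paper's exact losses, temperatures, and weighting scheme are not given here, so every function name and parameter below is an assumption — a combined multi-teacher distillation loss might look like:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def team_kd_loss(student_logits, teacher_logits_list,
                 student_feats, teacher_feats_list,
                 T=4.0, alpha=0.5):
    """Sketch of a team-distillation objective: response-based KL to the
    averaged teacher soft labels, plus relation-based matching of pairwise
    cosine-similarity matrices of sample embeddings."""
    # Response-based knowledge: average the teachers' soft predictions,
    # then take the KL divergence from the student's predictions.
    p_t = np.mean([softmax(t, T) for t in teacher_logits_list], axis=0)
    p_s = softmax(student_logits, T)
    resp = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))) / len(student_logits)

    # Relation-based knowledge: pairwise cosine similarities among the
    # embeddings of the samples in a task, matched between teachers and student.
    def relation(f):
        f = f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-12)
        return f @ f.T

    r_t = np.mean([relation(f) for f in teacher_feats_list], axis=0)
    r_s = relation(student_feats)
    rel = np.mean((r_t - r_s) ** 2)

    return alpha * resp + (1 - alpha) * rel
```

Here the teachers' soft labels are simply averaged; the paper's teacher-cooperation scheme is presumably more elaborate (e.g., task-conditioned weighting of teachers), which this sketch does not attempt to capture.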


Keywords

cross-domain few-shot learning / meta-learning / knowledge distillation / multiple teachers

Cite this article

Zhong JI, Jingwei NI, Xiyao LIU, Yanwei PANG. Teachers cooperation: team-knowledge distillation for multiple cross-domain few-shot learning. Front. Comput. Sci., 2023, 17(2): 172312 DOI:10.1007/s11704-022-1250-2



RIGHTS & PERMISSIONS

Higher Education Press


Supplementary files

FCS-21250-OF-ZJ_suppl_1
