Prototype-guided cross-task knowledge distillation
Deng LI, Peng LI, Aming WU, Yahong HAN
Front. Inform. Technol. Electron. Eng., 2025, Vol. 26, Issue 6: 912-929.
Recently, large-scale pretrained models have demonstrated their benefits on various tasks. However, owing to their enormous computational complexity and storage demands, it is challenging to apply large-scale models in real-world scenarios. Existing knowledge distillation methods mainly require the teacher and student models to share the same label space, which restricts their application in real-world scenarios. To alleviate this label-space constraint, we propose a prototype-guided cross-task knowledge distillation (ProC-KD) method to transfer the intrinsic local-level object knowledge of the teacher network to various task scenarios. First, to better learn generalized knowledge in cross-task scenarios, we present a prototype learning module that learns the invariant intrinsic local representations of objects from the teacher network. Second, for diverse downstream tasks, a task-adaptive feature augmentation module is proposed to enhance the student network's features with the learned prototype representations and to guide the learning of the student network, thereby improving its generalization ability. Experimental results on various visual tasks demonstrate the effectiveness of our approach in cross-task knowledge distillation scenarios.
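To make the two modules described above more concrete, the following sketch shows one plausible realization in PyTorch: a bank of prototype vectors summarizes local teacher features, and the student's features are augmented by soft attention over these prototypes before a loss aligns them with the teacher. The class PrototypeBank, the momentum update rule, the cosine-similarity assignment, and the MSE alignment loss are illustrative assumptions, not the implementation reported in the paper.

# Minimal sketch of prototype-guided cross-task distillation (illustrative only;
# names and update rules are assumptions, not the authors' implementation).
import torch
import torch.nn.functional as F


class PrototypeBank(torch.nn.Module):
    """Maintains K prototype vectors summarizing local teacher features."""

    def __init__(self, num_prototypes: int = 64, dim: int = 256, momentum: float = 0.99):
        super().__init__()
        self.momentum = momentum
        # Prototypes are kept as a buffer and refreshed with an EMA rule (assumed).
        self.register_buffer("prototypes", F.normalize(torch.randn(num_prototypes, dim), dim=1))

    @torch.no_grad()
    def update(self, teacher_feats: torch.Tensor) -> None:
        """teacher_feats: (N, dim) local descriptors pooled from the teacher."""
        feats = F.normalize(teacher_feats, dim=1)
        # Assign each feature to its nearest prototype by cosine similarity.
        assign = (feats @ self.prototypes.t()).argmax(dim=1)  # (N,)
        for k in assign.unique():
            mean_k = feats[assign == k].mean(dim=0)
            self.prototypes[k] = F.normalize(
                self.momentum * self.prototypes[k] + (1 - self.momentum) * mean_k, dim=0
            )

    def augment(self, student_feats: torch.Tensor) -> torch.Tensor:
        """Enhance student features with prototype information via soft attention."""
        feats = F.normalize(student_feats, dim=1)             # (N, dim)
        attn = F.softmax(feats @ self.prototypes.t(), dim=1)  # (N, K)
        recalled = attn @ self.prototypes                     # (N, dim)
        return student_feats + recalled                       # residual augmentation (assumed)


def distillation_loss(student_feats, teacher_feats, bank: PrototypeBank) -> torch.Tensor:
    """Align prototype-augmented student features with (detached) teacher features."""
    bank.update(teacher_feats.detach())
    augmented = bank.augment(student_feats)
    return F.mse_loss(F.normalize(augmented, dim=1), F.normalize(teacher_feats.detach(), dim=1))


if __name__ == "__main__":
    bank = PrototypeBank(num_prototypes=8, dim=32)
    s = torch.randn(16, 32, requires_grad=True)   # student local features
    t = torch.randn(16, 32)                       # teacher local features
    loss = distillation_loss(s, t, bank)
    loss.backward()
    print(float(loss))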
Knowledge distillation / Cross-task / Prototype learning
Zhejiang University Press