ShanghaiTech University Knowledge Management System
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
2022-06-24
Conference Proceedings | 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
ISSN | 1063-6919
Publication Status | Published
DOI | 10.1109/CVPR52688.2022.01174 |
Abstract | Transformers have been successfully applied to computer vision owing to their powerful modeling capacity with self-attention. However, the excellent performance of transformers heavily depends on enormous amounts of training images. Thus, a data-efficient transformer solution is urgently needed. In this work, we propose an early knowledge distillation framework, termed DearKD, to improve the data efficiency required by transformers. Our DearKD is a two-stage framework that first distills the inductive biases from the early intermediate layers of a CNN and then gives the transformer full play by training without distillation. Further, DearKD can be readily applied to the extreme data-free case where no real images are available. In this case, we propose a boundary-preserving intra-divergence loss based on DeepInversion to further close the performance gap against the full-data counterpart. Extensive experiments on ImageNet, partial ImageNet, the data-free setting, and other downstream tasks prove the superiority of DearKD over its baselines and state-of-the-art methods.
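The abstract describes a two-stage schedule: stage 1 distills inductive biases from a CNN teacher's early intermediate layers into the transformer, and stage 2 trains the transformer on the task loss alone. Below is a minimal PyTorch-style sketch of such a schedule, not the authors' released implementation: the function name `distill_step`, the learned projection `proj`, the MSE alignment loss, and the assumed model outputs (a ViT returning logits plus an early-block feature, a CNN returning an early-stage feature map) are all illustrative assumptions; the paper's exact distillation objective may differ.

```python
# Minimal two-stage early-distillation sketch (assumptions noted above).
import torch
import torch.nn.functional as F


def distill_step(vit, cnn, proj, images, labels, stage1: bool, alpha: float = 1.0):
    """One training step.

    Stage 1: task loss + alignment between the transformer's early tokens and
             the (frozen) CNN teacher's early feature map.
    Stage 2: task loss only, so the transformer trains without distillation.
    """
    # Assumed ViT interface: returns classification logits and an early-block
    # token feature of shape (B, N, D).
    logits, early_tokens = vit(images)
    loss = F.cross_entropy(logits, labels)

    if stage1:
        with torch.no_grad():
            # Assumed CNN interface: early-stage feature map of shape (B, C, H, W),
            # with H * W equal to the number of tokens N.
            cnn_feat = cnn(images)
        # Learned projection (assumption) maps tokens to the CNN channel dimension.
        aligned = proj(early_tokens)                      # (B, N, C)
        target = cnn_feat.flatten(2).transpose(1, 2)      # (B, H*W, C)
        loss = loss + alpha * F.mse_loss(aligned, target)

    return loss
```

A usage note under the same assumptions: the training loop would call `distill_step(..., stage1=True)` for the first portion of the epochs and `stage1=False` afterwards, so the distillation term simply drops out in the second stage.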
Keywords | Deep learning architectures and techniques; Optimization methods
Conference Location | New Orleans, LA, USA
Conference Dates | 18-24 June 2022
URL | View Original Text
Indexed By | EI
Source Database | IEEE
Document Type | Conference Paper
Item Identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/282041
Collection | School of Information Science and Technology_Master's Students; School of Information Science and Technology_PI Research Group_Shenghua Gao Group
Author Affiliations | 1. ShanghaiTech University, China; 2. JD Explore Academy; 3. Meituan Inc.; 4. The University of Sydney, Australia; 5. Shanghai Engineering Research Center of Intelligent Vision and Imaging, China; 6. Shanghai Engineering Research Center of Energy Efficient and Custom AI IC, China
First Author Affiliation | ShanghaiTech University
First Author's Primary Affiliation | ShanghaiTech University
Recommended Citation (GB/T 7714) | Chen Xianing, Cao Qiong, Zhong Yujie, et al. DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers[C]. 2022.