ShanghaiTech University Knowledge Management System
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
2022-06-24
Conference Proceedings | 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
ISSN | 1063-6919
Publication Status | Published
DOI | 10.1109/CVPR52688.2022.01174 |
Abstract | Transformers have been successfully applied to computer vision owing to their powerful modeling capacity with self-attention. However, the excellent performance of transformers heavily depends on enormous amounts of training images. Thus, a data-efficient transformer solution is urgently needed. In this work, we propose an early knowledge distillation framework, termed DearKD, to improve the data efficiency required by transformers. Our DearKD is a two-stage framework that first distills the inductive biases from the early intermediate layers of a CNN and then gives the transformer full play by training without distillation. Further, DearKD can be readily applied to the extreme data-free case where no real images are available. In this case, we propose a boundary-preserving intra-divergence loss based on DeepInversion to further close the performance gap against the full-data counterpart. Extensive experiments on ImageNet, partial ImageNet, the data-free setting, and other downstream tasks prove the superiority of DearKD over its baselines and state-of-the-art methods.
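The abstract describes a two-stage schedule: stage 1 distills inductive biases from a CNN teacher's early intermediate layers into the transformer, and stage 2 trains the transformer on the task loss alone. Below is a minimal PyTorch-style sketch of such a schedule, not the authors' released implementation: the function name `distill_step`, the learned projection `proj`, the MSE alignment loss, and the assumed model outputs (a ViT returning logits plus an early-block feature, a CNN returning an early-stage feature map) are all illustrative assumptions; the paper's exact distillation objective may differ.

```python
# Minimal two-stage early-distillation sketch (assumptions noted above).
import torch
import torch.nn.functional as F


def distill_step(vit, cnn, proj, images, labels, stage1: bool, alpha: float = 1.0):
    """One training step.

    Stage 1: task loss + alignment between the transformer's early tokens and
             the (frozen) CNN teacher's early feature map.
    Stage 2: task loss only, so the transformer trains without distillation.
    """
    # Assumed ViT interface: returns classification logits and an early-block
    # token feature of shape (B, N, D).
    logits, early_tokens = vit(images)
    loss = F.cross_entropy(logits, labels)

    if stage1:
        with torch.no_grad():
            # Assumed CNN interface: early-stage feature map of shape (B, C, H, W),
            # with H * W equal to the number of tokens N.
            cnn_feat = cnn(images)
        # Learned projection (assumption) maps tokens to the CNN channel dimension.
        aligned = proj(early_tokens)                      # (B, N, C)
        target = cnn_feat.flatten(2).transpose(1, 2)      # (B, H*W, C)
        loss = loss + alpha * F.mse_loss(aligned, target)

    return loss
```

A usage note under the same assumptions: the training loop would call `distill_step(..., stage1=True)` for the first portion of the epochs and `stage1=False` afterwards, so the distillation term simply drops out in the second stage.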
Keywords | Deep learning architectures and techniques; Optimization methods
Conference Location | New Orleans, LA, USA
Conference Dates | 18-24 June 2022
URL | View Original Text
Indexed By | EI
Source Database | IEEE
Document Type | Conference Paper
Item Identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/282041
Collection | School of Information Science and Technology_Master's Students; School of Information Science and Technology_PI Research Group_Shenghua Gao Group
Author Affiliations | 1. ShanghaiTech University, China; 2. JD Explore Academy; 3. Meituan Inc.; 4. The University of Sydney, Australia; 5. Shanghai Engineering Research Center of Intelligent Vision and Imaging, China; 6. Shanghai Engineering Research Center of Energy Efficient and Custom AI IC, China
First Author Affiliation | ShanghaiTech University
First Author's Primary Affiliation | ShanghaiTech University
Recommended Citation (GB/T 7714) | Chen Xianing, Cao Qiong, Zhong Yujie, et al. DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers[C]. 2022.