ShanghaiTech University Knowledge Management System
CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention
2023-06-27
Proceedings Title | PROCEEDINGS OF THE 37TH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, AAAI 2023 |
ISSN | 2159-5399 |
Volume | 37 |
Pages | 746-754 |
Publication Status | Published |
Abstract | Contrastive Language-Image Pre-training (CLIP) has been shown to learn visual representations with promising zero-shot performance. To further improve its downstream accuracy, existing works propose additional learnable modules upon CLIP and fine-tune them by few-shot training sets. However, the resulting extra training cost and data requirement severely hinder the efficiency for model deployment and knowledge transfer. In this paper, we introduce a free-lunch enhancement method, CALIP, to boost CLIP’s zero-shot performance via a parameter-free Attention module. Specifically, we guide visual and textual representations to interact with each other and explore cross-modal informative features via attention. As the pre-training has largely reduced the embedding distances between two modalities, we discard all learnable parameters in the attention and bidirectionally update the multi-modal features, enabling the whole process to be parameter-free and training-free. In this way, the images are blended with textual-aware signals and the text representations become visual-guided for better adaptive zero-shot alignment. We evaluate CALIP on various benchmarks of 14 datasets for both 2D image and 3D point cloud few-shot classification, showing consistent zero-shot performance improvement over CLIP. Based on that, we further insert a small number of linear layers in CALIP’s attention module and verify our robustness under the few-shot settings, which also achieves leading performance compared to existing methods. Those extensive experiments demonstrate the superiority of our approach for efficient enhancement of CLIP. Code is available at https://github.com/ZiyuGuo99/CALIP. Copyright © 2023, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. |
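The abstract describes the attention mechanism only at a high level. For illustration, below is a minimal PyTorch sketch of a parameter-free, bidirectional cross-modal attention step of the kind described above. This is not the authors' implementation (see the GitHub link in the abstract); the tensor names Fv/Ft, the assumed shapes, and the smoothing scalar beta are illustrative assumptions.

    # Minimal sketch (illustrative, not the authors' code) of a
    # parameter-free bidirectional attention step between CLIP features.
    import torch
    import torch.nn.functional as F

    def parameter_free_attention(Fv: torch.Tensor, Ft: torch.Tensor,
                                 beta: float = 1.0):
        """Update visual and textual features using only their mutual
        similarity as attention weights; no learned projections are
        involved, so the step is parameter-free and training-free.

        Fv: [N, C] L2-normalized spatial visual features from CLIP.
        Ft: [K, C] L2-normalized per-class textual features from CLIP.
        beta: assumed (hypothetical) smoothing scalar for the softmax.
        """
        # Cross-modal affinity between every visual token and every class.
        attn = Fv @ Ft.t()                                    # [N, K]
        # Text-aware visual features: tokens aggregate class embeddings.
        Fv_updated = F.softmax(beta * attn, dim=-1) @ Ft      # [N, C]
        # Visual-guided textual features: classes aggregate visual tokens.
        Ft_updated = F.softmax(beta * attn.t(), dim=-1) @ Fv  # [K, C]
        return Fv_updated, Ft_updated

In the published method, zero-shot logits computed from updated features of this kind are combined with CLIP's original logits; that weighting is omitted here for brevity.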
Conference Host Country | United States |
Proceedings Editor / Conference Organizer | Association for the Advancement of Artificial Intelligence |
Keywords | Benchmarking ; Classification (of information) ; Knowledge management ; Visual languages ; Zero-shot learning ; Cost requirements ; Data requirements ; Downstream ; Performance ; Pre-training ; Training costs ; Training data ; Training sets ; Visual representations |
Conference Name | 37th AAAI Conference on Artificial Intelligence, AAAI 2023 |
Place of Publication | 2275 E BAYSHORE RD, STE 160, PALO ALTO, CA 94303 USA |
Conference Location | Washington, DC, United States |
Conference Dates | February 7, 2023 - February 14, 2023 |
Indexed By | EI ; CPCI-S
Language | English |
Funding Projects | NSFC (61832001, U22B2037) |
WOS Research Area | Computer Science
WOS Categories | Computer Science, Artificial Intelligence ; Computer Science, Interdisciplinary Applications ; Computer Science, Theory & Methods
WOS Accession Number | WOS:001243759700083
Publisher | AAAI Press
EI Accession Number | 20233314552792
EI Main Heading | Image enhancement
EISSN | 2374-3468 |
EI Classification Codes | 716.1 Information Theory and Signal Processing ; 723.1.1 Computer Programming Languages ; 723.5 Computer Applications ; 903.1 Information Sources and Analysis ; 903.3 Information Retrieval and Use
Original Document Type | Conference article (CA)
Document Type | Conference paper
Identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/325817
Collections | School of Information Science and Technology_Master's Students ; School of Information Science and Technology_PI Research Group_Xuming He Group ; School of Information Science and Technology_PhD Students
Corresponding Author | Zhang, Renrui
Author Affiliations | 1. School of CS and Key Lab of HCST, Peking University, China; 2. The Chinese University of Hong Kong, Hong Kong; 3. Shanghai AI Laboratory, China; 4. ShanghaiTech University, China; 5. Carnegie Mellon University, United States
Recommended Citation (GB/T 7714) | Guo, Ziyu, Zhang, Renrui, Qiu, Longtian, et al. CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention[C]//Association for the Advancement of Artificial Intelligence. 2275 E BAYSHORE RD, STE 160, PALO ALTO, CA 94303 USA: AAAI Press, 2023: 746-754.
Files in This Item |
File Name/Size | Document Type | Version Type | Open Access Type | License
Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.