CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention
2023-06-27
Proceedings title: PROCEEDINGS OF THE 37TH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, AAAI 2023
ISSN: 2159-5399
Volume: 37
Pages: 746-754
Publication status: Published
Abstract

Contrastive Language-Image Pre-training (CLIP) has been shown to learn visual representations with promising zero-shot performance. To further improve its downstream accuracy, existing works propose additional learnable modules upon CLIP and fine-tune them on few-shot training sets. However, the resulting extra training cost and data requirement severely hinder the efficiency of model deployment and knowledge transfer. In this paper, we introduce a free-lunch enhancement method, CALIP, to boost CLIP's zero-shot performance via a parameter-free Attention module. Specifically, we guide visual and textual representations to interact with each other and explore cross-modal informative features via attention. As the pre-training has largely reduced the embedding distances between the two modalities, we discard all learnable parameters in the attention and bidirectionally update the multi-modal features, making the whole process parameter-free and training-free. In this way, the images are blended with textual-aware signals and the text representations become visual-guided for better adaptive zero-shot alignment. We evaluate CALIP on benchmarks covering 14 datasets for both 2D image and 3D point cloud few-shot classification, showing consistent zero-shot performance improvement over CLIP. Based on that, we further insert a small number of linear layers into CALIP's attention module and verify its robustness under few-shot settings, where it also achieves leading performance compared to existing methods. These extensive experiments demonstrate the superiority of our approach for efficient enhancement of CLIP. Code is available at https://github.com/ZiyuGuo99/CALIP. Copyright © 2023, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
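The parameter-free, bidirectional cross-modal attention described in the abstract can be sketched roughly as below. This is an illustrative sketch only: the variable names F_v, F_t, the temperature beta, and the function name are assumptions for exposition, not the paper's exact formulation, and the final mixing of updated features into CLIP's zero-shot logits is omitted (see the paper for the precise weighting).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def parameter_free_attention(F_v, F_t, beta=1.0):
    """Bidirectional, parameter-free cross-modal attention (illustrative sketch).

    F_v: (N_v, C) spatial visual features from CLIP's image encoder.
    F_t: (N_t, C) textual features from CLIP's text encoder (one row per class prompt).
    beta: softmax temperature (hypothetical default).
    No learnable projections are used: attention weights come directly from
    the similarity of the pre-aligned CLIP features.
    """
    # Cross-modal similarity between every visual token and every text embedding.
    A = F_v @ F_t.T                                  # (N_v, N_t)

    # Text-aware update of visual features: each visual token aggregates text features.
    F_v_new = softmax(A * beta, axis=-1) @ F_t       # (N_v, C)

    # Visual-guided update of textual features: each text embedding aggregates visual tokens.
    F_t_new = softmax(A.T * beta, axis=-1) @ F_v     # (N_t, C)

    return F_v_new, F_t_new
```

Because the attention reuses CLIP's own embeddings with no extra weights, the enhancement requires neither training data nor fine-tuning, which is the "free-lunch" property the abstract emphasizes.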

Conference organizer / proceedings editor: Association for the Advancement of Artificial Intelligence
Keywords: Benchmarking; Classification (of information); Knowledge management; Visual languages; Zero-shot learning; Cost requirements; Data requirements; Down-stream; Learn+; Performance; Pre-training; Training costs; Training data; Training sets; Visual representations
Conference name: 37th AAAI Conference on Artificial Intelligence, AAAI 2023
Place of publication: 2275 E BAYSHORE RD, STE 160, PALO ALTO, CA 94303, USA
Conference location: Washington, DC, United States
Conference dates: February 7-14, 2023
Indexed by: EI; CPCI-S
Language: English
Funding: NSFC (61832001, U22B2037)
WOS research area: Computer Science
WOS categories: Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications; Computer Science, Theory & Methods
WOS accession number: WOS:001243759700083
Publisher: AAAI Press
EI accession number: 20233314552792
EI controlled terms: Image enhancement
EISSN: 2374-3468
EI classification codes: 716.1 Information Theory and Signal Processing; 723.1.1 Computer Programming Languages; 723.5 Computer Applications; 903.1 Information Sources and Analysis; 903.3 Information Retrieval and Use
Original document type: Conference article (CA)
Document type: Conference paper
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/325817
Collections: School of Information Science and Technology_Master's Students
School of Information Science and Technology_PI Research Groups_Xuming He Group
School of Information Science and Technology_PhD Students
Corresponding author: Zhang, Renrui
Author affiliations:
1.School of CS and Key Lab of HCST, Peking University, China;
2.The Chinese University of Hong Kong, Hong Kong;
3.Shanghai AI Laboratory, China;
4.ShanghaiTech University, China;
5.Carnegie Mellon University, United States
Recommended citation (GB/T 7714):
Guo, Ziyu,Zhang, Renrui,Qiu, Longtian,et al. CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention[C]//Association for the Advancement of Artificial Intelligence. 2275 E BAYSHORE RD, STE 160, PALO ALTO, CA 94303 USA:AAAI Press,2023:746-754.