ShanghaiTech University Knowledge Management System
THE DEVIL IS IN THE OBJECT BOUNDARY: TOWARDS ANNOTATION-FREE INSTANCE SEGMENTATION USING FOUNDATION MODELS | |
2024 | |
Proceedings title | 12TH INTERNATIONAL CONFERENCE ON LEARNING REPRESENTATIONS, ICLR 2024 |
Abstract | Foundation models, pre-trained on a large amount of data, have demonstrated impressive zero-shot capabilities in various downstream tasks. However, in object detection and instance segmentation, two fundamental computer vision tasks heavily reliant on extensive human annotations, foundation models such as SAM and DINO struggle to achieve satisfactory performance. In this study, we reveal that the devil is in the object boundary, i.e., these foundation models fail to discern boundaries between individual objects. For the first time, we probe that CLIP, which has never accessed any instance-level annotations, can provide a highly beneficial and strong instance-level boundary prior in the clustering results of its particular intermediate layer. Following this surprising observation, we propose Zip, which Zips up CLIP and SAM in a novel classification-first-then-discovery pipeline, enabling annotation-free, complex-scene-capable, open-vocabulary object detection and instance segmentation. Our Zip significantly boosts SAM's mask AP on the COCO dataset by 12.5% and establishes state-of-the-art performance in various settings, including training-free, self-training, and label-efficient finetuning. Furthermore, annotation-free Zip even achieves comparable performance to the best-performing open-vocabulary object detectors using base annotations. Code is released at https://github.com/ChengShiest/Zip-Your-CLIP. © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved. |
Keywords | Clustering algorithms; Foundations; Object recognition; Zero-shot learning; Clustering results; Down-stream; Foundation models; Human annotations; Individual objects; Intermediate layers; Large amounts of data; Object boundaries; Objects detection; Performance |
Conference name | 12th International Conference on Learning Representations, ICLR 2024 |
Conference location | Hybrid, Vienna, Austria |
Conference date | May 7, 2024 - May 11, 2024 |
Indexed by | EI |
Language | English |
Publisher | International Conference on Learning Representations, ICLR |
EI accession number | 20243216835513 |
EI main heading | Object detection |
EI classification codes | 483.2 Foundations; 723.2 Data Processing and Image Processing; 903.1 Information Sources and Analysis |
Original document type | Conference article (CA) |
Document type | Conference paper |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/411256 |
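Collection | School of Information Science and Technology; School of Information Science and Technology_Master's Students; School of Information Science and Technology_PI Research Groups_Yang Sibei Group |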
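Corresponding author | Yang, Sibei |
Author affiliation | School of Information Science and Technology, ShanghaiTech University, China |
First author affiliation | School of Information Science and Technology |
Corresponding author affiliation | School of Information Science and Technology |
First affiliation of first author | School of Information Science and Technology |
Recommended citation (GB/T 7714) | Shi, Cheng,Yang, Sibei. THE DEVIL IS IN THE OBJECT BOUNDARY: TOWARDS ANNOTATION-FREE INSTANCE SEGMENTATION USING FOUNDATION MODELS[C]:International Conference on Learning Representations, ICLR,2024. |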