THE DEVIL IS IN THE OBJECT BOUNDARY: TOWARDS ANNOTATION-FREE INSTANCE SEGMENTATION USING FOUNDATION MODELS
2024
Proceedings title: 12TH INTERNATIONAL CONFERENCE ON LEARNING REPRESENTATIONS, ICLR 2024
Abstract: Foundation models, pre-trained on a large amount of data, have demonstrated impressive zero-shot capabilities in various downstream tasks. However, in object detection and instance segmentation, two fundamental computer vision tasks heavily reliant on extensive human annotations, foundation models such as SAM and DINO struggle to achieve satisfactory performance. In this study, we reveal that the devil is in the object boundary, i.e., these foundation models fail to discern boundaries between individual objects. For the first time, we probe that CLIP, which has never accessed any instance-level annotations, can provide a highly beneficial and strong instance-level boundary prior in the clustering results of its particular intermediate layer. Following this surprising observation, we propose Zip, which Zips up CLip and SAM in a novel classification-first-then-discovery pipeline, enabling annotation-free, complex-scene-capable, open-vocabulary object detection and instance segmentation. Our Zip significantly boosts SAM's mask AP on the COCO dataset by 12.5% and establishes state-of-the-art performance in various settings, including training-free, self-training, and label-efficient finetuning. Furthermore, annotation-free Zip even achieves comparable performance to the best-performing open-vocabulary object detectors using base annotations. Code is released at https://github.com/ChengShiest/Zip-Your-CLIP. © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved.
Keywords: Clustering algorithms; Foundations; Object recognition; Zero-shot learning; Clustering results; Down-stream; Foundation models; Human annotations; Individual objects; Intermediate layers; Large amounts of data; Object boundaries; Objects detection; Performance
Conference name: 12th International Conference on Learning Representations, ICLR 2024
Conference location: Hybrid, Vienna, Austria
Conference date: May 7, 2024 - May 11, 2024
Indexed by: EI
Language: English
Publisher: International Conference on Learning Representations, ICLR
EI accession number: 20243216835513
EI main heading: Object detection
EI classification codes: 483.2 Foundations ; 723.2 Data Processing and Image Processing ; 903.1 Information Sources and Analysis
Original document type: Conference article (CA)
Document type: Conference paper
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/411256
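Collections: School of Information Science and Technology
School of Information Science and Technology_Master's Students
School of Information Science and Technology_PI Research Groups_Sibei Yang Group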
Corresponding author: Yang, Sibei
Author affiliation
School of Information Science and Technology, ShanghaiTech University, China
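First author affiliation: School of Information Science and Technology
Corresponding author affiliation: School of Information Science and Technology
First author's primary affiliation: School of Information Science and Technology
Recommended citation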
GB/T 7714
Shi, Cheng,Yang, Sibei. THE DEVIL IS IN THE OBJECT BOUNDARY: TOWARDS ANNOTATION-FREE INSTANCE SEGMENTATION USING FOUNDATION MODELS[C]:International Conference on Learning Representations, ICLR,2024.