ShanghaiTech University Knowledge Management System
CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection | |
2023-09-03 | |
Proceedings title | ARXIV |
ISSN | 1550-5499 |
Publication status | Published |
DOI | arXiv:2309.01093 |
Abstract | Task driven object detection aims to detect the object instances in an image that are suitable for affording a task. The challenge is that the object categories usable for a task are too diverse to be confined to the closed object vocabulary of traditional object detection. Simply mapping categories and visual features of common objects to the task cannot address this challenge. In this paper, we propose to explore fundamental affordances rather than object categories, i.e., common attributes that enable different objects to accomplish the same task. Moreover, we propose a novel multi-level chain-of-thought prompting (MLCoT) to extract affordance knowledge from large language models; it contains multi-level reasoning steps from task to object examples to essential visual attributes with rationales. Furthermore, to fully exploit this knowledge for object recognition and localization, we propose a knowledge-conditional detection framework, namely CoTDet, which conditions the detector on the knowledge to generate object queries and regress boxes. Experimental results demonstrate that CoTDet outperforms state-of-the-art methods consistently and significantly (+15.6 box AP and +14.8 mask AP) and can generate rationales for why objects are detected to afford the task. |
Conference location | Paris, France |
Conference date | 1-6 Oct. 2023 |
Funding project | National Natural Science Foundation of China[ |
WOS category | Computer Science, Software Engineering |
WOS accession number | PPRN:84731264 |
Source database | IEEE |
Document type | Conference paper |
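Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/348025 |
Collection | School of Information Science and Technology; School of Information Science and Technology_PI Research Groups_Jingyi Yu (虞晶怡) Group; School of Information Science and Technology_Master's Students; School of Information Science and Technology_PhD Students; School of Information Science and Technology_PI Research Groups_Sibei Yang (杨思蓓) Group |
Author affiliation | ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China |
First author affiliation | School of Information Science and Technology |
First author's primary affiliation | School of Information Science and Technology |
Recommended citation (GB/T 7714) | Tang, Jiajin,Zheng, Ge,Yu, Jingyi,et al. CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection[C],2023. |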