ShanghaiTech University Knowledge Management System
Temporal Collection and Distribution for Referring Video Object Segmentation | |
2023-10-06 | |
Proceedings title | 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)
ISSN | 1550-5499 |
Publication status | Published |
DOI | 10.1109/ICCV51070.2023.01418 |
Abstract | Referring video object segmentation aims to segment a referent throughout a video sequence according to a natural language expression. It requires aligning the natural language expression with the objects’ motions and their dynamic associations at the global video level, while segmenting objects at the frame level. To achieve this goal, we propose to simultaneously maintain a global referent token and a sequence of object queries, where the former is responsible for capturing the video-level referent according to the language expression, while the latter serves to better locate and segment objects within each frame. Furthermore, to explicitly capture object motions and spatial-temporal cross-modal reasoning over objects, we propose a novel temporal collection-distribution mechanism for interaction between the global referent token and the object queries. Specifically, the temporal collection mechanism collects global information for the referent token, moving from object queries to temporal motions to the language expression. In turn, the temporal distribution mechanism first distributes the referent token to the referent sequence across all frames and then performs efficient cross-frame reasoning between the referent sequence and the object queries in every frame. Experimental results show that our method consistently and significantly outperforms state-of-the-art methods on all benchmarks. |
Keywords | Computer vision; Motion segmentation; Natural languages; Video sequences; Dynamics; Object segmentation; Benchmark testing |
Conference location | Paris, France |
Conference dates | 1-6 Oct. 2023 |
Language | English |
Source database | IEEE |
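Document type | Conference paper |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/346078 |
Collections | School of Information Science and Technology; School of Information Science and Technology_Master's students; School of Information Science and Technology_PhD students; School of Information Science and Technology_PI research groups_Yang SB (杨思蓓) group |
Corresponding author | Yang SB (杨思蓓) |
Author affiliation | School of Information Science and Technology |
Recommended citation (GB/T 7714) | Tang JJ, Zheng G, Yang SB. Temporal Collection and Distribution for Referring Video Object Segmentation[C], 2023. |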