ShanghaiTech University Knowledge Management System
Title | SGTR: End-to-end Scene Graph Generation with Transformer |
Year | 2022 |
Proceedings title | PROCEEDINGS OF THE IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION |
ISSN | 1063-6919 |
Volume | 2022-June |
Pages | 19464-19474 |
Publication status | Published |
DOI | 10.1109/CVPR52688.2022.01888 |
Abstract | Scene Graph Generation (SGG) remains a challenging visual understanding task due to its compositional property. Most previous works adopt a bottom-up two-stage or a point-based one-stage approach, which often suffers from high time complexity or sub-optimal designs. In this work, we propose a novel SGG method to address the aforementioned issues, formulating the task as a bipartite graph construction problem. To solve the problem, we develop a transformer-based end-to-end framework that first generates the entity and predicate proposal set, followed by inferring directed edges to form the relation triplets. In particular, we develop a new entity-aware predicate representation based on a structural predicate generator that leverages the compositional property of relationships. Moreover, we design a graph assembling module to infer the connectivity of the bipartite scene graph based on our entity-aware structure, enabling us to generate the scene graph in an end-to-end manner. Extensive experimental results show that our design is able to achieve the state-of-the-art or comparable performance on two challenging benchmarks, surpassing most of the existing approaches and enjoying higher efficiency in inference. We hope our model can serve as a strong baseline for the Transformer-based scene graph generation. Code is available at https://github.com/Scarecrow0/SGTR. © 2022 IEEE. |
Conference name | 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022 |
Place of publication | 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA |
Conference location | New Orleans, LA, United States |
Conference dates | June 19, 2022 - June 24, 2022 |
Indexed by | EI ; CPCI-S |
Language | English |
Funding | Shanghai Science and Technology Program[21010502700] |
WOS research areas | Computer Science ; Imaging Science & Photographic Technology |
WOS categories | Computer Science, Artificial Intelligence ; Imaging Science & Photographic Technology |
WOS accession number | WOS:000870783005029 |
Publisher | IEEE Computer Society |
EI accession number | 20224613120337 |
Original document type | Conference article (CA) |
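Document type | Conference paper |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/248933 |
Collections | School of Information Science and Technology_PhD Students; School of Information Science and Technology_PI Research Groups_Xuming He Group |
Corresponding author | Li, Rongjie |
Author affiliations | 1. ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China; 2. Shanghai Engn Res Ctr Intelligent Vis & Imaging, Shanghai, Peoples R China; 3. Chinese Acad Sci, Shanghai Inst Microsyst & Informat Technol, Beijing, Peoples R China; 4. Univ Chinese Acad Sci, Beijing, Peoples R China |
First author affiliation | School of Information Science and Technology |
Corresponding author affiliation | School of Information Science and Technology |
First affiliation of the first author | School of Information Science and Technology |
Recommended citation (GB/T 7714) | Li, Rongjie,Zhang, Songyang,He, Xuming. SGTR: End-to-end Scene Graph Generation with Transformer[C]. 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA:IEEE Computer Society,2022:19464-19474. |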