SGTR: End-to-end Scene Graph Generation with Transformer
2022
Proceedings title: PROCEEDINGS OF THE IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION
ISSN: 1063-6919
Volume: 2022-June
Pages: 19464-19474
Publication status: Published
DOI: 10.1109/CVPR52688.2022.01888
Abstract: Scene Graph Generation (SGG) remains a challenging visual understanding task due to its compositional property. Most previous works adopt a bottom-up two-stage or a point-based one-stage approach, which often suffers from high time complexity or sub-optimal designs. In this work, we propose a novel SGG method to address the aforementioned issues, formulating the task as a bipartite graph construction problem. To solve the problem, we develop a transformer-based end-to-end framework that first generates the entity and predicate proposal set, followed by inferring directed edges to form the relation triplets. In particular, we develop a new entity-aware predicate representation based on a structural predicate generator that leverages the compositional property of relationships. Moreover, we design a graph assembling module to infer the connectivity of the bipartite scene graph based on our entity-aware structure, enabling us to generate the scene graph in an end-to-end manner. Extensive experimental results show that our design is able to achieve state-of-the-art or comparable performance on two challenging benchmarks, surpassing most of the existing approaches and enjoying higher efficiency in inference. We hope our model can serve as a strong baseline for Transformer-based scene graph generation. Code is available at https://github.com/Scarecrow0/SGTR. © 2022 IEEE.
Conference name: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Place of publication: 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA
Conference location: New Orleans, LA, United States
Conference dates: June 19, 2022 - June 24, 2022
Indexed by: EI ; CPCI-S
Language: English
Funding project: Shanghai Science and Technology Program [21010502700]
WOS research area: Computer Science ; Imaging Science & Photographic Technology
WOS category: Computer Science, Artificial Intelligence ; Imaging Science & Photographic Technology
WOS accession number: WOS:000870783005029
Publisher: IEEE Computer Society
EI accession number: 20224613120337
Original document type: Conference article (CA)
Citation statistics
Times cited: 63 [WOS]
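Document type: Conference paper
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/248933
Collection: School of Information Science and Technology_PhD Students
School of Information Science and Technology_PI Research Groups_Xuming He Group
Corresponding author: Li, Rongjie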
Author affiliations
1.ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
2.Shanghai Engn Res Ctr Intelligent Vis & Imaging, Shanghai, Peoples R China
3.Chinese Acad Sci, Shanghai Inst Microsyst & Informat Technol, Beijing, Peoples R China
4.Univ Chinese Acad Sci, Beijing, Peoples R China
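First author's affiliation: School of Information Science and Technology
Corresponding author's affiliation: School of Information Science and Technology
First author's primary affiliation: School of Information Science and Technology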
Recommended citation
GB/T 7714
Li, Rongjie,Zhang, Songyang,He, Xuming. SGTR: End-to-end Scene Graph Generation with Transformer[C]. 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA:IEEE Computer Society,2022:19464-19474.
Files in this item
File name: 10.1109@CVPR52688.2022.01888.pdf
Format: Adobe PDF