SGTR+: End-to-End Scene Graph Generation With Transformer
2024-04
Journal: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (IF: 20.8 [JCR-2023], 22.2 [5-Year])
ISSN: 1939-3539
EISSN: 1939-3539
Volume: 46, Issue: 4, Pages: 2191-2205
Publication status: Published
DOI: 10.1109/TPAMI.2023.3332246
Abstract

Scene Graph Generation (SGG) remains a challenging visual understanding task due to its compositional property. Most previous works adopt either a bottom-up, two-stage approach or a point-based, one-stage approach, which often suffer from high time complexity or suboptimal designs. In this paper, we propose a novel SGG method that addresses these issues by formulating the task as a bipartite graph construction problem. Specifically, we create a transformer-based end-to-end framework to generate the entity and entity-aware predicate proposal sets, and infer directed edges to form relation triplets. Moreover, we design a graph assembling module to infer the connectivity of the bipartite scene graph based on our entity-aware structure, enabling us to generate the scene graph in an end-to-end manner. Building on the bipartite graph assembling paradigm, we further propose a new technical design to improve the efficacy of entity-aware modeling and the optimization stability of graph assembling. Equipped with the enhanced entity-aware design, our method achieves a favorable balance between performance and time complexity. Extensive experimental results show that our design achieves state-of-the-art or comparable performance on three challenging benchmarks, surpassing most existing approaches while offering higher inference efficiency. © 1979-2012 IEEE.
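
To make the bipartite-graph formulation in the abstract concrete, here is a minimal sketch of a graph assembling step. It is not the authors' implementation. It assumes that each predicate proposal carries two toy "indicator" vectors (one for its subject, one for its object) and that these are matched to entity proposals by cosine similarity. All names (`Entity`, `Predicate`, `subject_hint`, `object_hint`, `assemble`, `top_k`) and all feature values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Entity:
    label: str
    feature: tuple  # toy embedding of an entity proposal (assumption)


@dataclass
class Predicate:
    label: str
    subject_hint: tuple  # toy entity-aware indicator for the subject (assumption)
    object_hint: tuple   # toy entity-aware indicator for the object (assumption)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assemble(entities, predicates, top_k=1):
    """Link each predicate node to its top-k subject and object entity nodes.

    Returns (subject, predicate, object, score) tuples sorted by score,
    i.e. the directed edges of a bipartite entity/predicate graph.
    """
    triplets = []
    for pred in predicates:
        subj = sorted(
            ((cosine(pred.subject_hint, e.feature), i) for i, e in enumerate(entities)),
            reverse=True,
        )[:top_k]
        obj = sorted(
            ((cosine(pred.object_hint, e.feature), i) for i, e in enumerate(entities)),
            reverse=True,
        )[:top_k]
        for s_score, s_idx in subj:
            for o_score, o_idx in obj:
                if s_idx == o_idx:
                    continue  # skip self-relations
                triplets.append(
                    (entities[s_idx].label, pred.label, entities[o_idx].label, s_score * o_score)
                )
    triplets.sort(key=lambda t: t[3], reverse=True)
    return triplets


entities = [
    Entity("person", (1.0, 0.0, 0.0)),
    Entity("horse", (0.0, 1.0, 0.0)),
    Entity("hat", (0.0, 0.0, 1.0)),
]
predicates = [
    Predicate("riding", (0.9, 0.1, 0.0), (0.1, 0.9, 0.0)),
    Predicate("wearing", (1.0, 0.0, 0.1), (0.0, 0.1, 1.0)),
]
triplets = assemble(entities, predicates)
for subject, predicate, obj_label, score in triplets:
    print(subject, predicate, obj_label, round(score, 3))
```

In this toy setting each predicate is linked to exactly one subject and one object because `top_k` is 1. Raising `top_k` yields more candidate triplets per predicate, ranked by the product of the two matching scores.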

Keywords: Computer vision; deep learning; scene graph generation; scene understanding; visual relationship detection; Benchmarking; Deep learning; Graph theory; Graphic methods; Job analysis; Bipartite graphs; Decoding; Generator; Graph generation; Proposal; Scene graph generation; Scene understanding; Scene-graphs; Task analysis; Transformer; Visual relationship detection
Indexed by: EI
Language: English
Publisher: IEEE Computer Society
EI accession number: 20234715088289
EI main heading: Computer vision
EI classification codes: 461.4 Ergonomics and Human Factors Engineering; 723.5 Computer Applications; 741.2 Vision; 921.4 Combinatorial Mathematics, Includes Graph Theory, Set Theory
Original document type: Journal article (JA)
Source database: IEEE
Document type: Journal article
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/347906
Collections: School of Information Science and Technology_PhD Students
School of Information Science and Technology_PI Research Groups_Xuming He Group
Corresponding author: He, Xuming
Author affiliations:
1. ShanghaiTech University, Shanghai, China;
2. Shanghai AI Laboratory, Xuhui, Shanghai, China;
3. ShanghaiTech University, China
First author affiliation: ShanghaiTech University
Corresponding author affiliation: ShanghaiTech University
First affiliation of the first author: ShanghaiTech University
Recommended citation
GB/T 7714
Li, Rongjie, Zhang, Songyang, He, Xuming. SGTR+: End-to-End Scene Graph Generation With Transformer[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46(4): 2191-2205.
APA Li, Rongjie, Zhang, Songyang, & He, Xuming. (2024). SGTR+: End-to-End Scene Graph Generation With Transformer. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 46(4), 2191-2205.
MLA Li, Rongjie, et al. "SGTR+: End-to-End Scene Graph Generation With Transformer." IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 46.4 (2024): 2191-2205.
File name: 10.1109@TPAMI.2023.3332246.pdf
Format: Adobe PDF