From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models
2024-04-24
Proceedings title: ARXIV
ISSN: 1063-6919
Publication status: Published
DOI: arXiv:2404.00906
Abstract

Scene graph generation (SGG) aims to parse a visual scene into an intermediate graph representation for downstream reasoning tasks. Despite recent advancements, existing methods struggle to generate scene graphs with novel visual relation concepts. To address this challenge, we introduce a new open-vocabulary SGG framework based on sequence generation. Our framework leverages vision-language pre-trained models (VLM) by incorporating an image-to-graph generation paradigm. Specifically, we generate scene graph sequences via image-to-text generation with VLM and then construct scene graphs from these sequences. By doing so, we harness the strong capabilities of VLM for open-vocabulary SGG and seamlessly integrate explicit relational modeling for enhancing the VL tasks. Experimental results demonstrate that our design not only achieves superior performance with an open vocabulary but also enhances downstream vision-language task performance through explicit relation modeling knowledge.
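
The abstract's second stage, constructing a scene graph from a generated text sequence, can be illustrated with a minimal sketch. The serialization used here (`subject <rel> predicate <obj> object`, with triplets separated by `;`) and the helper names `parse_sequence` and `build_graph` are assumptions made for illustration only; they are not the sequence format or code described in the paper, and the VLM generation step is replaced by a hard-coded string.

```python
import re
from collections import defaultdict

# Hypothetical serialization: "subject <rel> predicate <obj> object", joined by ";".
# This format is an assumption for illustration, not the paper's sequence design.
TRIPLET_RE = re.compile(r"^\s*(.+?)\s*<rel>\s*(.+?)\s*<obj>\s*(.+?)\s*$")


def parse_sequence(sequence):
    """Split generated text into (subject, predicate, object) triplets."""
    triplets = []
    for chunk in sequence.split(";"):
        match = TRIPLET_RE.match(chunk)
        if match:  # skip malformed chunks instead of failing
            triplets.append(tuple(part.lower() for part in match.groups()))
    return triplets


def build_graph(triplets):
    """Build an adjacency map: subject -> list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subj, pred, obj in triplets:
        if (pred, obj) not in graph[subj]:  # drop duplicate edges
            graph[subj].append((pred, obj))
    return dict(graph)


# Stand-in for text a VLM might emit; hard-coded so the sketch is runnable.
generated = (
    "man <rel> riding <obj> horse ; "
    "horse <rel> standing on <obj> grass ; "
    "man <rel> riding <obj> horse"
)
scene_graph = build_graph(parse_sequence(generated))
print(scene_graph)
```

Because predicates are kept as free text instead of being mapped to a fixed label set, a parser of this kind places no restriction on the relation vocabulary, which is the property an open-vocabulary setting relies on.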

Conference location: Seattle, WA, USA
Conference dates: 16-22 June 2024
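Funding: NSFC[
WOS category: Computer Science, Software Engineering
WOS accession number: PPRN:88360823
Source database: IEEE
Document type: Conference paper
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/372923
Collections: School of Information Science and Technology
School of Information Science and Technology_PI Research Groups_何旭明 (He Xuming) Group
School of Information Science and Technology_PhD Students
Corresponding author: Li, Rongjie
Author affiliations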
1.ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
2.Shanghai AI Lab, Shanghai, Peoples R China
3.Shanghai Engn Res Ctr Intelligent Vis & Imaging, Shanghai, Peoples R China
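First author affiliation: School of Information Science and Technology
Corresponding author affiliation: School of Information Science and Technology
First author's primary affiliation: School of Information Science and Technology
Recommended citation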
GB/T 7714
Li, Rongjie,Zhang, Songyang,Lin, Dahua,et al. From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models[C],2024.