SIRI: Spatial relation induced network for spatial description resolution
2020
Proceedings title: ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS
ISSN: 1049-5258
Volume: 2020-December
Publication status: Published
DOI: Unknown
Abstract

Spatial Description Resolution, a language-guided localization task, aims to locate a target in a panoramic street view given a corresponding language description. Explicitly characterizing object-level relationships while distilling spatial relationships is currently absent from, but crucial to, this task. Mimicking humans, who sequentially traverse spatial relationship words and objects from a first-person view to locate their target, we propose a novel spatial relationship induced (SIRI) network. Specifically, visual features are first correlated at an implicit object level in a projected latent space; they are then distilled by each spatial relationship word, yielding a differently activated feature for each spatial relationship. Further, we introduce global position priors to compensate for the absence of positional information, which may otherwise cause ambiguities in global positional reasoning. The linguistic and visual features are concatenated to finalize the target localization. Experimental results on the Touchdown dataset show that our method is around 24% better than the state-of-the-art method in terms of accuracy, measured within an 80-pixel radius. Our method also generalizes well to our proposed extended dataset, collected using the same settings as Touchdown. The code for this project is publicly available at https://github.com/wong-puiyiu/siri-sdr. © 2020 Neural information processing systems foundation. All rights reserved.
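
The abstract reports accuracy "measured by an 80-pixel radius" without spelling out the computation. A minimal sketch of one plausible reading follows: a prediction counts as correct when its Euclidean distance to the ground-truth point is at most 80 pixels. The function name `accuracy_at_radius`, the inclusive boundary, and the plain (non-wrapping) pixel distance are all assumptions made for illustration, not details taken from the paper or its code release.

```python
import math


def accuracy_at_radius(predictions, targets, radius=80.0):
    """Fraction of predicted (x, y) points lying within `radius` pixels
    of the matching ground-truth point.

    Assumptions (not from the source): Euclidean distance in image
    pixels, inclusive boundary, no wrap-around at the panorama edges.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        return 0.0
    hits = sum(
        1
        for (px, py), (tx, ty) in zip(predictions, targets)
        if math.hypot(px - tx, py - ty) <= radius
    )
    return hits / len(predictions)


# Hypothetical pixel coordinates, chosen only to exercise the function.
preds = [(100, 100), (500, 300), (10, 10), (250, 250)]
truth = [(130, 140), (700, 300), (10, 95), (250, 250)]
# Distances are 50, 200, 85 and 0 pixels, so two of four fall inside the radius.
print(accuracy_at_radius(preds, truth))  # 0.5
```

Under this reading, a tighter or looser radius only changes the `radius` argument; the reported figure corresponds to the default of 80.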

Proceedings editors / conference sponsors: Apple; et al.; Microsoft; PDT Partners; Sony; Tenstorrent
Keywords: Language description; Positional information; Spatial descriptions; Spatial relations; Spatial relationships; State-of-the-art methods; Target localization; Target location
Conference name: 34th Conference on Neural Information Processing Systems, NeurIPS 2020
Place of publication: 10010 NORTH TORREY PINES RD, LA JOLLA, CALIFORNIA 92037 USA
Conference location: Virtual, Online
Conference dates: December 6, 2020 - December 12, 2020
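Indexed in: EI; CPCI-S
Language: English
Funding: National Key R&D Program of China [2018AAA0100704]; NSFC [61932020, 61773272]; Science and Technology Commission of Shanghai Municipality [20ZR1436000]
WOS research area: Computer Science
WOS categories: Computer Science, Artificial Intelligence; Computer Science, Information Systems
WOS accession number: WOS:001207690605048
Publisher: Neural information processing systems foundation
EI accession number: 20212610553993
EI classification code: 722.1 Data Storage, Equipment and Techniques
Original document type: Conference article (CA)
Document type: Conference paper
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/251874
Collections: School of Information Science and Technology_PhD Students
School of Information Science and Technology_PI Research Groups_Shenghua Gao Group
Author affiliations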
1.ShanghaiTech University, China;
2.Dalian University of Technology, China;
3.Shanghai University, China;
4.Soochow University, China
First author's affiliation: ShanghaiTech University
First author's primary affiliation: ShanghaiTech University
Recommended citation
GB/T 7714
Wang, Peiyao,Luo, Weixin,Xu, Yanyu,et al. SIRI: Spatial relation induced network for spatial description resolution[C]//Apple, et al., Microsoft, PDT Partners, Sony, Tenstorrent. 10010 NORTH TORREY PINES RD, LA JOLLA, CALIFORNIA 92037 USA:Neural information processing systems foundation,2020.