Generative Modeling of Audible Shapes for Object Perception
2017
Proceedings title: 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)
ISSN: 2380-7504
Volume: 2017-October
Pages: 1260-1269
Publication status: Published
DOI: 10.1109/ICCV.2017.141
Abstract: Humans infer rich knowledge of objects from both auditory and visual cues. Building a machine of such competency, however, is very challenging, due to the great difficulty in capturing large-scale, clean data of objects with both their appearance and the sound they make. In this paper, we present a novel, open-source pipeline that generates audiovisual data, purely from 3D object shapes and their physical properties. Through comparison with audio recordings and human behavioral studies, we validate the accuracy of the sounds it generates. Using this generative model, we are able to construct a synthetic audio-visual dataset, namely Sound-20K, for object perception tasks. We demonstrate that auditory and visual information play complementary roles in object perception, and further, that the representation learned on synthetic audio-visual data can transfer to real-world scenarios.
Place of publication: 345 E 47TH ST, NEW YORK, NY 10017 USA
Conference location: Venice, Italy
Conference date: 22-29 Oct. 2017
Indexed by: CPCI; EI
Language: English
Funding: Center for Brains, Minds and Machines (NSF STC award) [CCF-1231216]
WOS research areas: Computer Science; Engineering
WOS categories: Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
WOS accession number: WOS:000425498401034
Publisher: IEEE
EI accession number: 20180704804044
EI main headings: Behavioral research; Computer vision
EI classification codes: Computer Applications: 723.5; Acoustic Waves: 751.1; Social Sciences: 971
WOS keywords: SOUNDS; MOTION
Original document type: Proceedings Paper
Source database: IEEE
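Document type: Conference paper
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/16300
Collection: School of Information Science and Technology
School of Information Science and Technology_Undergraduates
Corresponding author: Zhang, Zhoutong
Author affiliations: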
1.MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
2.Univ Cambridge, Cambridge, England
3.ShanghaiTech Univ, Shanghai, Peoples R China
4.Google Res, Mountain View, CA USA
Recommended citation:
GB/T 7714
Zhang, Zhoutong,Wu, Jiajun,Li, Qiujia,et al. Generative Modeling of Audible Shapes for Object Perception[C]. 345 E 47TH ST, NEW YORK, NY 10017 USA:IEEE,2017:1260-1269.