InterGen: Diffusion-Based Multi-human Motion Generation Under Complex Interactions
Year: 2024
Journal: INTERNATIONAL JOURNAL OF COMPUTER VISION (IF: 11.6 [JCR 2023]; 5-Year IF: 14.5)
ISSN: 0920-5691
EISSN: 1573-1405
Volume: 132, Issue: 9, Pages: 3463-3483
DOI: 10.1007/s11263-024-02042-6
Abstract: We have recently seen tremendous progress in diffusion advances for generating realistic human motions. Yet, they largely disregard multi-human interactions. In this paper, we present InterGen, an effective diffusion-based approach that enables layman users to customize high-quality two-person interaction motions with only text guidance. We first contribute a multimodal dataset, named InterHuman. It consists of about 107M frames of diverse two-person interactions, with accurate skeletal motions and 23,337 natural language descriptions. On the algorithm side, we carefully tailor the motion diffusion model to our two-person interaction setting. To handle the symmetry of human identities during interactions, we propose two cooperative transformer-based denoisers that explicitly share weights, with a mutual attention mechanism to further connect the two denoising processes. Then, we propose a novel representation for motion input in our interaction diffusion model, which explicitly formulates the global relations between the two performers in the world frame. We further introduce two novel regularization terms to encode spatial relations, equipped with a corresponding damping scheme during the training of our interaction diffusion model. Extensive experiments validate the effectiveness of InterGen (https://tr3e.github.io/intergen-page/). Notably, it can generate more diverse and compelling two-person motions than previous methods and enables various downstream applications for human interactions. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
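The abstract outlines the core architecture: two transformer-based denoisers that share weights and exchange information through a mutual attention mechanism, so the two interacting persons are treated symmetrically. Below is a minimal PyTorch sketch of that coupling, not the authors' released code; the module names, feature dimensions, block structure, and the omission of text/timestep conditioning are illustrative assumptions.

```python
# Minimal sketch (assumed PyTorch implementation, not the authors' release) of two
# weight-sharing transformer denoisers coupled by mutual attention. Dimensions,
# block structure, and the missing text/timestep conditioning are assumptions.
import torch
import torch.nn as nn


class MutualAttentionBlock(nn.Module):
    """Self-attention over one person's motion tokens plus cross-attention
    to the partner's tokens (the mutual-attention coupling)."""

    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.norm1, self.norm2, self.norm3 = (nn.LayerNorm(dim) for _ in range(3))

    def forward(self, x: torch.Tensor, partner: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, dim) tokens being denoised; partner: the other person's tokens.
        h = self.norm1(x)
        x = x + self.self_attn(h, h, h)[0]
        q, kv = self.norm2(x), self.norm2(partner)
        x = x + self.cross_attn(q, kv, kv)[0]
        return x + self.ffn(self.norm3(x))


class CooperativeDenoiser(nn.Module):
    """A single transformer stack applied to both persons with shared weights,
    so the prediction is symmetric under swapping the two identities."""

    def __init__(self, motion_dim: int = 262, dim: int = 512, depth: int = 4):
        super().__init__()
        self.in_proj = nn.Linear(motion_dim, dim)
        self.blocks = nn.ModuleList(MutualAttentionBlock(dim) for _ in range(depth))
        self.out_proj = nn.Linear(dim, motion_dim)

    def forward(self, motion_a: torch.Tensor, motion_b: torch.Tensor):
        # motion_a, motion_b: (batch, frames, motion_dim) noisy motions at one diffusion step.
        a, b = self.in_proj(motion_a), self.in_proj(motion_b)
        for blk in self.blocks:
            # The same block (same weights) denoises both persons, each attending to the other.
            a, b = blk(a, b), blk(b, a)
        return self.out_proj(a), self.out_proj(b)


if __name__ == "__main__":
    denoiser = CooperativeDenoiser()
    x_a, x_b = torch.randn(2, 64, 262), torch.randn(2, 64, 262)
    out_a, out_b = denoiser(x_a, x_b)
    print(out_a.shape, out_b.shape)  # torch.Size([2, 64, 262]) twice
```

Because both streams pass through the same weights and are updated simultaneously, exchanging motion_a and motion_b simply exchanges the two outputs, which is the identity symmetry the abstract refers to.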
Keywords: Air navigation; Diffusion model; Effective diffusion; Human motions; Human interaction; Motion generation; Motion synthesis; Multi-modal; Multimodal generation; Realistic human motion; Text-driven generation
Indexed By: EI
Language: English
Publisher: Springer
EI Accession Number: 20241315818941
EI Subject Terms: Diffusion
EI Classification Code: 431.5 Air Navigation and Traffic Control
Original Document Type: Article in Press
Document Type: Journal article
Item Identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/359842
Collections: School of Information Science and Technology_PI Research Group_Lan Xu Group
School of Information Science and Technology_PI Research Group_Jingyi Yu Group
School of Information Science and Technology_Master's Students
School of Information Science and Technology_Undergraduates
School of Information Science and Technology_Doctoral Students
Corresponding Author: Xu, Lan
Author Affiliations:
ShanghaiTech University, Shanghai, China
First Author Affiliation: ShanghaiTech University
Corresponding Author Affiliation: ShanghaiTech University
First Author's First Affiliation: ShanghaiTech University
Recommended Citation:
GB/T 7714
Liang, Han, Zhang, Wenqian, Li, Wenxuan, et al. InterGen: Diffusion-Based Multi-human Motion Generation Under Complex Interactions[J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132(9): 3463-3483.
APA Liang, Han, Zhang, Wenqian, Li, Wenxuan, Yu, Jingyi, & Xu, Lan. (2024). InterGen: Diffusion-Based Multi-human Motion Generation Under Complex Interactions. INTERNATIONAL JOURNAL OF COMPUTER VISION, 132(9), 3463-3483.
MLA Liang, Han, et al. "InterGen: Diffusion-Based Multi-human Motion Generation Under Complex Interactions". INTERNATIONAL JOURNAL OF COMPUTER VISION 132.9 (2024): 3463-3483.
Files in This Item:
File Name: 10.1007@s11263-024-02042-6.pdf
Format: Adobe PDF