ShanghaiTech University Knowledge Management System
InterGen: Diffusion-Based Multi-human Motion Generation Under Complex Interactions | |
2024 | |
Journal | INTERNATIONAL JOURNAL OF COMPUTER VISION (IF: 11.6 [JCR-2023], 14.5 [5-Year]) |
ISSN | 0920-5691 |
EISSN | 1573-1405 |
Volume | 132, Issue: 9, Pages: 3463-3483 |
DOI | 10.1007/s11263-024-02042-6 |
Abstract | We have recently seen tremendous progress in diffusion advances for generating realistic human motions. Yet, they largely disregard the multi-human interactions. In this paper, we present InterGen, an effective diffusion-based approach that enables layman users to customize high-quality two-person interaction motions, with only text guidance. We first contribute a multimodal dataset, named InterHuman. It consists of about 107 M frames for diverse two-person interactions, with accurate skeletal motions and 23,337 natural language descriptions. For the algorithm side, we carefully tailor the motion diffusion model to our two-person interaction setting. To handle the symmetry of human identities during interactions, we propose two cooperative transformer-based denoisers that explicitly share weights, with a mutual attention mechanism to further connect the two denoising processes. Then, we propose a novel representation for motion input in our interaction diffusion model, which explicitly formulates the global relations between the two performers in the world frame. We further introduce two novel regularization terms to encode spatial relations, equipped with a corresponding damping scheme during the training of our interaction diffusion model. Extensive experiments validate the effectiveness of InterGen (https://tr3e.github.io/intergen-page/). Notably, it can generate more diverse and compelling two-person motions than previous methods and enables various downstream applications for human interactions. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024. |
Keywords | Air navigation; Diffusion model; Effective diffusion; Human motions; Human interaction; Motion generation; Motion synthesis; Multi-modal; Multimodal generation; Realistic human motion; Text-driven generation |
Indexed by | EI |
Language | English |
Publisher | Springer |
EI accession number | 20241315818941 |
EI main heading | Diffusion |
EI classification code | 431.5 Air Navigation and Traffic Control |
Original document type | Article in Press |
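Document type | Journal article |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/359842 |
Collections | School of Information Science and Technology_PI Research Groups_Lan Xu Group; School of Information Science and Technology_PI Research Groups_Jingyi Yu Group; School of Information Science and Technology_Master's Students; School of Information Science and Technology_Undergraduate Students; School of Information Science and Technology_PhD Students |
Corresponding author | Xu, Lan |
Author affiliation | ShanghaiTech University, Shanghai, China |
First author affiliation | ShanghaiTech University |
Corresponding author affiliation | ShanghaiTech University |
First author's primary affiliation | ShanghaiTech University |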
Recommended citation (GB/T 7714) | Liang, Han, Zhang, Wenqian, Li, Wenxuan, et al. InterGen: Diffusion-Based Multi-human Motion Generation Under Complex Interactions[J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132(9): 3463-3483. |
APA | Liang, Han, Zhang, Wenqian, Li, Wenxuan, Yu, Jingyi, & Xu, Lan. (2024). InterGen: Diffusion-Based Multi-human Motion Generation Under Complex Interactions. INTERNATIONAL JOURNAL OF COMPUTER VISION, 132(9), 3463-3483. |
MLA | Liang, Han, et al. "InterGen: Diffusion-Based Multi-human Motion Generation Under Complex Interactions". INTERNATIONAL JOURNAL OF COMPUTER VISION 132.9 (2024): 3463-3483. |