ShanghaiTech University Knowledge Management System
S2DM: Sector-Shaped Diffusion Models for Uniform Content Video Generation | |
2025-03-08 | |
Proceedings title | INTERNATIONAL CONFERENCE ON COMPUTER VISION 2025 |
Publication status | Submitted, pending acceptance |
Abstract | Diffusion models have achieved remarkable success in image generation. However, applying this concept to video generation introduces significant challenges, particularly in maintaining consistency and continuity throughout video frames. Existing approaches primarily address these challenges by incorporating spatiotemporal attention modules or additional temporal conditions. However, they often overlook the impact of non-shared noise between frames in the diffusion process, which can disrupt both semantic coherence and consistent stochastic details in the video. To tackle this problem, we introduce the Sector-Shaped Diffusion Model (S2DM), which employs a sector-shaped diffusion process with shared noise across frames under specific conditions. S2DM ensures that video frames maintain consistent semantic features and stochastic details, while preserving continuous temporal characteristics through guided conditions. We evaluate S2DM on various conditional video generation tasks, using optical flow or posture information as temporal conditions, and descriptive text or reference images as semantic conditions. Experimental results demonstrate that S2DM outperforms existing methods in generating videos with thematic coherence and smooth narrative progression. For text-to-video generation, where temporal conditions are not explicitly provided, we propose a three-step generation strategy that decouples the generation of temporal characteristics from semantic features. |
Keywords | Video Generation; Diffusion Model |
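Document type | Conference paper |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/503661 |
Collections | School of Information Science and Technology_Master's Students; School of Creativity and Art_PI Research Groups (P)_Tian Zheng Group |
Corresponding author | Tian Z (田政) |
Author affiliations | 1. ShanghaiTech University 2. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences |
First author affiliation | ShanghaiTech University |
Corresponding author affiliation | ShanghaiTech University |
First affiliation of the first author | ShanghaiTech University |
Recommended citation (GB/T 7714) | Lang HR,Ge YX,Zou SH,et al. S2DM: Sector-Shaped Diffusion Models for Uniform Content Video Generation[C],2025. |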