ShanghaiTech University Knowledge Management System
CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets | |
2024-07-19 | |
Journal | ACM TRANSACTIONS ON GRAPHICS (IF:7.8[JCR-2023],9.5[5-Year]) |
ISSN | 0730-0301 |
EISSN | 1557-7368 |
Volume | 43, Issue: 4 |
Publication status | Published |
DOI | 10.1145/3658146 |
Abstract | In the realm of digital creativity, our potential to craft intricate 3D worlds from imagination is often hampered by the limitations of existing digital tools, which demand extensive expertise and efforts. To narrow this disparity, we introduce CLAY, a 3D geometry and material generator designed to effortlessly transform human imagination into intricate 3D digital structures. CLAY supports classic text or image inputs as well as 3D-aware controls from diverse primitives (multi-view images, voxels, bounding boxes, point clouds, implicit representations, etc). At its core is a large-scale generative model composed of a multi-resolution Variational Autoencoder (VAE) and a minimalistic latent Diffusion Transformer (DiT), to extract rich 3D priors directly from a diverse range of 3D geometries. Specifically, it adopts neural fields to represent continuous and complete surfaces and uses a geometry generative module with pure transformer blocks in latent space. We present a progressive training scheme to train CLAY on an ultra large 3D model dataset obtained through a carefully designed processing pipeline, resulting in a 3D native geometry generator with 1.5 billion parameters. For appearance generation, CLAY sets out to produce physically-based rendering (PBR) textures by employing a multi-view material diffusion model that can generate 2K resolution textures with diffuse, roughness, and metallic modalities. We demonstrate using CLAY for a range of controllable 3D asset creations, from sketchy conceptual designs to production ready assets with intricate details. Even first time users can easily use CLAY to bring their vivid 3D imaginations to life, unleashing unlimited creativity. © 2024 Copyright held by the owner/author(s). |
Keywords | 3D modeling ; Diffusion ; Digital devices ; Geometry ; Interactive computer graphics ; Large datasets ; Rendering (computer graphics) ; Three dimensional computer graphics ; 3d asset generation ; 3D geometry ; Diffusion transformer ; Digital tools ; Generative model ; High quality ; Large-scale modeling ; Large-scales ; Multi modal control ; Physically based rendering |
Indexed by | SCI ; EI |
Language | English |
Funding | National Key R&D Program of China[2022YFF0902301] ; NSFC programs[ |
WOS research area | Computer Science |
WOS category | Computer Science, Software Engineering |
WOS accession number | WOS:001289270900087 |
Publisher | Association for Computing Machinery |
EI accession number | 20243016756662 |
EI main heading | Textures |
EI classification codes | 723.2 Data Processing and Image Processing ; 723.5 Computer Applications ; 921 Mathematics |
Original document type | Journal article (JA) |
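Document type | Journal article |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/407194 |
Collections | School of Information Science and Technology_PhD Students ; School of Information Science and Technology_PI Research Groups_Yu Jingyi Group ; School of Information Science and Technology_Master's Students ; School of Information Science and Technology_PI Research Groups_Xu Lan Group |
Co-first author | Wang, Ziyu |
Corresponding authors | Xu, Lan; Yu, Jingyi |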
Author affiliations | 1.ShanghaiTech University, Shanghai, China and Deemos Technology Co., Ltd., Shanghai, China; 2.ShanghaiTech University and Deemos Technology Co., Ltd., Shanghai, China; 3.ShanghaiTech University, Shanghai, China; 4.Huazhong University of Science and Technology, Wuhan, China |
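First author affiliation | ShanghaiTech University |
Corresponding author affiliation | ShanghaiTech University |
First affiliation of the first author | ShanghaiTech University |
Recommended citation GB/T 7714 | Zhang, Longwen,Wang, Ziyu,Zhang, Qixuan,et al. CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets[J]. ACM TRANSACTIONS ON GRAPHICS,2024,43(4). |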
APA | Zhang, Longwen.,Wang, Ziyu.,Zhang, Qixuan.,Qiu, Qiwei.,Pang, Anqi.,...&Yu, Jingyi.(2024).CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets.ACM TRANSACTIONS ON GRAPHICS,43(4). |
MLA | Zhang, Longwen,et al."CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets".ACM TRANSACTIONS ON GRAPHICS 43.4(2024). |