Fine-Grained Face Sketch-Photo Synthesis with Text-Guided Diffusion Models

doi:10.1007/978-3-031-47637-2_26

	Liu, Jin 1,2; Huang, Huaibo 2; Cao, Jie 2; Duan, Junxian 2; He, Ran 1,2
	2023
Proceedings title	LECTURE NOTES IN COMPUTER SCIENCE (INCLUDING SUBSERIES LECTURE NOTES IN ARTIFICIAL INTELLIGENCE AND LECTURE NOTES IN BIOINFORMATICS)
ISSN	0302-9743
Volume	14407 LNCS
Pages	340-354
Publication status	Published
DOI	10.1007/978-3-031-47637-2_26
Abstract	Face sketch-photo synthesis involves generating face photos from input face sketches. However, existing Generative Adversarial Networks (GANs)-based methods struggle to produce high-quality images due to artifacts and lack of detail caused by training difficulties. Additionally, prior approaches exhibit fixed and monotonous image styles, limiting practical usability. Drawing inspiration from recent successes in Diffusion Probability Models (DPMs) for image generation, we present a novel DPMs-based framework. This framework produces detailed face photos from input sketches while allowing control over facial attributes using textual descriptions. Our framework employs a U-Net, a semantic sketch encoder for extracting information from input sketches, and a text encoder to convert textual descriptions into text features. Furthermore, we incorporate a cross-attention mechanism within the U-Net to integrate text features. Experimental results demonstrate the effectiveness of our model, showcasing its ability to generate high-fidelity face photos while surpassing alternative methods in qualitative and quantitative evaluations. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023.
Keywords	Generative adversarial networks; Image processing; Semantics; Signal encoding; Diffusion model; Face sketch-photo synthesis; Fine grained; High quality images; Images synthesis; Network-based; Probability modelling; Text feature; Text-to-image synthesis; Textual description
Conference name	7th Asian Conference on Pattern Recognition, ACPR 2023
Conference location	Kitakyushu, Japan
Conference date	November 5, 2023 - November 8, 2023
Indexed by	EI
Language	English
Publisher	Springer Science and Business Media Deutschland GmbH
EI accession number	20234715098664
EI main heading	Diffusion
EISSN	1611-3349
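EI classification codes	716.1 Information Theory and Signal Processing ; 723.2 Data Processing and Image Processing ; 723.4 Artificial Intelligence
Original document type	Conference article (CA)
Document type	Conference paper
Item identifier	https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/348725
Collection	School of Information Science and Technology_Master's Students; School of Information Science and Technology
Corresponding author	He, Ran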
Author affiliations	1. School of Information Science and Technology, ShanghaiTech University, Shanghai, China; 2. CRIPAC & MAIS, Institute of Automation, Chinese Academy of Sciences, Beijing, China
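First author affiliation	School of Information Science and Technology
Corresponding author affiliation	School of Information Science and Technology
First affiliation of first author	School of Information Science and Technology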
Recommended citation (GB/T 7714)	Liu, Jin,Huang, Huaibo,Cao, Jie,et al. Fine-Grained Face Sketch-Photo Synthesis with Text-Guided Diffusion Models[C]:Springer Science and Business Media Deutschland GmbH,2023:340-354.