ShanghaiTech University Knowledge Management System
Pushing the Limit of Post-Training Quantization
Year | 2025
Journal | IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (IF: 20.8 [JCR 2023]; 5-Year IF: 22.2)
ISSN | 0162-8828
EISSN | 1939-3539 |
Volume | PP
Issue | 99
Publication Status | Published
DOI | 10.1109/TPAMI.2025.3554523 |
Abstract | Recently, post-training quantization (PTQ) has become the de facto way to produce efficient low-precision neural networks without lengthy retraining. Despite its low cost, current PTQ methods fail under extremely low-bit settings. In this work, we delve into extremely low-bit quantization and construct a unified theoretical analysis that provides an in-depth understanding of why low-bit quantization fails. According to the theoretical study, we argue that existing methods fail in low-bit schemes due to significant perturbation of the weights and a lack of consideration of activation quantization. To this end, we propose Brecq and QDrop to respectively address these two challenges, based on which a Q-Limit framework is constructed. The Q-Limit framework is then further extended to support a mixed-precision quantization scheme. To the best of our knowledge, this is the first work that can push the limit of PTQ down to INT2. Extensive experiments on various handcrafted and searched neural architectures are conducted for both visual recognition/detection tasks and language processing tasks. Without bells and whistles, our PTQ framework can attain low-bit ResNet and MobileNetV2 models comparable with quantization-aware training (QAT), establishing a new state of the art for PTQ. Our code has been open-sourced at https://github.com/ModelTC/MQBench/.
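The abstract attributes the failure of naive low-bit PTQ to the rounding perturbation on weights. As a minimal, hypothetical illustration of that point (not the paper's Brecq/QDrop/Q-Limit implementation, which is released in MQBench at the URL above), the sketch below applies plain symmetric uniform quantization to a random weight tensor and shows how the perturbation grows as the bit-width drops toward INT2.

```python
# Minimal illustrative sketch of uniform post-training quantization (PTQ).
# This is NOT the paper's Brecq/QDrop/Q-Limit code; it only shows why the
# rounding perturbation on weights grows as the bit-width shrinks.
import torch

def uniform_quantize(w: torch.Tensor, n_bits: int) -> torch.Tensor:
    """Symmetric uniform fake-quantization of a weight tensor to n_bits."""
    qmax = 2 ** (n_bits - 1) - 1                 # e.g. 127 for INT8, 1 for INT2
    scale = w.abs().max() / qmax                 # simplest per-tensor scale
    w_int = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return w_int * scale                         # de-quantized ("fake-quant") weights

w = torch.randn(1000)
for bits in (8, 4, 2):
    w_q = uniform_quantize(w, bits)
    # The mean-squared perturbation ||w_q - w||^2 rises sharply at low bit-widths,
    # which is the failure mode the abstract identifies for naive low-bit PTQ.
    print(bits, torch.mean((w_q - w) ** 2).item())
```

In the paper, Brecq and QDrop are proposed precisely to compensate for this weight perturbation and to account for activation quantization, respectively.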
Keywords | Block reconstruction; Deep learning; Flatness; Low costs; Lower precision; Model compression; Neural networks; Post-training quantization; Quantisation
URL | View Original
Indexed By | EI
Language | English
Publisher | IEEE Computer Society
EI Accession Number | 20251318153413
EI Subject Terms | Deep reinforcement learning
EI Classification Code | 1101.2 Machine Learning - 1101.2.1 Deep Learning
Original Document Type | Article in Press
Source Database | IEEE
Document Type | Journal article
Item Identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/503722
Collection | School of Information Science and Technology_Master's Students
Author Affiliations | 1. State Key Laboratory of Complex & Critical Software Environment, Beihang University, Beijing, China; 2. Yale University, New Haven, CT, USA; 3. ShanghaiTech University, Shanghai, China; 4. State Key Laboratory of Complex & Critical Software Environment, Institute of Artificial Intelligence, Beihang University, Beijing, China
Recommended Citation (GB/T 7714) | Ruihao Gong, Xianglong Liu, Yuhang Li, et al. Pushing the Limit of Post-Training Quantization[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, PP(99).
APA | Ruihao Gong, Xianglong Liu, Yuhang Li, Yunqiang Fan, Xiuying Wei, & Jinyang Guo. (2025). Pushing the Limit of Post-Training Quantization. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, PP(99).
MLA | Ruihao Gong, et al. "Pushing the Limit of Post-Training Quantization". IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE PP.99 (2025).