ShanghaiTech University Knowledge Management System
Pushing the Limit of Post-Training Quantization
Year | 2025
Journal | IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (IF: 20.8 [JCR 2023]; 5-Year IF: 22.2)
ISSN | 0162-8828
EISSN | 1939-3539 |
Volume | PP
Issue | 99
Publication Status | Published
DOI | 10.1109/TPAMI.2025.3554523 |
Abstract | Recently, post-training quantization (PTQ) has become the de facto way to produce efficient low-precision neural networks without lengthy retraining. Despite its low cost, current PTQ methods fail under extremely low-bit settings. In this work, we delve into extremely low-bit quantization and construct a unified theoretical analysis that provides an in-depth understanding of why low-bit quantization fails. According to the theoretical study, we argue that existing methods fail in low-bit schemes due to significant perturbation of the weights and a lack of consideration of activation quantization. To this end, we propose Brecq and QDrop to respectively address these two challenges, based on which a Q-Limit framework is constructed. The Q-Limit framework is then further extended to support a mixed-precision quantization scheme. To the best of our knowledge, this is the first work that can push the limit of PTQ down to INT2. Extensive experiments on various handcrafted and searched neural architectures are conducted for both visual recognition/detection tasks and language processing tasks. Without bells and whistles, our PTQ framework can attain low-bit ResNet and MobileNetV2 models comparable with quantization-aware training (QAT), establishing a new state of the art for PTQ. Our code has been open-sourced at https://github.com/ModelTC/MQBench/.
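The abstract attributes the failure of naive low-bit PTQ to the rounding perturbation on weights. As a minimal, hypothetical illustration of that point (not the paper's Brecq/QDrop/Q-Limit implementation, which is released in MQBench at the URL above), the sketch below applies plain symmetric uniform quantization to a random weight tensor and shows how the perturbation grows as the bit-width drops toward INT2.

```python
# Minimal illustrative sketch of uniform post-training quantization (PTQ).
# This is NOT the paper's Brecq/QDrop/Q-Limit code; it only shows why the
# rounding perturbation on weights grows as the bit-width shrinks.
import torch

def uniform_quantize(w: torch.Tensor, n_bits: int) -> torch.Tensor:
    """Symmetric uniform fake-quantization of a weight tensor to n_bits."""
    qmax = 2 ** (n_bits - 1) - 1                 # e.g. 127 for INT8, 1 for INT2
    scale = w.abs().max() / qmax                 # simplest per-tensor scale
    w_int = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return w_int * scale                         # de-quantized ("fake-quant") weights

w = torch.randn(1000)
for bits in (8, 4, 2):
    w_q = uniform_quantize(w, bits)
    # The mean-squared perturbation ||w_q - w||^2 rises sharply at low bit-widths,
    # which is the failure mode the abstract identifies for naive low-bit PTQ.
    print(bits, torch.mean((w_q - w) ** 2).item())
```

In the paper, Brecq and QDrop are proposed precisely to compensate for this weight perturbation and to account for activation quantization, respectively.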
Keywords | Block reconstruction; Deep learning; Flatness; Low costs; Lower precision; Model compression; Neural networks; Post-training quantization; Quantisation
URL | View Original
Indexed By | EI
Language | English
Publisher | IEEE Computer Society
EI Accession Number | 20251318153413
EI Subject Terms | Deep reinforcement learning
EI Classification Code | 1101.2 Machine Learning - 1101.2.1 Deep Learning
Original Document Type | Article in Press
Source Database | IEEE
Document Type | Journal article
Item Identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/503722
Collection | School of Information Science and Technology_Master's Students
Author Affiliations | 1. State Key Laboratory of Complex & Critical Software Environment, Beihang University, Beijing, China; 2. Yale University, New Haven, CT, USA; 3. ShanghaiTech University, Shanghai, China; 4. State Key Laboratory of Complex & Critical Software Environment, Institute of Artificial Intelligence, Beihang University, Beijing, China
Recommended Citation (GB/T 7714) | Ruihao Gong, Xianglong Liu, Yuhang Li, et al. Pushing the Limit of Post-Training Quantization[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, PP(99).
APA | Ruihao Gong, Xianglong Liu, Yuhang Li, Yunqiang Fan, Xiuying Wei, & Jinyang Guo. (2025). Pushing the Limit of Post-Training Quantization. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, PP(99).
MLA | Ruihao Gong, et al. "Pushing the Limit of Post-Training Quantization". IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE PP.99 (2025).