ShanghaiTech University Knowledge Management System
A High-Throughput Full-Dataflow MobileNetv2 Accelerator on Edge FPGA | |
2022 | |
Journal | IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS (IF:2.7[JCR-2023],2.9[5-Year]) |
ISSN | 0278-0070 |
EISSN | 1937-4151 |
Volume | 42, Issue: 5, Pages: 1-1 |
Publication status | Published |
DOI | 10.1109/TCAD.2022.3198246 |
Abstract | FPGA accelerators for lightweight neural networks such as MobileNetv2 are in great demand in edge computing applications with high throughput requirements. Dataflow architecture is considered a promising approach to optimizing throughput because it can eliminate many intermediate feature map transfers. However, previous MobileNetv2 accelerators achieved only a partial-dataflow architecture, saving just one-third of the feature map transfers. To solve this issue, we propose a scheme to achieve a full-dataflow MobileNetv2 accelerator on FPGA. The scheme contains four techniques. First, we improve full-integer quantization for easier deployment on hardware. Second, we propose tunable activation weight imbalance transfer to reduce quantization accuracy loss. Third, we present several highly optimized accelerator components whose parallelism can be flexibly adjusted, and implement the residual connection with a deeper FIFO so that the requirements of the full-dataflow architecture are fully met. Finally, we present a computing resource allocation strategy to balance the latency of each layer, and a memory resource allocation strategy to use the on-chip memory effectively. Experimental results show that, when implemented on the Xilinx ZCU102 FPGA, the accelerator achieves 1910 FPS, a 1.8× speedup over the state of the art. In addition, it reaches 72.98% Top-1 accuracy with 8-bit integer quantization, which outperforms all the other MobileNetv2 accelerators. |
Keywords | Acceleration; Field programmable gate arrays (FPGA); Memory architecture; Network architecture; Parallel architectures; Resource allocation; Data-flow architectures; Dataflow; Feature map; Field programmable gate array; Field programmables; High-throughput; Parallel processing; Programmable gate array; Quantization (signal); Resource management |
Indexed by | EI |
Language | English |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
EI accession number | 20223512670805 |
EI subject terms | Quantization (signal) |
EI classification codes | 713.3 Modulators, Demodulators, Limiters, Discriminators, Mixers ; 721.2 Logic Elements ; 722 Computer Systems and Equipment ; 912.2 Management |
Original document type | Article in Press |
Source database | IEEE |
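Document type | Journal article |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/226410 |
Collection | School of Information Science and Technology; School of Information Science and Technology_PI Research Groups_Yajun Ha Group; School of Information Science and Technology_PhD Students |
Author affiliations | 1. School of Information Science and Technology, ShanghaiTech University, Shanghai, China 2. School of Computer Science, University of Nottingham Ningbo China, Ningbo, China 3. School of Information Science and Technology and the Shanghai Engineering Research Center of Energy Efficient and Custom AI IC, ShanghaiTech University, Shanghai, China |
First author affiliation | School of Information Science and Technology |
First affiliation of the first author | School of Information Science and Technology |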
Recommended citation (GB/T 7714) | Weixiong Jiang, Heng Yu, Yajun Ha. A High-Throughput Full-Dataflow MobileNetv2 Accelerator on Edge FPGA[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 42(5): 1-1. |
APA | Weixiong Jiang, Heng Yu, & Yajun Ha. (2022). A High-Throughput Full-Dataflow MobileNetv2 Accelerator on Edge FPGA. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 42(5), 1-1. |
MLA | Weixiong Jiang, et al. "A High-Throughput Full-Dataflow MobileNetv2 Accelerator on Edge FPGA". IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 42.5 (2022): 1-1. |
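
The abstract mentions a computing resource allocation strategy that balances the latency of each layer. The record does not describe that strategy, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a dataflow pipeline whose throughput is set by its slowest layer and a simple greedy rule that hands each spare compute unit to the current bottleneck. The names `balance_parallelism` and `pipeline_interval`, the per-layer operation counts, and the unit budget are all hypothetical.

```python
import math


def balance_parallelism(layer_ops, budget):
    """Greedy sketch: give each spare compute unit to the slowest layer.

    layer_ops: hypothetical operation count per layer.
    budget:    hypothetical total number of parallel compute units.
    Returns the number of units assigned to each layer.
    """
    if budget < len(layer_ops):
        raise ValueError("budget must allow at least one unit per layer")
    alloc = [1] * len(layer_ops)  # every layer needs at least one unit
    for _ in range(budget - len(layer_ops)):
        # Latency of a layer is modeled as ceil(ops / units), in cycles.
        latencies = [math.ceil(ops / p) for ops, p in zip(layer_ops, alloc)]
        slowest = max(range(len(layer_ops)), key=lambda i: latencies[i])
        alloc[slowest] += 1
    return alloc


def pipeline_interval(layer_ops, alloc):
    """Cycles per frame of the pipeline, set by its slowest layer."""
    return max(math.ceil(ops / p) for ops, p in zip(layer_ops, alloc))


if __name__ == "__main__":
    ops = [800, 400, 200, 100]          # made-up workloads, not from the paper
    alloc = balance_parallelism(ops, budget=15)
    print(alloc, pipeline_interval(ops, alloc))
```

In this toy model the allocation ends up proportional to each layer's workload, so all layers finish in about the same number of cycles and no unit sits idle waiting on a bottleneck. A real FPGA design would also have to respect DSP, LUT, and on-chip memory limits, which this sketch ignores.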