ShanghaiTech University Knowledge Management System
Improving training of deep neural networks via Singular Value Bounding | |
2017 | |
Proceedings title | 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) |
Volume | 2017-January |
Pages | 3994-4002 |
Publication status | Published |
DOI | 10.1109/CVPR.2017.425 |
Abstract | Deep learning methods have recently achieved great success on many computer vision problems. In spite of these practical successes, optimization of deep networks remains an active topic in deep learning research. In this work, we focus on investigating the network solution properties that can potentially lead to good performance. Our research is inspired by theoretical and empirical results that use orthogonal matrices to initialize networks, but we are interested in investigating how orthogonal weight matrices perform when network training converges. To this end, we propose to constrain the solutions of weight matrices in the orthogonal feasible set during the whole process of network training, and achieve this by a simple yet effective method called Singular Value Bounding (SVB). In SVB, all singular values of each weight matrix are simply bounded in a narrow band around the value of 1. Based on the same motivation, we also propose Bounded Batch Normalization (BBN), which improves Batch Normalization by removing its potential risk of ill-conditioned layer transform. We present both theoretical and empirical results to justify our proposed methods. Experiments on benchmark image classification datasets show the efficacy of our proposed SVB and BBN. In particular, we achieve state-of-the-art results of 3.06% error rate on CIFAR10 and 16.90% on CIFAR100, using off-the-shelf network architectures (Wide ResNets). Our preliminary results on ImageNet also show promise in large-scale learning. We release the implementation code of our methods at www.aperture-lab.net/research/svb. |
Place of publication | 345 E 47TH ST, NEW YORK, NY 10017 USA |
Conference location | Honolulu, HI, United States |
Indexed by | CPCI ; EI |
Language | English |
Funding | Australian Research Council[FT-130101457] ; Australian Research Council[DP-140102164] ; Australian Research Council[LP-150100671] |
WOS research area | Computer Science ; Engineering |
WOS category | Computer Science, Artificial Intelligence ; Computer Science, Theory & Methods ; Engineering, Electrical & Electronic |
WOS accession number | WOS:000418371404009 |
Publisher | IEEE |
EI accession number | 20181304947356 |
EI subject terms | Bayesian networks ; Classification (of information) ; Computer vision ; Deep neural networks ; Matrix algebra ; Network architecture ; Pattern recognition |
EI classification codes | Information Theory and Signal Processing:716.1 ; Computer Applications:723.5 ; Algebra:921.1 ; Combinatorial Mathematics, Includes Graph Theory, Set Theory:921.4 |
Original document type | Proceedings Paper |
Document type | Conference paper |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/16318 |
Collection | School of Information Science and Technology_PI Research Groups_Gao Shenghua Group |
Corresponding author | Jia, Kui |
Author affiliations | 1. South China Univ Technol, Sch Elect & Informat Engn, Guangzhou, Guangdong, Peoples R China; 2. Univ Sydney, UBTech Sydney AI Inst, SIT, FEIT, Sydney, NSW, Australia; 3. ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China |
Recommended citation (GB/T 7714) | Jia, Kui,Tao, Dacheng,Gao, Shenghua,et al. Improving training of deep neural networks via Singular Value Bounding[C]. 345 E 47TH ST, NEW YORK, NY 10017 USA:IEEE,2017:3994-4002. |
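
The abstract describes Singular Value Bounding only at a high level: the singular values of each weight matrix are kept in a narrow band around 1. The sketch below illustrates that single step with NumPy and is not the authors' released implementation. The helper name `bound_singular_values`, the band form `[1/(1+eps), 1+eps]`, and the value `eps=0.05` are assumptions made for illustration; how often the step is applied during training and how convolutional kernels are reshaped into matrices are not stated in this record and are not shown.

```python
import numpy as np


def bound_singular_values(weight, eps=0.05):
    """Clip the singular values of a 2-D weight matrix into a band around 1.

    The band [1/(1+eps), 1+eps] is an assumed form of the "narrow band"
    mentioned in the abstract; eps is a hypothetical hyperparameter.
    """
    # Thin SVD: weight = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Bound every singular value from below and above.
    s_bounded = np.clip(s, 1.0 / (1.0 + eps), 1.0 + eps)
    # Rebuild the matrix with the bounded spectrum (u * s scales columns of u).
    return (u * s_bounded) @ vt


# Example: a random 64x32 matrix stands in for a layer's weights.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 32))
w_bounded = bound_singular_values(w, eps=0.05)
s_after = np.linalg.svd(w_bounded, compute_uv=False)
```

After the call, every entry of `s_after` lies inside the assumed band, so the bounded matrix is close to having orthonormal columns, which is the property the abstract motivates.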