ShanghaiTech University Knowledge Management System
DEFEATnet-A Deep Conventional Image Representation for Image Classification | |
2016-03 | |
Journal | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (IF:8.3[JCR-2023],7.1[5-Year]) |
ISSN | 1051-8215 |
Volume | 26, Issue: 3, Pages: 494-505 |
Publication status | Published |
DOI | 10.1109/TCSVT.2015.2389413 |
Abstract | To study underlying possibilities for the successes of conventional image representation and deep neural networks (DNNs) in image representation, we propose a deep feature extraction, encoding, and pooling network (DEFEATnet) architecture, which is a marriage between conventional image representation approaches and DNNs. In particular, in DEFEATnet, each layer consists of three components: feature extraction, feature encoding, and pooling. The primary advantage of DEFEATnet is twofold. First, it consolidates the prior knowledge (e.g., translation invariance) from extracting, encoding, and pooling handcrafted features, as in the conventional feature representation approaches. Second, it represents the object parts at different granularities by gradually increasing the local receptive fields in different layers, as in DNNs. Moreover, DEFEATnet is a generalized framework that can readily incorporate all types of local features as well as all kinds of well-designed feature encoding and pooling methods. Since prior knowledge is preserved in DEFEATnet, it is especially useful for image representation on small- and medium-sized data sets, where DNNs usually fail due to the lack of sufficient training data. Promising experimental results clearly show that DEFEATnets outperform shallow conventional image representation approaches by a large margin when the same type of features, feature encoding, and pooling are used. The extensive experiments also demonstrate the effectiveness of the deep architecture of our DEFEATnet in improving the robustness of image representation. |
Keywords | Conventional image representation ; deep architecture ; feature encoding ; local max pooling |
URL | View original |
Indexed by | SCI ; EI |
Language | English |
Funding | National Science Foundation of China[61502304] |
WOS research area | Engineering |
WOS category | Engineering, Electrical & Electronic |
WOS accession number | WOS:000372547400006 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
EI accession number | 20161702307652 |
EI main headings | Encoding (symbols) ; Extraction ; Feature extraction ; Network architecture |
EI classification codes | Data Processing and Image Processing:723.2 ; Chemical Operations:802.3 |
WOS keywords | ALGORITHM ; FEATURES |
Original document type | Article |
Source database | IEEE |
Document type | Journal article |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/1911 |
Collection | School of Information Science and Technology_PI Research Groups_Shenghua Gao Group |
Author affiliations | 1.ShanghaiTech University, Shanghai, China 2.Amazon, Seattle, WA, USA 3.University of Technology, Sydney, NSW, Australia |
First author affiliation | ShanghaiTech University |
First author's primary affiliation | ShanghaiTech University |
Recommended citation (GB/T 7714) | Shenghua Gao,Lixin Duan,Ivor W. Tsang. DEFEATnet-A Deep Conventional Image Representation for Image Classification[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY,2016,26(3):494-505. |
APA | Shenghua Gao,Lixin Duan,&Ivor W. Tsang.(2016).DEFEATnet-A Deep Conventional Image Representation for Image Classification.IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY,26(3),494-505. |
MLA | Shenghua Gao,et al."DEFEATnet-A Deep Conventional Image Representation for Image Classification".IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 26.3(2016):494-505. |
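
The abstract describes each DEFEATnet layer as feature extraction, then feature encoding, then pooling, with receptive fields growing from layer to layer. The following is a minimal illustrative sketch of that layer structure only, not the authors' implementation. This record does not state which local features, encoder, or codebooks the paper uses, so everything concrete here is an assumption: raw neighbourhood concatenation stands in for the handcrafted feature, a rectified dot product against a fixed codebook stands in for the encoder, and the names `extract`, `encode`, `local_max_pool`, `defeat_layer`, `codebook1`, and `codebook2` are hypothetical.

```python
from typing import List

Vector = List[float]
Grid = List[List[Vector]]  # grid[row][col] holds one feature vector


def extract(grid: Grid, size: int) -> Grid:
    """Feature extraction stand-in: concatenate every size x size
    neighbourhood (stride 1) into a single local descriptor."""
    rows, cols = len(grid), len(grid[0])
    out: Grid = []
    for r in range(rows - size + 1):
        out_row = []
        for c in range(cols - size + 1):
            desc: Vector = []
            for dr in range(size):
                for dc in range(size):
                    desc.extend(grid[r + dr][c + dc])
            out_row.append(desc)
        out.append(out_row)
    return out


def encode(grid: Grid, codebook: List[Vector]) -> Grid:
    """Feature encoding stand-in (assumption): rectified dot product of
    each descriptor with each codeword, giving one code per codeword."""
    return [
        [[max(0.0, sum(a * b for a, b in zip(v, w))) for w in codebook]
         for v in row]
        for row in grid
    ]


def local_max_pool(grid: Grid, size: int) -> Grid:
    """Local max pooling: element-wise maximum over non-overlapping
    size x size blocks of code vectors."""
    rows, cols = len(grid), len(grid[0])
    out: Grid = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            cells = [grid[r + dr][c + dc]
                     for dr in range(size) for dc in range(size)]
            out_row.append([max(vals) for vals in zip(*cells)])
        out.append(out_row)
    return out


def defeat_layer(grid: Grid, codebook: List[Vector],
                 patch: int = 2, pool: int = 2) -> Grid:
    """One layer: extraction, then encoding, then pooling."""
    return local_max_pool(encode(extract(grid, patch), codebook), pool)


# Toy 9 x 9 "image" with one intensity value per pixel.
image: Grid = [[[float((r * c) % 5)] for c in range(9)] for r in range(9)]

# Hypothetical fixed codebooks; a real system would learn or design these.
codebook1 = [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, -1.0, 0.0], [0.25] * 4]
codebook2 = [[1.0 / 12] * 12, [(-1) ** i / 12 for i in range(12)]]

h1 = defeat_layer(image, codebook1)  # 4 x 4 grid of 3-dim codes
h2 = defeat_layer(h1, codebook2)     # 1 x 1 grid of 2-dim codes
```

Stacking the same three-step layer on its own output is what makes each cell of `h2` summarize a larger image region than a cell of `h1`. Any other local descriptor, encoder, or pooling rule could be swapped into the corresponding function without changing `defeat_layer`.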