AS-MLP: AN AXIAL SHIFTED MLP ARCHITECTURE FOR VISION
2022
Proceedings title: ICLR 2022 - 10TH INTERNATIONAL CONFERENCE ON LEARNING REPRESENTATIONS
Publication status: Published
Abstract

An Axial Shifted MLP architecture (AS-MLP) is proposed in this paper. Different from MLP-Mixer, where the global spatial feature is encoded for information flow through matrix transposition and one token-mixing MLP, we pay more attention to local feature interactions. By axially shifting channels of the feature map, AS-MLP is able to obtain the information flow from different axial directions, which captures the local dependencies. Such an operation enables us to utilize a pure MLP architecture to achieve the same local receptive field as a CNN-like architecture. We can also design the receptive field size and dilation of blocks of AS-MLP, etc., in the same spirit as convolutional neural networks. With the proposed AS-MLP architecture, our model obtains 83.3% Top-1 accuracy with 88M parameters and 15.2 GFLOPs on the ImageNet-1K dataset. Such a simple yet effective architecture outperforms all MLP-based architectures and achieves competitive performance compared to the transformer-based architectures (e.g., Swin Transformer) even with slightly lower FLOPs. In addition, AS-MLP is also the first MLP-based architecture to be applied to the downstream tasks (e.g., object detection and semantic segmentation). The experimental results are also impressive. Our proposed AS-MLP obtains 51.5 mAP on the COCO validation set and 49.5 MS mIoU on the ADE20K dataset, which is competitive compared to the transformer-based architectures. Our AS-MLP establishes a strong baseline for MLP-based architectures. Code is available at https://github.com/svip-lab/AS-MLP. © 2022 ICLR 2022 - 10th International Conference on Learning Representations. All rights reserved.
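
The axial shift described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation (see the repository linked above for that); the function names `shift_2d` and `axial_shift`, the `shift_size` parameter, the zero padding at the borders, and the way channels are split into groups are all assumptions made here for illustration. The idea shown is that, after different channel groups are displaced by different offsets along one axis, a per-position channel-mixing step sees values that originally sat at neighbouring positions.

```python
def shift_2d(grid, offset, axis):
    """Shift an H x W grid (list of rows) by `offset` cells along `axis`.

    axis=0 shifts vertically, axis=1 horizontally. Cells that would be
    read from outside the grid are filled with 0.0 (assumed zero padding).
    """
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            src_i = i - offset if axis == 0 else i
            src_j = j - offset if axis == 1 else j
            if 0 <= src_i < h and 0 <= src_j < w:
                out[i][j] = grid[src_i][src_j]
    return out


def axial_shift(feature_map, shift_size, axis):
    """Shift channel groups of a C x H x W feature map along one axis.

    Channels are split into `shift_size` contiguous groups (an assumption);
    group g is shifted by g - shift_size // 2, so shift_size=3 gives the
    offsets -1, 0, +1.
    """
    channels = len(feature_map)
    group_len = -(-channels // shift_size)  # ceiling division
    shifted = []
    for k, channel in enumerate(feature_map):
        offset = k // group_len - shift_size // 2
        shifted.append(shift_2d(channel, offset, axis))
    return shifted


# Three identical channels, each a single row [1, 2, 3].
feature_map = [[[1.0, 2.0, 3.0]] for _ in range(3)]
shifted = axial_shift(feature_map, shift_size=3, axis=1)
# shifted[0] == [[2.0, 3.0, 0.0]]  (offset -1)
# shifted[1] == [[1.0, 2.0, 3.0]]  (offset  0)
# shifted[2] == [[0.0, 1.0, 2.0]]  (offset +1)

# At the centre position, the channel vector now holds 3, 2 and 1: values
# that came from the right neighbour, the centre, and the left neighbour.
centre = [channel[0][1] for channel in shifted]
```

Applying the same function with `axis=0` gives the vertical counterpart. In this sketch, a larger `shift_size` widens the range of offsets, which is one way to read the abstract's remark that the receptive field size can be designed as in convolutional networks.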

Proceedings editors / conference sponsors: ByteDance; et al.; Meta AI; Microsoft; Qualcomm; Sea AI Lab
Keywords: Architecture; Convolutional neural networks; Network architecture; Object detection; Semantic Segmentation; Axial direction; Feature interactions; Feature map; Flowthrough; Information flows; Local feature; Matrix transposition; Receptive field sizes; Receptive fields; Spatial features
Conference name: 10th International Conference on Learning Representations, ICLR 2022
Conference location: Virtual, Online
Conference dates: April 25, 2022 - April 29, 2022
Indexed by: EI
Language: English
Publisher: International Conference on Learning Representations, ICLR
EI accession number: 20231213775870
EI main heading: Semantics
EI classification codes: 402 Buildings and Towers - 723.2 Data Processing and Image Processing - 723.4 Artificial Intelligence
Original document type: Conference article (CA)
Document type: Conference paper
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/294852
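Collections: School of Information Science and Technology_PhD Students
School of Information Science and Technology_PI Research Groups_Shenghua Gao Group
School of Information Science and Technology_Master's Students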
Author affiliations
1.ShanghaiTech University, China;
2.Youtu Lab, Tencent;
3.Shanghai Engineering Research Center of Intelligent Vision and Imaging, China;
4.Shanghai Engineering Research Center of Energy Efficient and Custom AI IC, China
First author's affiliation: ShanghaiTech University
First author's primary affiliation: ShanghaiTech University
Recommended citation
GB/T 7714
Lian, Dongze, Yu, Zehao, Sun, Xing, et al. AS-MLP: AN AXIAL SHIFTED MLP ARCHITECTURE FOR VISION[C]//ByteDance, et al., Meta AI, Microsoft, Qualcomm, Sea AI Lab: International Conference on Learning Representations, ICLR, 2022.