STAC: Spatial-Temporal Attention on Compensation Information for Activity Recognition in FPV
Publication Date: 2021-02
Journal: SENSORS (IF: 3.4 [JCR 2023]; 3.7 [5-Year])
ISSN: 1424-8220
EISSN: 1424-8220
Volume: 21; Issue: 4; Pages: 1-19
DOI: 10.3390/s21041106
Abstract: Egocentric activity recognition in first-person video (FPV) requires fine-grained matching of the camera wearer's actions and the objects being operated. Traditional methods for third-person action recognition do not suffice because of (1) the background ego-noise introduced by the unstructured movement of the wearable device as the body moves, and (2) the small, fine-grained, single-scale objects in FPV. Size compensation is performed to augment the data: it generates a multi-scale set of regions containing objects of different sizes, leading to superior performance. The optical flow is compensated to eliminate camera motion noise. We developed a novel two-stream convolutional neural network-recurrent attention neural network (CNN-RAN) architecture, spatial-temporal attention on compensation information (STAC), which generates generic descriptors under weak supervision and focuses on the locations of activated objects and the capture of effective motion. The RGB features are encoded with a spatial location-aware attention mechanism that guides the representation of visual features. A similar location-aware channel attention is applied to the temporal stream, in the form of stacked optical flow, to implicitly select the relevant frames and attend to where the action occurs. The two streams are complementary: one is object-centric and the other focuses on motion. Extensive ablation analysis validates the complementarity and effectiveness of the STAC model qualitatively and quantitatively, and it achieves state-of-the-art performance on two egocentric datasets. © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
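The two-stream design described in the abstract (spatial attention over RGB appearance features, channel attention over stacked optical flow, late fusion of the two descriptors) can be illustrated with a minimal PyTorch sketch. This is a hypothetical simplification for orientation only, not the authors' STAC implementation: the backbone features, layer sizes, class count, and fusion scheme are all assumptions.

# Minimal sketch of a two-stream attention model (hypothetical, not the authors' STAC code).
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Weights each spatial location of a CNN feature map (object-centric RGB stream)."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # one attention score per location

    def forward(self, feat):                                   # feat: (B, C, H, W)
        attn = torch.softmax(self.score(feat).flatten(2), dim=-1)   # (B, 1, H*W)
        attn = attn.view(feat.size(0), 1, *feat.shape[2:])          # (B, 1, H, W)
        return (feat * attn).sum(dim=(2, 3))                  # attended descriptor: (B, C)

class ChannelAttention(nn.Module):
    """Weights channels of stacked optical flow to emphasize informative frames (motion stream)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, feat):                                   # feat: (B, C, H, W)
        w = self.gate(feat).unsqueeze(-1).unsqueeze(-1)        # per-channel gate: (B, C, 1, 1)
        return (feat * w).mean(dim=(2, 3))                     # attended descriptor: (B, C)

class TwoStreamAttentionNet(nn.Module):
    """Late fusion of the appearance (RGB) and motion (optical-flow) streams."""
    def __init__(self, rgb_channels=256, flow_channels=256, num_classes=20):
        super().__init__()
        self.rgb_attn = SpatialAttention(rgb_channels)
        self.flow_attn = ChannelAttention(flow_channels)
        self.classifier = nn.Linear(rgb_channels + flow_channels, num_classes)

    def forward(self, rgb_feat, flow_feat):
        fused = torch.cat([self.rgb_attn(rgb_feat), self.flow_attn(flow_feat)], dim=1)
        return self.classifier(fused)

# Example with dummy backbone features for an RGB frame and a stacked-flow clip.
rgb_feat = torch.randn(2, 256, 7, 7)
flow_feat = torch.randn(2, 256, 7, 7)
logits = TwoStreamAttentionNet()(rgb_feat, flow_feat)          # shape: (2, 20)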
Keywords: Cameras; Convolutional neural networks; Location; Optical flows; Pattern recognition; Radio access networks; Action recognition; Activity recognition; Attention mechanisms; Fine grained matching; Spatial location; Spatial temporals; State of the art performance; Wearable devices; egocentric video analysis; location-aware attention; compensation information; fine-grained activity recognition
Indexed By: EI; SCIE; SCI
Language: English
WOS Research Areas: Chemistry; Engineering; Instruments & Instrumentation
WOS Categories: Chemistry, Analytical; Engineering, Electrical & Electronic; Instruments & Instrumentation
WOS Accession Number: WOS:000624646800001
Publisher: MDPI AG
EI Accession Number: 20210609895182
EI Main Heading: Recurrent neural networks
EI Classification Codes: 741.1 Light/Optics; 742.2 Photographic Equipment
Original Document Type: Journal article (JA)
Document Type: Journal article
Record Identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/133286
Collection: School of Information Science and Technology_PhD Students
Corresponding Author: Sun, Shengli
Author Affiliations:
1. Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China;
2. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China;
3. Key Laboratory of Intelligent Infrared Perception, Chinese Academy of Sciences, Shanghai 200083, China;
4. School of Information Science and Technology, ShanghaiTech University, Shanghai 200083, China
Recommended Citation:
GB/T 7714: Zhang, Yue, Sun, Shengli, Lei, Linjian, et al. STAC: Spatial-temporal attention on compensation information for activity recognition in FPV[J]. SENSORS, 2021, 21(4): 1-19.
APA: Zhang, Yue, Sun, Shengli, Lei, Linjian, Liu, Huikai, & Xie, Hui. (2021). STAC: Spatial-temporal attention on compensation information for activity recognition in FPV. SENSORS, 21(4), 1-19.
MLA: Zhang, Yue, et al. "STAC: Spatial-temporal attention on compensation information for activity recognition in FPV". SENSORS 21.4 (2021): 1-19.
Files in This Item:
File Name: 10.3390@s21041106.pdf
Format: Adobe PDF