Locality-constrained spatial transformer network for video crowd counting
Fang, Yanyan1; Zhan, Biyun1; Cai, Wandi1; Gao, Shenghua2; Hu, Bo1
2019
Proceedings title: 2019 IEEE International Conference on Multimedia and Expo, ICME 2019
ISSN: 1945788X
Volume: 2019-July
Pages: 814-819
Publication status: Published
DOI: 10.1109/ICME.2019.00145
Abstract: Compared with single-image crowd counting, video provides spatial-temporal information about the crowd that can help improve the robustness of crowd counting. However, translation, rotation, and scaling of people change the head density map between neighbouring frames. Meanwhile, people walking in or out of the scene or being occluded in dynamic scenes change the head count. To alleviate these issues in video crowd counting, a Locality-constrained Spatial Transformer Network (LSTN) is proposed. Specifically, we first use a convolutional neural network to estimate the density map for each frame. Then, to relate the density maps of neighbouring frames, a Locality-constrained Spatial Transformer (LST) module is introduced to estimate the density map of the next frame from that of the current frame. To facilitate performance evaluation, a large-scale video crowd counting dataset is collected, which contains 15K frames with about 394K annotated heads captured from 13 different scenes. As far as we know, it is the largest video crowd counting dataset. Extensive experiments on our dataset and other crowd counting datasets validate the effectiveness of our LSTN for crowd counting. Our dataset is released at https://github.com/sweetyy83/Lstn-fdst-dataset.
© 2019 IEEE.
Conference location: Shanghai, China
Indexed by: EI; CPCI-S; CPCI
WOS research areas: Computer Science; Engineering
WOS categories: Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic
WOS accession number: WOS:000501820600137
Publisher: IEEE Computer Society
EI accession number: 20193407349276
EI main headings: Convolution; Image enhancement; Neural networks
EI classification code: Information Theory and Signal Processing: 716.1
Original document type: Conference article (CA)
Document type: Conference paper
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/89386
Collection: School of Information Science and Technology_PI Research Groups_Gao Shenghua Group
Corresponding author: Hu, Bo
Author affiliations: 1. School of Information and Technology, Fudan University, China
2. ShanghaiTech University, China
Recommended citation
GB/T 7714
Fang, Yanyan, Zhan, Biyun, Cai, Wandi, et al. Locality-constrained spatial transformer network for video crowd counting[C]: IEEE Computer Society, 2019: 814-819.