ShanghaiTech University Knowledge Management System
Title | Locality-constrained spatial transformer network for video crowd counting |
Authors | Fang, Yanyan (1); Zhan, Biyun (1); Cai, Wandi (1); Gao, Shenghua (2); Hu, Bo (1) |
Year | 2019 |
Proceedings title | 2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME 2019 |
ISSN | 1945-788X |
Volume | 2019-July |
Pages | 814-819 |
Publication status | Published |
DOI | 10.1109/ICME.2019.00145 |
Abstract | Compared with single-image crowd counting, video provides spatial-temporal information about the crowd that can help improve the robustness of crowd counting. However, translation, rotation, and scaling of people change the density map of heads between neighbouring frames. Meanwhile, people walking in or out of the scene or being occluded in dynamic scenes change the head count. To alleviate these issues in video crowd counting, a Locality-constrained Spatial Transformer Network (LSTN) is proposed. Specifically, we first leverage a Convolutional Neural Network to estimate the density map for each frame. Then, to relate the density maps between neighbouring frames, a Locality-constrained Spatial Transformer (LST) module is introduced to estimate the density map of the next frame from that of the current frame. To facilitate performance evaluation, a large-scale video crowd counting dataset is collected, which contains 15K frames with about 394K annotated heads captured from 13 different scenes. As far as we know, it is the largest video crowd counting dataset. Extensive experiments on our dataset and other crowd counting datasets validate the effectiveness of our LSTN for crowd counting. Our dataset is released at https://github.com/sweetyy83/Lstn-fdst-dataset. © 2019 IEEE. |
Conference location | Shanghai, China |
Indexed by | EI ; CPCI-S ; CPCI |
WOS research areas | Computer Science ; Engineering |
WOS categories | Computer Science, Software Engineering ; Computer Science, Theory & Methods ; Engineering, Electrical & Electronic |
WOS accession number | WOS:000501820600137 |
Publisher | IEEE Computer Society |
EI accession number | 20193407349276 |
EI subject terms | Convolution ; Image enhancement ; Neural networks |
EI classification codes | Information Theory and Signal Processing:716.1 |
Original document type | Conference article (CA) |
Document type | Conference paper |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/89386 |
Collection | School of Information Science and Technology_PI Research Groups_Shenghua Gao Group |
Corresponding author | Hu, Bo |
Author affiliations | 1. School of information and technology, Fudan University, China; 2. ShanghaiTech University, China |
Recommended citation (GB/T 7714) | Fang, Yanyan, Zhan, Biyun, Cai, Wandi, et al. Locality-constrained spatial transformer network for video crowd counting[C]: IEEE Computer Society, 2019: 814-819. |
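The abstract describes two ideas: a per-frame density map whose sum gives the head count, and a transform that relates the density maps of neighbouring frames. The toy sketch below illustrates both under strong simplifying assumptions. It is not the LSTN from the paper: the density map is built directly from hypothetical head coordinates rather than estimated by a CNN, and a fixed integer translation stands in for the learned spatial transformer. The function names `density_map` and `shift_map` are invented for this illustration.

```python
import numpy as np

def density_map(heads, shape, sigma=2.0):
    """Build a density map with one unit-mass Gaussian per head (row, col)."""
    rows = np.arange(shape[0])[:, None]
    cols = np.arange(shape[1])[None, :]
    dmap = np.zeros(shape, dtype=np.float64)
    for r, c in heads:
        g = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        dmap += g / g.sum()  # normalise so each head contributes exactly 1
    return dmap

def shift_map(dmap, dr, dc):
    """Translate a density map by integer (dr, dc), zero-filling the border.

    Assumption: a pure translation is used here only as a stand-in for a
    learned transform between neighbouring frames.
    """
    h, w = dmap.shape
    if abs(dr) >= h or abs(dc) >= w:
        return np.zeros_like(dmap)  # everything moved out of the frame
    out = np.zeros_like(dmap)
    src = dmap[max(0, -dr):min(h, h - dr), max(0, -dc):min(w, w - dc)]
    r0, c0 = max(0, dr), max(0, dc)
    out[r0:r0 + src.shape[0], c0:c0 + src.shape[1]] = src
    return out

# Hypothetical frame with two annotated heads.
current = density_map([(10, 10), (20, 30)], shape=(40, 40))
count_now = current.sum()                         # close to 2

# Small motion: both heads stay in view, so the count is nearly unchanged.
count_small = shift_map(current, 0, 3).sum()

# Large motion: the head at column 30 leaves the frame, so the count drops.
count_large = shift_map(current, 0, 15).sum()
```

The small shift keeps the count near 2 while the large shift brings it close to 1, which mirrors the abstract's point that motion reshapes the density map and that people leaving the scene change the head count.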