Cascaded ConvLSTMs Using Semantically-Coherent Data Synthesis for Video Object Segmentation

doi:10.1109/ACCESS.2019.2940768

Authors	Jia Zheng; Weixin Luo; Zhixin Piao
Year	2019
Journal	IEEE ACCESS (IF: 3.4 [JCR-2023], 3.7 [5-Year])
ISSN	2169-3536
Volume	7
Pages	132120-132129
Publication status	Published
DOI	10.1109/ACCESS.2019.2940768
Abstract	This paper proposes a simple yet effective and efficient method for video object segmentation. Most existing methods take the color image and the optical flow as input to discover the salient object in terms of appearance and motion. We instead use a ResNet backbone as an appearance encoder that characterizes each frame at different scales, and a series of Convolutional Long Short-Term Memory units (ConvLSTMs) as a motion-modeling decoder at each corresponding scale. With supervision imposed at each scale, these modules can handle a moving object whose scale inevitably varies over time. Instead of Conditional Random Field (CRF) based post-processing, we use a more effective and efficient cascade module to refine the model predictions. Most existing video object segmentation datasets are small because pixel-wise annotations are expensive and time-consuming to obtain. To overcome this data insufficiency when training the deep network, we propose a semantically-coherent data synthesis strategy that augments training sequences without any extra effort. Extensive experiments and ablation studies on the DAVIS 2016 dataset validate the proposed method. Furthermore, our method without the cascade module achieves a real-time speed of 26 fps on a single GPU.
Keywords	Object segmentation ; Optical imaging ; Decoding ; Training ; Optical network units ; Video sequences ; Adaptive optics
Indexed by	SCI ; SCIE ; EI
WOS subject categories	Computer Science, Information Systems ; Engineering, Electrical & Electronic ; Telecommunications
WOS accession number	WOS:000498627400003
Publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
EI accession number	20200308052927
EI subject terms	Brain ; Convolution ; Long short-term memory ; Motion compensation
EI classification codes	Biomedical Engineering:461.1 ; Information Theory and Signal Processing:716.1
Original document type	Article
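Document type	Journal article
Item identifier	https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/102119
Collection	School of Information Science and Technology_Master's Students ; School of Information Science and Technology_Doctoral Students
Author affiliation	School of Information Science and Technology, ShanghaiTech University, Shanghai, China
First author affiliation	School of Information Science and Technology
First affiliation of first author	School of Information Science and Technology
Recommended citation
GB/T 7714	Jia Zheng, Weixin Luo, Zhixin Piao. Cascaded ConvLSTMs Using Semantically-Coherent Data Synthesis for Video Object Segmentation[J]. IEEE ACCESS, 2019, 7: 132120-132129.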
APA	Jia Zheng, Weixin Luo, & Zhixin Piao. (2019). Cascaded ConvLSTMs Using Semantically-Coherent Data Synthesis for Video Object Segmentation. IEEE ACCESS, 7, 132120-132129.
MLA	Jia Zheng, et al. "Cascaded ConvLSTMs Using Semantically-Coherent Data Synthesis for Video Object Segmentation". IEEE ACCESS 7 (2019): 132120-132129.