Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation

doi:10.1109/CVPR52688.2022.00829

	Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation
	Hou, Yuenan 1; Zhu, Xinge 2; Ma, Yuexin 3; Loy, Chen Change 4; Li, Yikang 1
	2022
Proceedings title	PROCEEDINGS OF THE IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION
ISSN	1063-6919
Volume	2022-June
Pages	8469-8478
Publication status	Published
DOI	10.1109/CVPR52688.2022.00829
Abstract	This article addresses the problem of distilling knowledge from a large teacher model to a slim student network for LiDAR semantic segmentation. Directly employing previous distillation approaches yields inferior results due to the intrinsic challenges of point clouds, i.e., sparsity, randomness and varying density. To tackle these problems, we propose Point-to-Voxel Knowledge Distillation (PVD), which transfers the hidden knowledge from both the point level and the voxel level. Specifically, we first leverage both pointwise and voxelwise output distillation to complement the sparse supervision signals. Then, to better exploit the structural information, we divide the whole point cloud into several supervoxels and design a difficulty-aware sampling strategy that more frequently samples supervoxels containing less-frequent classes and faraway objects. On these supervoxels, we propose inter-point and inter-voxel affinity distillation, where the similarity information between points and voxels helps the student model better capture the structural information of the surrounding environment. We conduct extensive experiments on two popular LiDAR segmentation benchmarks, i.e., nuScenes [3] and SemanticKITTI [1]. On both benchmarks, our PVD consistently outperforms previous distillation approaches by a large margin on three representative backbones, i.e., Cylinder3D [36], [37], SPVNAS [25] and MinkowskiNet [5]. Notably, on the challenging nuScenes and SemanticKITTI datasets, our method achieves roughly 75% MACs reduction and 2× speedup on the competitive Cylinder3D model and ranks 1st on the SemanticKITTI leaderboard (single-scan competition, https://competitions.codalab.org/competitions/20331#results) among all published algorithms as of 2021-11-18 04:00 Pacific Time, where our method is termed Point-Voxel-KD. Our method (PV-KD) ranks 3rd on the multi-scan challenge as of 2021-12-1 00:00 Pacific Time.
Our code is available at https://github.com/cardwing/Codes-for-PVKD. © 2022 IEEE.
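The inter-point affinity distillation mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository. It assumes that affinity is measured as pairwise cosine similarity between the point features inside one supervoxel and that the student is penalized by the mean squared difference between its affinity matrix and the teacher's. The function names `affinity_matrix` and `affinity_distillation_loss`, the feature widths, and the random inputs are all hypothetical.

```python
import numpy as np


def affinity_matrix(features: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of an (N, C) feature array."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T  # shape (N, N)


def affinity_distillation_loss(student_feats: np.ndarray,
                               teacher_feats: np.ndarray) -> float:
    """Mean squared difference between student and teacher affinity matrices.

    Assumed loss form for illustration; both inputs describe the same N
    points (or voxels) of one supervoxel, possibly with different widths.
    """
    diff = affinity_matrix(student_feats) - affinity_matrix(teacher_feats)
    return float(np.mean(diff ** 2))


# Hypothetical supervoxel with 8 points: 64-d teacher and 16-d student features.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 64))
student = rng.normal(size=(8, 16))

loss_same = affinity_distillation_loss(teacher, teacher)  # identical structure
loss_diff = affinity_distillation_loss(student, teacher)  # mismatched structure
print(loss_same, loss_diff)
```

Because both affinity matrices have shape N × N regardless of channel width, this form of the loss needs no projection layer between a slim student and a wide teacher. The same functions apply unchanged to voxel features for the inter-voxel case.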
Conference name	2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Place of publication	10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA
Conference location	New Orleans, LA, United States
Conference dates	June 19, 2022 - June 24, 2022
Indexed by	EI ; CPCI-S
Language	English
WOS research area	Computer Science ; Imaging Science & Photographic Technology
WOS category	Computer Science, Artificial Intelligence ; Imaging Science & Photographic Technology
WOS accession number	WOS:000870759101051
Publisher	IEEE Computer Society
EI accession number	20224613120193
Original document type	Conference article (CA)
Source database	IEEE
Document type	Conference paper
Identifier	https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/248931
Collection	School of Information Science and Technology_PI Research Group_Ma Yuexin
Corresponding author	Hou, Yuenan
Affiliations	1.Shanghai AI Lab, Shanghai, Peoples R China 2.Chinese Univ Hong Kong, Hong Kong, Peoples R China 3.ShanghaiTech Univ, Shanghai, Peoples R China 4.Nanyang Technol Univ, S Lab, Singapore, Singapore
Recommended citation (GB/T 7714)	Hou, Yuenan,Zhu, Xinge,Ma, Yuexin,et al. Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation[C]. 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA:IEEE Computer Society,2022:8469-8478.