ShanghaiTech University Knowledge Management System
Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking | |
2024-07-19 | |
Status | Published |
Abstract | Joint Detection and Embedding (JDE) trackers have demonstrated excellent performance in Multi-Object Tracking (MOT) tasks by incorporating the extraction of appearance features as an auxiliary task, embedding the Re-Identification (ReID) task into the detector and thereby achieving a balance between inference speed and tracking performance. However, resolving the competition between the detector and the feature extractor has always been a challenge. Also, the issue of directly embedding the ReID task into MOT has remained unresolved. The lack of high discriminability in appearance features results in their limited utility. In this paper, we propose a new learning approach using cross-correlation to capture temporal information of objects. The feature extraction network is no longer trained solely on appearance features from each frame but learns richer motion features by utilizing feature heatmaps from consecutive frames, addressing the challenge of inter-class feature similarity. Furthermore, we apply our learning approach to a more lightweight feature extraction network, and treat the feature matching scores as strong cues rather than auxiliary cues, employing an appropriate weight calculation to reflect the compatibility between our obtained features and the MOT task. Our tracker, named TCBTrack, achieves state-of-the-art performance on multiple public benchmarks, i.e., the MOT17, MOT20, and DanceTrack datasets. Specifically, on the DanceTrack test set, we achieve 56.8 HOTA, 58.1 IDF1 and 92.5 MOTA, making it the best online tracker that can achieve real-time performance. Comparative evaluations with other trackers prove that our tracker achieves the best balance between speed, robustness and accuracy. |
Keywords | Multiple object tracking; cross-correlation; lightweight networks; re-identification |
DOI | arXiv:2407.14086 |
Source | arXiv |
WOS Accession Number | PPRN:91011773 |
WOS Category | Computer Science, Software Engineering |
Document Type | Preprint |
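Item Identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/408328 |
Collection | School of Information Science and Technology; School of Information Science and Technology_Master's Students |
Corresponding Author | Zhou, Xue; Li, Liang |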
Author Affiliations | 1. Shanghaitech Univ, Sch Informat Sci & Technol, Shanghai 201210, Peoples R China; 2. Univ Elect Sci & Technol China, Sch Automat Engn, Chengdu 611731, Sichuan, Peoples R China; 3. Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence Syst, Beijing 100190, Peoples R China; 4. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 101408, Peoples R China; 5. Chinese Acad Sci, Inst Automat, NLPR, Beijing 101408, Peoples R China; 6. Birkbeck Coll, Dept Comp Sci & Informat Syst, London WC1E 7HX, England; 7. Univ Elect Sci & Technol China, Shenzhen Inst Adv Study, Shenzhen 518110, Peoples R China; 8. Beijing Inst Basic Med Sci, Beijing 100850, Peoples R China |
Recommended Citation (GB/T 7714) | Zhang, Yunfei,Liang, Chao,Gao, Jin,et al. Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking. 2024. |