Spatial enhanced multi-level alignment learning for text-image person re-identification with coupled noisy labels
2025
Journal: MULTIMEDIA SYSTEMS (IF: 3.5 [JCR-2023], 3.1 [5-Year])
ISSN: 0942-4962
EISSN: 1432-1882
Volume: 31, Issue: 2
Publication status: Published
DOI: 10.1007/s00530-025-01730-8
Abstract: Text-image person re-identification (TIReID) is a cross-modal retrieval task that aims to retrieve person images of the corresponding identity from natural language descriptions. The key to this task is the effective alignment of cross-modal features between image-text pairs. Many methods have achieved promising experimental results by fine-tuning pre-trained vision-language models. However, existing methods largely depend on accurate, high-quality text annotations, which require substantial time and resources. During dataset construction, this reliance results in incorrect sample pair matching and introduces coupled noisy labels in TIReID. Although some prior studies have achieved relatively robust outcomes in addressing the noisy correspondence problem in TIReID, they still face several challenges: (1) Lack of spatial detail in person images: Previous research predominantly uses pre-trained vision transformers (ViT) for visual feature extraction. However, relying solely on the position encoding in ViT can result in insufficient learning of the spatial structural characteristics of pedestrians, thereby limiting the model's ability to distinguish subtle identity-related variations. (2) Neglect of identity category relationships: Many prior approaches address noisy correspondence primarily through sample-level similarity and loss responses, often overlooking the predicted relationships between identity categories. These relationships are crucial for guiding the model to focus on shared identity characteristics. To address these challenges, we propose Spatial Enhanced Multi-Level Alignment Learning (SE-MLAL). SE-MLAL includes a consistent noise detection module that predicts the correctness of each sample pair's correspondence at both the sample level and the class level, leveraging the consistency between these predictions to achieve accurate dataset division. Building upon this foundation, we employ a sample-wise triplet loss and a class-wise alignment loss to achieve hierarchical feature alignment. Experimental results across three datasets substantiate the effectiveness and robustness of our method. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
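The abstract describes dividing the dataset by the consistency of sample-level and class-level correctness predictions, followed by a sample-wise triplet loss. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions: the function names `split_by_consistency` and `triplet_loss`, the three-way clean/noisy/uncertain split, the hinge form of the triplet loss, and the margin of 0.2 are all hypothetical and are not taken from the paper.

```python
def split_by_consistency(sample_clean, class_clean):
    """Partition pair indices by agreement between two noise predictions.

    sample_clean[i] / class_clean[i] are booleans saying whether pair i is
    judged correctly matched at the sample level / class level.
    ASSUMPTION: pairs on which the two levels disagree are set aside as
    "uncertain"; the paper may handle them differently.
    """
    if len(sample_clean) != len(class_clean):
        raise ValueError("prediction lists must have the same length")
    clean, noisy, uncertain = [], [], []
    for idx, (s, c) in enumerate(zip(sample_clean, class_clean)):
        if s and c:
            clean.append(idx)       # both levels agree: correct pair
        elif not s and not c:
            noisy.append(idx)       # both levels agree: mismatched pair
        else:
            uncertain.append(idx)   # the two levels disagree
    return clean, noisy, uncertain


def triplet_loss(pos_sim, neg_sim, margin=0.2):
    """Standard hinge-style triplet loss on similarity scores.

    ASSUMPTION: margin=0.2 is an illustrative value, not the paper's setting.
    """
    return max(0.0, margin - pos_sim + neg_sim)


# Usage: four image-text pairs with sample-level and class-level verdicts.
clean, noisy, uncertain = split_by_consistency(
    [True, True, False, False],
    [True, False, True, False],
)
```

In a sketch like this, only the indices in `clean` would contribute to the alignment losses with full confidence; how the remaining pairs are treated is a design choice that the abstract does not specify.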
Keywords: Adversarial machine learning; Contrastive Learning; Feature extraction; Image coding; Image denoising; Natural language processing systems; Optical character recognition; Query languages; Structured Query Language; Cross-modal; Cross-modal alignment; Language description; Multilevels; Natural languages; Noisy correspondence; Noisy labels; Person re identifications; Text images; Text-image person re-identification
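Indexed by: EI
Language: English
Publisher: Springer Science and Business Media Deutschland GmbH
EI accession number: 20251118058536
EI main heading: Visual languages
EI classification codes: 1101 Artificial Intelligence ; 1101.2 Machine Learning ; 1106.1.1 Computer Programming Languages ; 1106.2 Data Handling and Data Processing ; 1106.3.1 Image Processing ; 1106.4 Database Systems ; 1106.7 Computational Linguistics ; 1106.8 Computer Vision ; 716.1 Information Theory and Signal Processing ; 741.1 Light/Optics
Original document type: Journal article (JA)
Document type: Journal article
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/496975
Collection: School of Information Science and Technology_Master's Students
Corresponding author: Yongxi Li(李泳锡)
Author affiliations
1. ShanghaiTech University
2. Institute of Automation, Chinese Academy of Sciences
3. Beihang University
First author affiliation: ShanghaiTech University
First affiliation of the first author: ShanghaiTech University
Recommended citation
GB/T 7714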
Jiacheng Zhao,Haojie Che,Yongxi Li. Spatial enhanced multi-level alignment learning for text-image person re-identification with coupled noisy labels[J]. MULTIMEDIA SYSTEMS,2025,31(2).
APA Jiacheng Zhao,Haojie Che,&Yongxi Li.(2025).Spatial enhanced multi-level alignment learning for text-image person re-identification with coupled noisy labels.MULTIMEDIA SYSTEMS,31(2).
MLA Jiacheng Zhao,et al."Spatial enhanced multi-level alignment learning for text-image person re-identification with coupled noisy labels".MULTIMEDIA SYSTEMS 31.2(2025).