3D hand pose and mesh estimation via a generic Topology-aware Transformer model
Date: 2024-05-03
Journal: FRONTIERS IN NEUROROBOTICS (IF: 2.6 [JCR 2023]; 3.1 [5-year])
ISSN: 1662-5218
EISSN: 1662-5218
Volume: 18
Publication Status: Published
DOI: 10.3389/fnbot.2024.1395652
Abstract

In Human-Robot Interaction (HRI), accurate 3D hand pose and mesh estimation is of critical importance. However, inferring reasonable and accurate poses under severe self-occlusion and high self-similarity remains an inherent challenge. To alleviate the ambiguity caused by invisible and similar joints during HRI, we propose a new Topology-aware Transformer network named HandGCNFormer, which takes a depth image as input and incorporates prior knowledge of hand kinematic topology into the network while modeling long-range contextual information. Specifically, we propose a novel Graphformer decoder with an additional Node-offset Graph Convolutional layer (NoffGConv). The Graphformer decoder optimizes the synergy between the Transformer and the GCN, capturing both long-range dependencies and local topological connections between joints. On top of that, we replace the standard MLP prediction head with a novel Topology-aware head that better exploits local topological constraints for more reasonable and accurate poses. Our method achieves state-of-the-art 3D hand pose estimation performance on four challenging datasets: Hands2017, NYU, ICVL, and MSRA. To further demonstrate the effectiveness and scalability of the proposed Graphformer decoder and Topology-aware head, we extend our framework to HandGCNFormer-Mesh for the 3D hand mesh estimation task. The extended framework efficiently integrates a shape regressor with the original Graphformer decoder and Topology-aware head, producing MANO parameters. Results on the HO-3D dataset, which contains diverse and challenging occlusions, show that HandGCNFormer-Mesh achieves competitive results compared with previous state-of-the-art 3D hand mesh estimation methods.
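
The abstract's central architectural idea is to augment Transformer joint tokens with a graph convolution over the hand's kinematic topology. The snippet below is a minimal sketch of that idea, not the authors' implementation: the class name reuses the paper's NoffGConv term, but all internals (tensor shapes, the per-joint "node offset" formulation, initialization) are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class NoffGConvSketch(nn.Module):
    """Hypothetical Node-offset Graph Convolution: a graph convolution over the
    hand-skeleton adjacency plus a per-joint ("node offset") projection, so each
    joint mixes neighborhood context with its own identity-specific features."""

    def __init__(self, in_dim: int, out_dim: int, adj: torch.Tensor):
        super().__init__()
        self.register_buffer("adj", adj)                 # (J, J) normalized kinematic adjacency
        self.neighbor_proj = nn.Linear(in_dim, out_dim)  # weight shared by all joints
        num_joints = adj.shape[0]
        # one independent projection per joint (the assumed "node offset" path)
        self.node_proj = nn.Parameter(
            torch.randn(num_joints, in_dim, out_dim) * in_dim ** -0.5
        )
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, J, in_dim) joint tokens from the Transformer decoder
        agg = torch.einsum("ij,bjd->bid", self.adj, self.neighbor_proj(x))  # neighbor aggregation
        off = torch.einsum("bjd,jdo->bjo", x, self.node_proj)               # per-joint offset
        return self.act(agg + off)

# Toy usage with a 21-joint hand (the adjacency matrix is assumed given):
# layer = NoffGConvSketch(in_dim=256, out_dim=256, adj=adj_21x21)
# y = layer(torch.randn(2, 21, 256))   # -> (2, 21, 256)
```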

Keywords: 3D hand pose estimation; HandGCNFormer; 3D hand mesh estimation; Graphformer; Transformer; GCN
Indexed by: SCI; EI
Language: English
Funding: Shanghai Municipal Science and Technology Major Project (Zhangjiang Lab) [2018SHZDZX01]; Shanghai Academic Research Leader [22XD1424500]
WOS Research Areas: Computer Science; Robotics; Neurosciences & Neurology
WOS Categories: Computer Science, Artificial Intelligence; Robotics; Neurosciences
WOS Accession Number: WOS:001225462800001
Publisher: FRONTIERS MEDIA SA
EI Accession Number: 20242116135619
EI Controlled Terms: Human robot interaction
EI Classification Codes: 723.2 Data Processing and Image Processing; 723.5 Computer Applications; 731.5 Robotics; 921.4 Combinatorial Mathematics, Includes Graph Theory, Set Theory
Original Document Type: Journal article (JA)
Document Type: Journal article
Item Identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/381242
Collection: School of Information Science and Technology_Distinguished Professor Group_Zhang Xiaolin Group
Corresponding Author: Chen, Lili
Author Affiliations:
1.Chinese Acad Sci, Shanghai Inst Microsyst & Informat Technol, Shanghai, Peoples R China
2.Univ Chinese Acad Sci, Beijing, Peoples R China
3.ShanghaiTech Univ, Shanghai, Peoples R China
Recommended Citation:
GB/T 7714: Yu, Shaoqi, Wang, Yintong, Chen, Lili, et al. 3D hand pose and mesh estimation via a generic Topology-aware Transformer model[J]. FRONTIERS IN NEUROROBOTICS, 2024, 18.
APA: Yu, Shaoqi, Wang, Yintong, Chen, Lili, Zhang, Xiaolin, & Li, Jiamao. (2024). 3D hand pose and mesh estimation via a generic Topology-aware Transformer model. FRONTIERS IN NEUROROBOTICS, 18.
MLA: Yu, Shaoqi, et al. "3D hand pose and mesh estimation via a generic Topology-aware Transformer model." FRONTIERS IN NEUROROBOTICS 18 (2024).