ShanghaiTech University Knowledge Management System
Memory-based Cross-modal Semantic Alignment Network for Radiology Report Generation | |
2024 | |
Journal | IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS (IF:6.7[JCR-2023],7.1[5-Year]) |
ISSN | 2168-2194 |
EISSN | 2168-2208 |
Volume | PP, Issue: 99, Pages: 1-12 |
Publication status | Published |
DOI | 10.1109/JBHI.2024.3393018 |
Abstract | Automatically generating radiology reports reduces the workload of radiologists and aids the diagnosis of specific diseases. Many existing methods treat this task as a modality transfer process. However, since the key disease-related information accounts for only a small proportion of both the image and the report, it is hard for a model to learn the latent relation between a radiology image and its report, and such methods therefore fail to generate fluent and accurate radiology reports. To tackle this problem, we propose a memory-based cross-modal semantic alignment model (MCSAM) following an encoder-decoder paradigm. MCSAM includes a well-initialized long-term clinical memory bank that learns disease-related representations and prior knowledge, which the different modalities retrieve and use to perform feature consolidation. To ensure the semantic consistency of the retrieved cross-modal prior knowledge, a cross-modal semantic alignment module (SAM) is proposed. SAM can also generate semantic visual feature embeddings, which can be added to the decoder to benefit report generation. More importantly, to memorize the state and additional information while generating reports with the decoder, we use learnable memory tokens, which can be seen as prompts. Extensive experiments demonstrate the promising performance of our proposed method, which achieves state-of-the-art performance on the MIMIC-CXR dataset. |
Keywords | Decoding; Diagnosis; Job analysis; Radiology; Semantic Web; Cross modality; Cross-modal; Decoding; Knowledge graphs; Neural-networks; Radiology report generation; Radiology reports; Report generation; Semantic alignments; Task analysis |
Indexed by | EI |
Language | English |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
EI accession number | 20241815997600 |
EI main heading | Semantics |
EI classification codes | 461.6 Medicine and Pharmacology ; 622.3 Radioactive Material Applications ; 723 Computer Software, Data Handling and Applications ; 723.2 Data Processing and Image Processing ; 903 Information Science |
Original document type | Article in Press |
Source database | IEEE |
Document type | Journal article |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/370116 |
Collection | School of Biomedical Engineering; School of Biomedical Engineering_PI Research Groups_Zhang Han Group; School of Biomedical Engineering_Master's Students |
Corresponding author | Ma, Liyan |
Author affiliations | 1. School of Computer Engineering and Science, Shanghai University, Shanghai, China; 2. Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; 3. School of Biomedical Engineering, ShanghaiTech University, Shanghai, China |
First author affiliation | School of Biomedical Engineering |
Recommended citation (GB/T 7714) | Tao, Yitian,Ma, Liyan,Yu, Jing,et al. Memory-based Cross-modal Semantic Alignment Network for Radiology Report Generation[J]. IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS,2024,PP(99):1-12. |
APA | Tao, Yitian,Ma, Liyan,Yu, Jing,&Zhang, Han.(2024).Memory-based Cross-modal Semantic Alignment Network for Radiology Report Generation.IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS,PP(99),1-12. |
MLA | Tao, Yitian,et al."Memory-based Cross-modal Semantic Alignment Network for Radiology Report Generation".IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS PP.99(2024):1-12. |