ShanghaiTech University Knowledge Management System
DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition | |
2023 | |
Proceedings Title | 17TH INTERNATIONAL WORKSHOP ON SEMANTIC EVALUATION, SEMEVAL-2023 |
Abstract | The MultiCoNER II shared task aims to tackle multilingual named entity recognition (NER) in fine-grained and noisy scenarios, and it inherits the semantic ambiguity and low-context setting of the MultiCoNER I task. To cope with these problems, the previous top systems in MultiCoNER I incorporate either knowledge bases or gazetteers. However, they still suffer from insufficient knowledge, limited context length, and a single retrieval strategy. In this paper, our team DAMO-NLP proposes a unified retrieval-augmented system (U-RaNER) for fine-grained multilingual NER. We perform error analysis on the previous top systems and reveal that their performance bottleneck lies in insufficient knowledge. We also discover that the limited context length causes the retrieved knowledge to be invisible to the model. To enhance the retrieval context, we incorporate the entity-centric Wikidata knowledge base, while utilizing the infusion approach to broaden the contextual scope of the model. We also explore various search strategies and refine the quality of the retrieved knowledge. Our system wins 9 out of 13 tracks in the MultiCoNER II shared task. Additionally, we compare our system with ChatGPT, one of the large language models that have unlocked strong capabilities on many tasks. The results show that there is still much room for improvement for ChatGPT on the extraction task. |
Conference Name | 17th International Workshop on Semantic Evaluation (SemEval) |
Place of Publication | 209 N EIGHTH STREET, STROUDSBURG, PA 18360 USA |
Conference Location | Toronto, Canada |
Conference Date | JUL 13-14, 2023 |
Indexed By | CPCI-S ; CPCI-SH |
Language | English |
WOS Research Area | Computer Science ; Linguistics |
WOS Category | Computer Science, Artificial Intelligence ; Computer Science, Interdisciplinary Applications ; Linguistics |
WOS Accession Number | WOS:001281001900276 |
Publisher | ASSOC COMPUTATIONAL LINGUISTICS-ACL |
Document Type | Conference paper |
Item Identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/348156 |
Collection | School of Information Science and Technology; School of Information Science and Technology_PI Research Groups_Kewei Tu (屠可伟) Group; School of Information Science and Technology_PhD Students |
Author Affiliations | 1.Alibaba Grp, DAMO Acad, Hangzhou, Peoples R China 2.Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China 3.ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China 4.Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Shenzhen, Peoples R China |
Recommended Citation (GB/T 7714) | Tan, Zeqi,Huang, Shen,Jia, Zixia,et al. DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition[C]. 209 N EIGHTH STREET, STROUDSBURG, PA 18360 USA:ASSOC COMPUTATIONAL LINGUISTICS-ACL,2023. |
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.