ShanghaiTech University Knowledge Management System
Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation | |
2020-07 | |
Proceedings title | THE 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020) |
Volume | Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics |
Pages | 3619–3629 |
Publication status | Published |
DOI | 10.18653/v1/2020.acl-main.333 |
Abstract | Open-domain dialogue generation has gained increasing attention in Natural Language Processing. Its evaluation requires a holistic means. Human ratings are deemed as the gold standard. As human evaluation is inefficient and costly, an automated substitute is highly desirable. In this paper, we propose holistic evaluation metrics that capture different aspects of open-domain dialogues. Our metrics consist of (1) GPT-2 based context coherence between sentences in a dialogue, (2) GPT-2 based fluency in phrasing, (3) n-gram based diversity in responses to augmented queries, and (4) textual-entailment-inference based logical self-consistency. The empirical validity of our metrics is demonstrated by strong correlations with human judgments. We open source the code and relevant materials. © 2020 Association for Computational Linguistics |
Proceedings editor / Conference sponsor | Amazon ; Apple ; Bloomberg Engineering ; et al. ; Google ; IBM Research AI |
Keywords | Computational linguistics ; Open systems ; Open source software ; Automatic evaluation ; Dialogue generations ; Evaluation metrics ; Gold standards ; Holistic evaluations ; Human evaluation ; ITS evaluation ; N-grams ; Self-consistency ; Textual entailment |
Conference name | The 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020) |
Conference location | Virtual, Online, United States |
Conference date | July 5, 2020 - July 10, 2020 |
Indexed by | CPCI ; CPCI-S ; EI |
Language | English |
Publisher | Association for Computational Linguistics (ACL) |
EI accession number | 20214411090492 |
EI main heading | Natural language processing systems |
EI classification codes | 721.1 Computer Theory, Includes Formal Logic, Automata Theory, Switching Theory, Programming Theory ; 723 Computer Software, Data Handling and Applications ; 723.2 Data Processing and Image Processing |
Original document type | Conference article (CA) |
Document type | Conference paper |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/124035 |
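Collection | School of Information Science and Technology_PhD Students ; School of Information Science and Technology_PI Research Groups_Tu Kewei (屠可伟) Group |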
Corresponding author | Han, Wenjuan; Zhou, Linqi |
Author affiliations | 1. Department of Statistics, University of California, Los Angeles ; 2. School of Computing, National University of Singapore, Singapore ; 3. School of Information Science and Technology, ShanghaiTech University |
Recommended citation (GB/T 7714) | Pang, Bo, Nijkamp, Erik, Han, Wenjuan, et al. Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation[C]//Amazon, Apple, Bloomberg Engineering, et al., Google, IBM Research AI: Association for Computational Linguistics (ACL), 2020: 3619–3629. |