Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation
2020-07
Proceedings title: The 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020)
Volume: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Pages: 3619–3629
Publication status: Published
DOI: 10.18653/v1/2020.acl-main.333
Abstract

Open-domain dialogue generation has gained increasing attention in Natural Language Processing. Its evaluation requires a holistic means. Human ratings are deemed as the gold standard. As human evaluation is inefficient and costly, an automated substitute is highly desirable. In this paper, we propose holistic evaluation metrics that capture different aspects of open-domain dialogues. Our metrics consist of (1) GPT-2 based context coherence between sentences in a dialogue, (2) GPT-2 based fluency in phrasing, (3) n-gram based diversity in responses to augmented queries, and (4) textual-entailment-inference based logical self-consistency. The empirical validity of our metrics is demonstrated by strong correlations with human judgments. We open source the code and relevant materials. © 2020 Association for Computational Linguistics
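
Metric (3) in the abstract is described only as "n-gram based diversity"; this record does not give the formula. The block below is a minimal sketch under stated assumptions: whitespace tokenization, and two commonly used n-gram diversity measures (the distinct-n ratio and the Shannon entropy of the n-gram distribution). The function names and the sample responses are hypothetical, and this is not presented as the authors' implementation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams across all responses."""
    grams = [g for r in responses for g in ngrams(r.split(), n)]
    return len(set(grams)) / len(grams) if grams else 0.0


def ngram_entropy(responses, n):
    """Shannon entropy (in nats) of the empirical n-gram distribution."""
    counts = Counter(g for r in responses for g in ngrams(r.split(), n))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical responses a model might give to paraphrases of one query.
responses = ["i like tea", "i like tea", "i prefer coffee"]
unigram_diversity = distinct_n(responses, 1)
unigram_entropy = ngram_entropy(responses, 1)
```

Under either measure, a higher value indicates that the responses reuse fewer n-grams, so a model that returns the same reply to every rephrased query scores low.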

Proceedings editors/Conference sponsors: Amazon; Apple; Bloomberg Engineering; et al.; Google; IBM Research AI
Keywords: Computational linguistics; Open systems; Open source software; Automatic evaluation; Dialogue generations; Evaluation metrics; Gold standards; Holistic evaluations; Human evaluation; ITS evaluation; N-grams; Self-consistency; Textual entailment
Conference name: The 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020)
Conference location: Virtual, Online, United States
Conference dates: July 5, 2020 - July 10, 2020
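Indexed in: CPCI; CPCI-S; EI
Language: English
Publisher: Association for Computational Linguistics (ACL)
EI accession number: 20214411090492
EI main heading: Natural language processing systems
EI classification codes: 721.1 Computer Theory, Includes Formal Logic, Automata Theory, Switching Theory, Programming Theory; 723 Computer Software, Data Handling and Applications; 723.2 Data Processing and Image Processing
Original document type: Conference article (CA)
Document type: Conference paper
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/124035
Collection: School of Information Science and Technology_PhD Students
School of Information Science and Technology_PI Research Groups_Kewei Tu (屠可伟) Group
Corresponding authors: Han, Wenjuan; Zhou, Linqi
Author affiliations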
1.Department of Statistics, University of California, Los Angeles
2.School of Computing, National University of Singapore, Singapore
3.School of Information Science and Technology, ShanghaiTech University
Recommended citation
GB/T 7714
Pang, Bo,Nijkamp, Erik,Han, Wenjuan,et al. Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation[C]//Amazon, Apple, Bloomberg Engineering, et al., Google, IBM Research AI:Association for Computational Linguistics (ACL),2020:3619–3629.