ShanghaiTech University Knowledge Management System
Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation | |
2020-07 | |
Source Publication | THE 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020)
Volume | Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics |
Pages | 3619–3629 |
Status | Published |
DOI | 10.18653/v1/2020.acl-main.333 |
Abstract | Open-domain dialogue generation has gained increasing attention in Natural Language Processing. Its evaluation requires a holistic means. Human ratings are deemed as the gold standard. As human evaluation is inefficient and costly, an automated substitute is highly desirable. In this paper, we propose holistic evaluation metrics that capture different aspects of open-domain dialogues. Our metrics consist of (1) GPT-2 based context coherence between sentences in a dialogue, (2) GPT-2 based fluency in phrasing, (3) n-gram based diversity in responses to augmented queries, and (4) textual-entailment-inference based logical self-consistency. The empirical validity of our metrics is demonstrated by strong correlations with human judgments. We open source the code and relevant materials. © 2020 Association for Computational Linguistics |
Author of Source | Amazon ; Apple ; Bloomberg Engineering ; et al. ; Google ; IBM Research AI |
Keyword | Computational linguistics ; Open systems ; Open source software ; Automatic evaluation ; Dialogue generations ; Evaluation metrics ; Gold standards ; Holistic evaluations ; Human evaluation ; ITS evaluation ; N-grams ; Self-consistency ; Textual entailment |
Conference Name | the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020) |
Conference Place | Virtual, Online, United States |
Conference Date | July 5, 2020 - July 10, 2020 |
URL | View original |
Indexed By | CPCI ; CPCI-S ; EI |
Language | English |
Publisher | Association for Computational Linguistics (ACL) |
EI Accession Number | 20214411090492 |
EI Keywords | Natural language processing systems |
EI Classification Number | 721.1 Computer Theory, Includes Formal Logic, Automata Theory, Switching Theory, Programming Theory ; 723 Computer Software, Data Handling and Applications ; 723.2 Data Processing and Image Processing |
Original Document Type | Conference article (CA) |
Document Type | Conference paper |
Identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/124035 |
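Collection | School of Information Science and Technology_PhD Students ; School of Information Science and Technology_PI Research Groups_Kewei Tu Group |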
Corresponding Author | Han, Wenjuan; Zhou, Linqi |
Affiliation | 1. Department of Statistics, University of California, Los Angeles ; 2. School of Computing, National University of Singapore, Singapore ; 3. School of Information Science and Technology, ShanghaiTech University |
Recommended Citation GB/T 7714 | Pang, Bo,Nijkamp, Erik,Han, Wenjuan,et al. Towards Holistic and Automatic Evaluation of Open-Domain Dialogue Generation[C]//Amazon, Apple, Bloomberg Engineering, et al., Google, IBM Research AI:Association for Computational Linguistics (ACL),2020:3619–3629. |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.