ShanghaiTech University Knowledge Management System
Benchmarking Data Science Agents | |
2024 | |
Proceedings title | PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS |
Abstract | In the era of data-driven decision-making, the complexity of data analysis necessitates advanced expertise and tools of data science, presenting significant challenges even for specialists. Large Language Models (LLMs) have emerged as promising aids as data science agents, assisting humans in data analysis and processing. Yet their practical efficacy remains constrained by the varied demands of real-world applications and complicated analytical process. In this paper, we introduce DSEval, a novel evaluation paradigm, as well as a series of innovative benchmarks tailored for assessing the performance of these agents throughout the entire data science lifecycle. Incorporating a novel bootstrapped annotation method, we streamline dataset preparation, improve the evaluation coverage, and expand benchmarking comprehensiveness. Our findings uncover prevalent obstacles and provide critical insights to inform future advancements in the field. |
Conference name | 62nd Annual Meeting of the Association for Computational Linguistics (ACL) / Student Research Workshop (SRW) |
Place of publication | 209 N EIGHTH STREET, STROUDSBURG, PA 18360 USA |
Conference location | Bangkok, Thailand |
Conference dates | AUG 11-16, 2024 |
Indexed in | PPRN.PPRN ; CPCI-S |
Language | English |
WOS research area | Computer Science |
WOS categories | Computer Science, Artificial Intelligence ; Computer Science, Interdisciplinary Applications ; Computer Science, Theory & Methods |
WOS accession number | WOS:001356729805044 |
Publisher | ASSOC COMPUTATIONAL LINGUISTICS-ACL |
Document type | Conference paper |
Item identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/445522 |
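Collections | School of Information Science and Technology_Undergraduates ; School of Information Science and Technology_PI Research Groups_Kan Ren Group |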
Corresponding author | Ren, Kan |
Author affiliations | 1. Microsoft Res, Bangalore, Karnataka, India 2. ShanghaiTech Univ, Shanghai, Peoples R China |
Corresponding author affiliation | ShanghaiTech University |
Recommended citation (GB/T 7714) | Zhang, Yuge, Jiang, Qiyang, Han, Xingyu, et al. Benchmarking Data Science Agents[C]. 209 N EIGHTH STREET, STROUDSBURG, PA 18360 USA: ASSOC COMPUTATIONAL LINGUISTICS-ACL, 2024. |