Benchmarking Data Science Agents
2024
Proceedings title: PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS
Abstract: In the era of data-driven decision-making, the complexity of data analysis necessitates advanced expertise and tools of data science, presenting significant challenges even for specialists. Large Language Models (LLMs) have emerged as promising aids as data science agents, assisting humans in data analysis and processing. Yet their practical efficacy remains constrained by the varied demands of real-world applications and complicated analytical processes. In this paper, we introduce DSEval, a novel evaluation paradigm, as well as a series of innovative benchmarks tailored for assessing the performance of these agents throughout the entire data science lifecycle. Incorporating a novel bootstrapped annotation method, we streamline dataset preparation, improve the evaluation coverage, and expand benchmarking comprehensiveness. Our findings uncover prevalent obstacles and provide critical insights to inform future advancements in the field.
Conference name: 62nd Annual Meeting of the Association-for-Computational-Linguistics (ACL) / Student Research Workshop (SRW)
Place of publication: 209 N EIGHTH STREET, STROUDSBURG, PA 18360 USA
Conference location: Bangkok, THAILAND
Conference date: AUG 11-16, 2024
Indexed in: PPRN.PPRN ; CPCI-S
Language: English
WOS research area: Computer Science
WOS categories: Computer Science, Artificial Intelligence ; Computer Science, Interdisciplinary Applications ; Computer Science, Theory & Methods
WOS accession number: WOS:001356729805044
Publisher: ASSOC COMPUTATIONAL LINGUISTICS-ACL
Document type: Conference paper
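Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/445522
Collection: School of Information Science and Technology_Undergraduates
School of Information Science and Technology_PI Research Groups_Ren Kan Group
Corresponding author: Ren, Kan
Author affiliations:
1.Microsoft Res, Bangalore, Karnataka, India
2.ShanghaiTech Univ, Shanghai, Peoples R China
Corresponding author affiliation: ShanghaiTech University
Recommended citation: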
GB/T 7714
Zhang, Yuge,Jiang, Qiyang,Han, Xingyu,et al. Benchmarking Data Science Agents[C]. 209 N EIGHTH STREET, STROUDSBURG, PA 18360 USA:ASSOC COMPUTATIONAL LINGUISTICS-ACL,2024.