ShanghaiTech University Knowledge Management System
An LLM-based Readability Measurement for Unit Tests' Context-aware Inputs
Date | 2024-07-31
Status | Published
Abstract | Automated test techniques usually generate unit tests with higher code coverage than manual tests. However, the readability of automated tests is crucial for code comprehension and maintenance. Test readability involves many aspects; in this paper, we focus on test inputs. The central limitation of existing studies on input readability is that they examine the test code alone without considering the tested source code, so they either ignore the differing readability requirements of different source code or require manual effort to write readable inputs. However, we observe that the source code specifies the contexts that test inputs must satisfy. Based on this observation, we introduce the Context Consistency Criterion (a.k.a. C3), a readability measurement tool that leverages Large Language Models to extract primitive-type (including string-type) parameters' readability contexts from the source code and checks whether test inputs are consistent with those contexts. We also propose EvoSuiteC3, which leverages C3's extracted contexts to help EvoSuite generate readable test inputs. We have evaluated C3's performance on 409 Java classes and compared the readability of manual and automated tests under C3's measurement. The results are two-fold. First, the precision, recall, and F1-score of C3's mined readability contexts are 84.4%, 83%, and 83.7%, respectively. Second, under C3's measurement, the string-type input readability scores of EvoSuiteC3, ChatUniTest (an LLM-based test generation tool), manual tests, and two traditional tools (EvoSuite and Randoop) are 90%, 83%, 68%, 8%, and 8%, respectively, showing the traditional tools' inability to generate readable string-type inputs. We have also conducted a survey based on questionnaires collected from 30 programmers with varied backgrounds. The results reveal that when C3 identifies readability differences between tests, programmers tend to hold opinions of the tests' readability similar to C3's.
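Illustration (added for clarity; not taken from the paper — the class, method, and inputs below are invented): the abstract's core idea is that the tested source code constrains what a readable primitive- or string-type input looks like, and an input is judged readable when it is consistent with that mined context. A minimal Java sketch of the idea:

```java
import org.junit.jupiter.api.Test;

// Hypothetical class under test: the method body implies that the
// "email" parameter must look like an email address. A C3-style
// analysis would mine this readability context from the source code.
class AccountService {
    boolean register(String email) {
        return email != null
                && email.matches("^[\\w.+-]+@[\\w-]+\\.[\\w.-]+$");
    }
}

class AccountServiceTest {
    @Test
    void contextConsistentInput() {
        // Consistent with the mined "email" context: a C3-style
        // check would count this input as readable.
        new AccountService().register("alice@example.com");
    }

    @Test
    void contextInconsistentInput() {
        // The kind of arbitrary string that traditional generators
        // such as EvoSuite or Randoop tend to produce; it violates
        // the mined context and would count as unreadable.
        new AccountService().register("x$#Qz!!0");
    }
}
```

For reference, the reported scores are internally consistent: F1 = 2PR/(P + R) = 2 · 0.844 · 0.830 / (0.844 + 0.830) ≈ 0.837, matching the stated 83.7%.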
Keywords | readability; test generation; large language models
arXiv ID | arXiv:2407.21369
Related URL | View original
Source | arXiv
WOS Record Number | PPRN:91174928
WOS Category | Computer Science, Software Engineering
Document Type | Preprint
Item Identifier | https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/408355
Collections | School of Information Science and Technology; School of Information Science and Technology_Master's Students; School of Information Science and Technology_Doctoral Students; School of Information Science and Technology_PI Research Group_He Jingzhu's Group
Corresponding Author | He, Jingzhu
Affiliations | 1. ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China; 2. Shanghai Jiao Tong Univ, Shanghai, Peoples R China; 3. Univ Glasgow, Glasgow, Scotland
Recommended Citation (GB/T 7714) | Zhou, Zhichao, Tang, Yutian, Lin, Yun, et al. An LLM-based Readability Measurement for Unit Tests' Context-aware Inputs. 2024.