Feature Selection and Embedding Based Cross Project Framework for Identifying Crashing Fault Residence
2021-03
Journal: INFORMATION AND SOFTWARE TECHNOLOGY
ISSN: 0950-5849
Volume: 131
Publication status: Published
DOI: 10.1016/j.infsof.2020.106452
Abstract

Context: Automatically produced crash reports can be used to analyze the root cause of the fault that triggers a crash (crashing fault for short), which is a critical activity for software quality assurance.

Objective: Correctly predicting whether the crashing fault resides in the stack trace of a crash report can speed up the debugging process and optimize debugging effort. Existing work on Identification of Crashing Fault Residence (ICFR) for newly submitted crashes relies on label information collected from bug-fixing logs and on features of crash instances extracted from stack traces and source code. This work develops a novel cross project ICFR framework that addresses the data scarcity problem by using labeled crash data from other projects for the ICFR task of the project at hand. The framework removes irrelevant features, reduces distribution differences, and eases the class imbalance issue of cross project data, since these factors may negatively impact ICFR performance.

Method: The proposed framework, called FSE, combines Feature Selection and feature Embedding techniques. FSE first uses a feature ranking method based on information gain ratio to select a relevant feature subset for cross project data, and then employs a state-of-the-art Weighted Balanced Distribution Adaptation (WBDA) method to map the features of cross project data into a common space. WBDA considers both marginal and conditional distributions, as well as their weights, to reduce data distribution discrepancies. In addition, WBDA balances the class proportion of each project's data to alleviate the class imbalance issue.
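
The feature ranking step described in the Method paragraph can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discrete feature values (continuous features would need to be discretized first), and the names `entropy`, `gain_ratio`, `rank_features`, and the toy feature columns are hypothetical. The WBDA embedding step is not covered here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(feature, labels):
    """Information gain ratio of one discrete feature with respect to the labels."""
    total = len(labels)
    # Group labels by feature value to compute the conditional entropy H(Y | X).
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information: the entropy of the feature itself.
    split_info = entropy(feature)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels, keep):
    """Return the names of the `keep` features with the highest gain ratio."""
    scores = {name: gain_ratio(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Hypothetical toy data: 1 = crashing fault resides in the stack trace, 0 = it does not.
labels = [1, 1, 0, 0]
columns = {
    "f_perfect": [1, 1, 0, 0],  # fully determines the label
    "f_noise": [0, 1, 0, 1],    # independent of the label
    "f_const": [5, 5, 5, 5],    # constant, so it cannot separate classes
}
top = rank_features(columns, labels, keep=1)
```

In a cross project setting, the ranking would presumably be computed on the labeled data available from the source project, and the same selected columns would then be kept in the target project's data before the embedding step.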

Results: We conduct experiments on 7 projects to evaluate the performance of our FSE framework. The results show that FSE outperforms 25 methods under comparison.

Conclusion: This work proposes a cross project learning framework for ICFR, which uses feature selection and embedding to remove irrelevant features and reduce distribution differences, respectively. The results illustrate the performance superiority of our FSE framework.

Keywords: Crashing fault; Stack trace; Feature selection; Feature embedding; Cross project framework; Computer software selection and evaluation; Embeddings; Feature extraction; Program debugging; Quality assurance; Software quality; Conditional distribution; Critical activities; Data distribution; Debugging efforts; Information gain ratio; Label information; Relevant features
Indexed by: SCI; SCIE; EI
Language: English
Publisher: Elsevier B.V.
EI accession number: 20204809549801
EI main heading: Data reduction
EI classification codes: 723 Computer Software, Data Handling and Applications; 913.3 Quality Assurance and Control
Original document type: Journal article (JA)
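Document type: Journal article
Item identifier: https://kms.shanghaitech.edu.cn/handle/2MSLDSTB/123627
Collection: School of Information Science and Technology_PI Research Groups_Yutian Tang Group
Corresponding authors: Yan, Meng; Luo, Xiapu
Author affiliations: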
1.Chongqing University
2.Macau University of Science and Technology
3.City University of Hong Kong
4.The Hong Kong Polytechnic University
5.ShanghaiTech University
Recommended citation
GB/T 7714: Xu, Zhou, Zhang, Tao, Keung, Jacky, et al. Feature Selection and Embedding Based Cross Project Framework for Identifying Crashing Fault Residence[J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2021, 131.
APA: Xu, Zhou., Zhang, Tao., Keung, Jacky., Yan, Meng., Luo, Xiapu., ... & Tang, Yutian. (2021). Feature Selection and Embedding Based Cross Project Framework for Identifying Crashing Fault Residence. INFORMATION AND SOFTWARE TECHNOLOGY, 131.
MLA: Xu, Zhou, et al. "Feature Selection and Embedding Based Cross Project Framework for Identifying Crashing Fault Residence". INFORMATION AND SOFTWARE TECHNOLOGY 131 (2021).