Research on bug report model honored with ASE Most Influential Paper

“Modeling bug report quality” presents a model of bug report quality that reduces the overall cost of software maintenance.

Professor Westley Weimer and his co-author and former student Pieter Hooimeijer have been recognized with the award for Most Influential Paper at the 2022 IEEE/ACM International Conference on Automated Software Engineering (ASE). The paper, titled “Modeling bug report quality,” presents a model of bug report quality based on a statistical analysis of surface features of over 27,000 publicly available bug reports for the Mozilla Firefox project. Published in 2007, it is one of two papers honored in the ASE ten-year retrospective of influential research.

The researchers confronted the high cost and time demands of software maintenance. Because the stream of bug reports for widely deployed software can outstrip the resources available to triage them, Weimer and Hooimeijer present a model that performs significantly better than chance in terms of precision and recall and “reduces the overall cost of software maintenance in a setting where the average cost of addressing a bug report is more than 2% of the cost of ignoring an important bug report.”

The authors ultimately developed a basic linear regression model to predict whether a bug report will be resolved within a given amount of time. Their analysis of the model “showed that self-reported severity is an important factor in the model’s performance,” despite the fact that severity may not be a reliable indicator of a bug’s importance.
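For readers unfamiliar with this kind of modeling, the sketch below gives the flavor of the approach under stated assumptions. It is not the authors’ code: the feature set, toy data, and 0.5 cutoff are invented for illustration. It fits a linear least-squares model over a few surface features of bug reports, including self-reported severity, and thresholds the predicted score to guess whether a report will be resolved within a given window.

```python
# Minimal sketch (not the authors' code): the features, data, and cutoff below
# are illustrative assumptions, not values from the paper.
import numpy as np

# Hypothetical surface features per report:
# [self-reported severity (1-5), description length in words, has attachment (0/1)]
X = np.array([
    [5, 320, 1],
    [2,  40, 0],
    [4, 150, 1],
    [1,  25, 0],
    [3, 200, 1],
    [2,  60, 0],
], dtype=float)

# Label: 1 if the report was resolved within a chosen time window, else 0
y = np.array([1, 0, 1, 0, 1, 0], dtype=float)

# Append an intercept column and solve the linear least-squares problem
A = np.hstack([X, np.ones((X.shape[0], 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Score a new report and apply a cutoff to get a yes/no triage hint
new_report = np.array([4, 180, 1, 1.0])  # features plus intercept term
score = new_report @ coef
print(f"predicted score: {score:.2f} ->",
      "likely resolved" if score > 0.5 else "less likely resolved")
```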

Prof. Westley Weimer

“In my opinion, the important part of the work is not the exact method used, but the notion that bug reporting could be approached by machine learning at all,” said Weimer. “These days there are more advanced modeling approaches, such as neural networks, that produce more accurate results. But the high-level idea of modeling human interactions in software engineering — the conversations and reports about bugs, not just the code itself — remains relevant.”

The paper also opened a discussion of extending the model to multiple projects and of improving the model itself to yield better performance than the linear least-squares regression it used. It further suggested that “follow-up work could evaluate the effect of such a tool on triage practices and on the model itself.”

“The analysis in this paper might not have happened if Pieter Hooimeijer hadn’t caught a mistake related to whether arrays start at 0 or start at 1. Even results like these, that are viewed as having long-term impact, often had to deal with an ‘embarrassing’ mistake or two along the way. A key part of research is being willing to take risks — being willing to try out something that doesn’t work. For many impactful papers, it’s not that people were right the first time, it’s that they were right the last time,” Weimer added.

Weimer is renowned for his leadership in software verification and other advances in computer systems, including the development of an automated method to find and fix defects in software programs. He employs machine learning and optimization techniques to explore such topics as graphics, security and wireless sensor networks. He has also used medical imaging to investigate the brain activity of software engineers while they program, in order to better understand how programmers learn. He serves as the Chair of the DEI Committee at CSE.