Reformulating Queries for Duplicate Bug Report Detection

Oscar Chaparro, Juan Manuel Florez, Unnati Singh, Andrian Marcus
The University of Texas at Dallas, Richardson, TX, USA

This web page contains the replication package of our SANER'19 paper. Each attached ZIP file below includes a README file that provides more information about its contents.

Data set

ZIP file that contains the complete data set of bug reports used in this research, including coded bug reports. It also contains the duplicate detection data, namely, initial and reduced queries, bug corpora, and duplicate reports.

Tools and Preprocessing

ZIP file that contains the tools used in this research as well as the list of stop words used to preprocess the source code documents and queries.


Results

ZIP file that contains the detailed results of our empirical study.