Result: Software Defect Prediction Models. A Replication and Extension Study.

Title:
Software Defect Prediction Models. A Replication and Extension Study.
Authors:
Camelia-Petrina Nadejde (AUTHOR), Andreea Vescan (AUTHOR) andreea.vescan@ubbcluj.ro
Source:
Procedia Computer Science. 2025, Vol. 270, p1936-1945. 10p.
Database:
Supplemental Index

More Information

Predicting software defects is crucial for ensuring reliable systems and is a critical concern for the industry. Automating the defect detection process in software programs can significantly reduce errors, development time, and costs. This paper has two primary aims: to generalize the findings of a previous study through replication and to provide new insights through an extension study using a different prediction model. For the replication study, three classifiers are used: Naive Bayes (NB), Decision Tree (DT), and Random Forest (RF). All models in this research were evaluated before and after applying the Hybrid Feature Selection (HFS) method. The extension study incorporates Support Vector Machines (SVM), which allowed a more comprehensive understanding of the impact of HFS and oversampling on classifier performance. The replicated results confirmed the original findings: the Random Forest algorithm consistently achieved the highest accuracy across datasets. In the extension study, SVM demonstrated superior performance in detecting minority class defects, as reflected in its higher Matthews Correlation Coefficient (MCC) scores, particularly for the PC3 dataset. This research establishes a foundation for the use of SVM along with other classifiers to address challenges in software defect prediction. [ABSTRACT FROM AUTHOR]
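The abstract reports accuracy for the replication and MCC for minority-class detection. The sketch below is not the authors' code; it only illustrates, on made-up labels, why the two metrics can disagree when defective modules are rare. The function names, the 100-module sample, and both hypothetical classifiers are assumptions introduced for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = defective, 0 = clean)."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 0:
            tn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of modules classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical, illustrative data: 100 modules, 10 of them defective.
y_true = [1] * 10 + [0] * 90

# Hypothetical classifier A: labels every module as clean.
always_clean = [0] * 100

# Hypothetical classifier B: finds 8 of 10 defects, with 10 false alarms.
finds_defects = [1] * 8 + [0] * 2 + [1] * 10 + [0] * 80

for name, y_pred in [("always_clean", always_clean),
                     ("finds_defects", finds_defects)]:
    counts = confusion_counts(y_true, y_pred)
    print(name, "accuracy:", round(accuracy(*counts), 3),
          "MCC:", round(mcc(*counts), 3))
```

In this toy setup the classifier that never flags a defect has the higher accuracy but an MCC of zero, which is one reason MCC is often preferred for imbalanced defect datasets.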