Abstract:
A software fault is a hidden functional defect in a system or program that impairs its normal operation. Worldwide, the development of software-based systems and the number of software companies have grown in recent years, driven by advances in technology and its benefits. Ethiopia, however, has largely remained outside this growth and still has only a few software companies. Some of the ongoing projects are also too complex to be managed with traditional practices such as code review and testing alone. Manual assessment of software development activities becomes more challenging as the size and complexity of a project increase and as more software faults occur. For this reason, a machine learning-based software fault prediction model, which predicts fault-proneness from software metrics such as code size and complexity, offers an alternative. Various researchers have built software fault prediction models using single classifiers and ensemble learning methods with different product metrics. However, few studies have combined class-level metrics with stacking techniques. The main objective of this research was to build a software fault prediction model using a machine learning approach. To build the model, the machine learning algorithms J48, Support Vector Machine (SVM), Naive Bayes, and Random Forest were used, along with a combination of single classifiers built with the stacking method. The experiments were performed on the PROP project V4 dataset from the PROMISE database, using object-oriented metrics. In terms of accuracy, the single-classifier models achieved 96.20% (J48), 85.11% (Naive Bayes), 86.20% (SVM), and 96.83% (Random Forest), while the proposed stacking technique, which combines Random Forest and J48, achieved 97.59%. This indicates that the proposed stacking technique outperforms each of the single classifiers.
The ability to predict software faults is important for minimizing cost and improving the overall effectiveness of the testing process. The experimental results show that the proposed combined method achieves higher accuracy than any single-classifier method.
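The stacking idea described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: it assumes scikit-learn rather than the tooling used in the experiments, uses a synthetic dataset in place of the PROP V4 data, substitutes scikit-learn's CART-based `DecisionTreeClassifier` (with the entropy criterion) for J48, and assumes logistic regression as the meta-learner, which the abstract does not specify. The reported accuracies should not be expected from this code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a table of class-level metrics (NOT the PROP V4 data).
# Label 1 plays the role of "faulty", label 0 of "non-faulty".
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0,
)

# Base learners: Random Forest plus a decision tree standing in for J48.
# (Assumption: scikit-learn implements CART, not C4.5/J48.)
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("tree", DecisionTreeClassifier(criterion="entropy", random_state=0)),
]

# Meta-learner trained on cross-validated predictions of the base learners.
# (Assumption: logistic regression; the abstract does not name the meta-learner.)
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

stack_accuracy = accuracy_score(y_test, stack.predict(X_test))
print(f"Stacked model accuracy on held-out data: {stack_accuracy:.4f}")
```

Stacking trains the meta-learner on out-of-fold predictions of the base learners (here via `cv=5`), so the combiner learns how much to trust each base model without seeing predictions made on the base models' own training data.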