Software Fault Prediction Using an Optimised Feature Selection Process Based on a Genetic Algorithm



DOI: https://doi.org/10.15866/irea.v11i5.23539

Abstract


Software Fault Prediction (SFP) is crucial for identifying software faults pre-emptively. This research addresses three challenges in SFP: class imbalance, the identification of significant metrics, and feature selection. To counter class imbalance, the Random Over-sampling method is employed, which improves the model's predictive capacity for both faulty and non-faulty instances. Software metric categories such as size, cohesion, complexity, coupling, and documentation are examined to identify the most influential metrics. Feature selection is optimised using a Modified Genetic Algorithm (GA), which reduces dimensionality while maintaining efficiency. Experiments on a diverse open-source dataset demonstrate that this approach substantially improves prediction accuracy compared to traditional methods. By handling class imbalance, recognising significant metrics, and implementing effective feature selection, the proposed framework enables practitioners to build more accurate and robust fault prediction models, enhancing software quality and reliability.
Copyright © 2023 Praise Worthy Prize - All rights reserved.
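
As a rough illustration of the two techniques named in the abstract, the sketch below balances a toy dataset by random over-sampling and then searches for a feature subset with a plain genetic algorithm. It is not the authors' Modified GA: the nearest-centroid fitness function, the per-feature penalty, the GA parameters, and the synthetic data are all assumptions made for this example only.

```python
import random

def random_oversample(X, y, rng):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out

def centroid_accuracy(X, y, mask):
    """Training accuracy of a nearest-centroid classifier on the selected features.
    This is a stand-in fitness measure (assumption), not the classifier used in the paper."""
    cols = [i for i, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for j, c in enumerate(cols):
            acc[j] += row[c]
        counts[label] = counts.get(label, 0) + 1
    centroids = {l: [s / counts[l] for s in sums[l]] for l in sums}

    def predict(row):
        return min(centroids, key=lambda l: sum(
            (row[c] - centroids[l][j]) ** 2 for j, c in enumerate(cols)))

    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def ga_select(X, y, rng, pop_size=20, generations=30, mutation_rate=0.1, penalty=0.01):
    """Plain GA over binary feature masks: elitism, one-point crossover, bit-flip mutation."""
    n = len(X[0])

    def fitness(mask):
        # Reward accuracy, lightly penalise every selected feature (hypothetical weighting).
        return centroid_accuracy(X, y, mask) - penalty * sum(mask)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)

# Synthetic, imbalanced toy data: 40 non-faulty (0) and 8 faulty (1) modules.
# Only feature 0 carries signal; features 1-3 are noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(3)] for _ in range(40)]
X += [[rng.gauss(5, 1)] + [rng.gauss(0, 1) for _ in range(3)] for _ in range(8)]
y = [0] * 40 + [1] * 8

X_bal, y_bal = random_oversample(X, y, rng)
mask = ga_select(X_bal, y_bal, rng)
print("class counts after over-sampling:", y_bal.count(0), y_bal.count(1))
print("selected feature mask:", mask)
```

In a real study the over-sampling would be applied to the training split only, and the fitness would normally be a cross-validated score of the actual classifier rather than training accuracy.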

Keywords


Fault Prediction Model; Feature Selection; Machine Learning; Genetic Algorithm; Bayesian Optimisation; Random Forest


