Universiti Teknologi Malaysia Institutional Repository

An ensemble model to improve effort estimation accuracy for software development

Mahmood, Yasir (2022) An ensemble model to improve effort estimation accuracy for software development. PhD thesis, Universiti Teknologi Malaysia, Razak Faculty of Technology & Informatics.


Official URL: http://dms.library.utm.my:8080/vital/access/manage...

Abstract

In recent years, owing to the significant evolution of new technologies and development methodologies in software engineering, there is an increased need for an accurate effort estimation model that can cater for the needs of the continually growing software industry. An accurate effort estimation model is essential in software engineering for effective planning, control, and delivery of quality software projects on time and within budget. In the last few decades, several models and practices for estimating software effort have evolved, but the problem remains essentially unresolved. One of the main reasons for inaccuracy is the ineffective use of estimation models. Moreover, there is no proven software estimation model that can be used consistently across various situations to accurately estimate software effort. In software development, it is difficult to accurately estimate the amount of work required to develop a software system, and the choice of a suitable estimation model is a major concern. Over-estimation may result in a lost bid, while under-estimation may cause the project to fail. Consequently, inaccurate effort estimates can have serious consequences for developers and customers, leading to disappointment and contributing to low-quality projects, team frustration or cost overruns. The main aim of this research is to optimize the accuracy of software development effort estimation using an ensemble technique. In this research, a novel software effort prediction model is proposed that incorporates 1) Use Case Points (UCP), 2) Expert Judgement, and 3) Case-Based Reasoning (CBR) as base models to create an ensemble.
In this model, a feature importance selection technique (Extra Tree Classifier) and the K-Nearest Neighbour machine learning algorithm are applied to identify the most relevant features in the UCP benchmark dataset and to assess project similarity, respectively. Finally, the effort estimates of the individual base models are combined using linear combination methods. This research is conducted through primary case studies (a multi-case study involving software companies and university students’ projects) and secondary case studies to build the ensemble model. To show the accuracy, reliability and applicability of the proposed model, software projects from the primary studies are selected as cases using a quantitative approach based on experiments, industrial experts, archival data about estimates, and evaluation metrics. The results of this research revealed that, in comparison to the UCP, expert judgement, and CBR techniques, the ensemble technique produced 15.9%, 14.6%, and 14.6% Mean Magnitude of Relative Error (MMRE); 20.6%, 14%, and 1% Mean Magnitude of Error Relative (MMER); and 10.94%, 14.53%, and 1.1% PRED(25) accuracy improvements, respectively. The proposed ensemble model can be used by software development firms and practitioners as an instrument to accurately estimate the effort required to develop new software projects at an early stage.
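
As a rough illustration of the final two steps named in the abstract (combining base-model estimates linearly, then scoring the result), the following minimal Python sketch may help. It is not the thesis's implementation: the abstract does not state which linear combination methods or weights were used, so a normalised weighted average is assumed here, and the function names and all effort figures are hypothetical. The metric definitions follow the usual conventions: MMRE divides the absolute error by the actual effort, MMER divides it by the estimated effort, and PRED(25) is the share of projects whose relative error is at most 25%.

```python
from statistics import mean


def linear_combination(estimates, weights=None):
    """Weighted average of base-model estimates (equal weights by default).

    Assumption: a normalised weighted average is only one possible
    "linear combination method"; the abstract does not specify which was used.
    """
    if weights is None:
        weights = [1.0] * len(estimates)
    if len(weights) != len(estimates):
        raise ValueError("one weight is required per base-model estimate")
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)


def mmre(actual, predicted):
    """Mean Magnitude of Relative Error: mean of |actual - predicted| / actual."""
    return mean(abs(a - p) / a for a, p in zip(actual, predicted))


def mmer(actual, predicted):
    """Mean Magnitude of Error Relative: mean of |actual - predicted| / predicted."""
    return mean(abs(a - p) / p for a, p in zip(actual, predicted))


def pred(actual, predicted, level=0.25):
    """PRED(level): fraction of projects whose relative error is <= level."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) / a <= level)
    return hits / len(actual)


if __name__ == "__main__":
    # Hypothetical effort values (person-hours) for three projects;
    # these numbers are made up for illustration and are not from the thesis.
    actual = [100.0, 200.0, 400.0]
    ucp = [120.0, 180.0, 500.0]
    expert = [90.0, 220.0, 380.0]
    cbr = [105.0, 170.0, 440.0]

    # Combine the three base-model estimates project by project.
    ensemble = [linear_combination(row) for row in zip(ucp, expert, cbr)]

    for name, est in [("UCP", ucp), ("Expert", expert), ("CBR", cbr), ("Ensemble", ensemble)]:
        print(f"{name:8s} MMRE={mmre(actual, est):.3f} "
              f"MMER={mmer(actual, est):.3f} PRED(25)={pred(actual, est):.2f}")
```

Lower MMRE and MMER values and a higher PRED(25) value indicate better estimates, so an ensemble is compared against each base model on all three. Passing explicit `weights` to `linear_combination` would let one base model count for more than the others.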

Item Type: Thesis (PhD)
Uncontrolled Keywords: estimation model, software development, under-estimation, over-estimation
Subjects: T Technology > T Technology (General)
Divisions: Razak School of Engineering and Advanced Technology
ID Code: 102435
Deposited By: Yanti Mohd Shah
Deposited On: 28 Aug 2023 06:36
Last Modified: 28 Aug 2023 06:36
