Universiti Teknologi Malaysia Institutional Repository

Missing data characteristics and the choice of imputation technique: an empirical study

Alade, Oyekale Abel and Sallehuddin, Roselina and Mohamed Radzi, Nor Haizan and Selamat, Ali (2020) Missing data characteristics and the choice of imputation technique: an empirical study. Advances in Intelligent Systems and Computing, 1073. pp. 88-97. ISSN 2194-5357

Full text not available from this repository.

Official URL: http://dx.doi.org/10.1007/978-3-030-33582-3_9

Abstract

One important characteristic of good data is completeness. Missing data is a major problem in the classification of medical datasets: it leads to incorrect classification of patients, which endangers their health management. Many imputation techniques have been employed to solve this problem, but they are applied without regard to the characteristics that cause the missingness. In this paper, we investigated the causes of missing data in a medical dataset and proposed a multiple imputation technique to solve the problem of missing data. A 5-fold-iteration multiple imputation was employed. All of the missing values in the dataset were regenerated (100%). The imputed datasets were validated using an extreme learning machine (ELM) classifier. The results show an improvement in accuracy on the imputed datasets. The work can, however, be extended to compare the accuracy of the imputed datasets across different classifiers.
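
The general idea of multiple imputation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the imputation model, so the sketch assumes a simple random draw from the observed values of each column (a hot-deck draw), and it reads "5-fold-iteration" as producing five imputed datasets, which is also an assumption. The function names, the toy records, and the pooling step are hypothetical, and the ELM validation step is not shown.

```python
import random
from statistics import mean


def multiple_impute(rows, m=5, seed=0):
    """Return m completed copies of rows.

    Each None is replaced by a random draw from the observed values
    of the same column (assumed imputation model, not the paper's).
    Every column must contain at least one observed value.
    """
    rng = random.Random(seed)
    n_cols = len(rows[0])
    # Observed (non-missing) values per column act as the donor pool.
    donors = [[r[j] for r in rows if r[j] is not None] for j in range(n_cols)]
    completed = []
    for _ in range(m):
        filled = [
            [v if v is not None else rng.choice(donors[j]) for j, v in enumerate(r)]
            for r in rows
        ]
        completed.append(filled)
    return completed


def pooled_column_means(datasets):
    """Average each column's mean across the imputed datasets."""
    n_cols = len(datasets[0][0])
    return [mean(mean(r[j] for r in d) for d in datasets) for j in range(n_cols)]


# Hypothetical toy records (age, systolic blood pressure); None marks a missing value.
records = [
    [63, 145], [37, None], [41, 130], [None, 120], [57, None], [45, 138],
]
imputed = multiple_impute(records, m=5, seed=42)
pooled = pooled_column_means(imputed)
```

Each of the five completed datasets could then be passed to a classifier and the resulting accuracies compared or averaged. Drawing several imputations rather than one lets the variation between them reflect the uncertainty about the missing values.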

Item Type: Article
Uncontrolled Keywords: mechanism of missingness, missing data, multiple imputations
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 89957
Deposited By: Yanti Mohd Shah
Deposited On: 31 Mar 2021 06:31
Last Modified: 31 Mar 2021 06:31
