Universiti Teknologi Malaysia Institutional Repository

A comparative analysis of missing data imputation techniques on sedimentation data

Loh, Wing Son and Ling, Lloyd and Chin, Ren Jie and Lai, Sai Hin and Loo, Kar Kuan and Seah, Choon Sen (2024) A comparative analysis of missing data imputation techniques on sedimentation data. Ain Shams Engineering Journal, 15 (6). pp. 1-20. ISSN 2090-4479

Official URL: http://dx.doi.org/10.1016/j.asej.2024.102717

Abstract

Sediment data pertain to various hydrological variables with complex sediment hydrodynamics, such as sedimentation rates, which are often incompletely recorded. Complete sedimentation data are therefore essential for data accessibility. A comparative analysis of missing fine sediment data imputation performance was made across four techniques, namely k-Nearest Neighbourhood (k-NN), Support Vector Regression (SVR), Multiple Regression (MR), and Artificial Neural Network (ANN), under the single imputation (SI) and multiple imputation (MI) regimes. Across the missing data proportions tested (10%-50%), the ANN gave the best results, with consistent performance metrics under both the SI and MI regimes. For the highest missing data proportion (50%), the ANN presented the best imputation performance, with a reported root mean squared error (RMSE) of 0.000882, mean absolute error (MAE) of 0.000595, coefficient of determination (R²) of 71%, and Kling-Gupta Efficiency (KGE) of 72%. The imputation performance ranking, from best to worst, is: ANN, SVR, MR, and k-NN.
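The four performance metrics named in the abstract can be illustrated with a minimal sketch. The formulas below are the commonly used definitions (R² as one minus the ratio of residual to total sum of squares, and KGE in its original correlation, variability ratio, and bias ratio form); the paper may use variants, so treat these as assumptions rather than the authors' exact implementation. The function names and the sample values are hypothetical.

```python
import math

def rmse(obs, sim):
    """Root mean squared error between observed and imputed values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error between observed and imputed values."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def r_squared(obs, sim):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - ss_res / ss_tot

def kge(obs, sim):
    """Kling-Gupta Efficiency, assumed in its original three-component form."""
    n = len(obs)
    mean_obs, mean_sim = sum(obs) / n, sum(sim) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Hypothetical values: true entries that were masked, and their imputed estimates.
observed = [1.0, 2.0, 3.0]
imputed = [2.0, 3.0, 4.0]
print(rmse(observed, imputed), mae(observed, imputed))
print(r_squared(observed, imputed), kge(observed, imputed))
```

Lower RMSE and MAE indicate smaller imputation errors, while R² and KGE approach 1 (100%) as the imputed values match the held-out observations more closely.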

Item Type: Article
Uncontrolled Keywords: artificial neural network (ANN), Fine sediment, Imputation techniques, Missing data, Sedimentation
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 109035
Deposited By: Widya Wahid
Deposited On: 28 Jan 2025 04:13
Last Modified: 28 Jan 2025 04:13
