Universiti Teknologi Malaysia Institutional Repository

Linear regression for data having multicollinearity, heteroscedasticity and outliers

Rasheed, Bello AbdulKadiri (2017) Linear regression for data having multicollinearity, heteroscedasticity and outliers. PhD thesis, Universiti Teknologi Malaysia, Faculty of Science.

Official URL: http://dms.library.utm.my:8080/vital/access/manage...

Abstract

The evaluation of a regression model is strongly influenced by the choice of estimation method, since different methods can produce different conclusions from the empirical results. It is therefore important to use an estimation method appropriate to the type of statistical data. Although reliable for a single outlier or a few outliers, standard diagnostic techniques based on a wild bootstrap fit can fail, while the existing robust wild bootstrap based on the MM-estimator is not resistant to high leverage points. The presence of high leverage points introduces multicollinearity, and the MM-estimator is also not resistant to multicollinearity in the data. This research proposes new methods that deal with heteroscedasticity, multicollinearity, outliers and high leverage points more effectively than currently published methods. The proposed methods are called modified robust wild bootstrap, modified robust principal component (PC) with wild bootstrap, and modified robust partial least squares (PLS) with wild bootstrap estimation. These methods are based on weighted procedures that incorporate the generalized M-estimator (GM-estimator), with initial and scale estimates from the S-estimator and MM-estimator. In addition, the multicollinearity diagnostic procedures of PC and PLS were used together with the wild bootstrap sampling procedures of Wu and Liu. Empirical applications to national growth data, income per capita data for the Organisation for Economic Co-operation and Development (OECD) countries, and tobacco data were used to compare the performance of the wild bootstrap, robust wild bootstrap, modified robust wild bootstrap, modified robust PC with wild bootstrap and modified robust PLS with wild bootstrap methods. A comprehensive simulation study evaluates the impact of heteroscedasticity, multicollinearity, outliers and high leverage points on numerous existing methods.
A selection criterion is proposed that identifies the best model by bias and root mean square error for the simulated data and by low standard error for real data. Results for both the real data and the simulation study suggest that the proposed criterion is effective for modified robust wild bootstrap estimation in heteroscedastic data with outliers and high leverage points. On the other hand, modified robust PC with wild bootstrap estimation and modified robust PLS with wild bootstrap estimation are more effective for data with multicollinearity, heteroscedasticity, outliers and high leverage points. Moreover, for both methods, the modified robust sampling procedure of Liu based on the Tukey biweight, with initial and scale estimates from the MM-estimator, tends to be the best. The best method for data with multicollinearity, heteroscedasticity, outliers and high leverage points is modified robust PC with wild bootstrap estimation. This research shows the capability of the computationally intensive method and the viability of combining three different weighting procedures, namely robust GM-estimation, wild bootstrap, and the multicollinearity diagnostic methods of PLS and PC, to achieve an accurate regression model. In conclusion, this study improves parameter estimation in linear regression by enhancing the existing methods to account for multicollinearity, heteroscedasticity, outliers and high leverage points in the data set. This improvement will help analysts choose the best estimation method in order to produce the most accurate regression model.
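
For readers unfamiliar with the wild bootstrap that the abstract builds on, the following is a minimal sketch of the plain (non-robust) version for ordinary least squares under heteroscedasticity. It is not the thesis's modified robust procedure: the function name `wild_bootstrap_se`, the use of Rademacher weights (values of -1 or +1, giving mean zero and unit variance), and the default settings are all assumptions made for illustration. The GM-estimator weighting, the Wu and Liu sampling schemes, and the PC/PLS steps are not reproduced here.

```python
import numpy as np

def wild_bootstrap_se(X, y, n_boot=500, seed=0):
    """Plain wild bootstrap standard errors for OLS coefficients.

    Illustrative sketch only: Rademacher weights are an assumption,
    not the weighting scheme used in the thesis.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Design matrix with an intercept column.
    Xd = np.column_stack([np.ones(n), X])
    # Ordinary least squares fit on the original data.
    beta_hat = np.linalg.lstsq(Xd, y, rcond=None)[0]
    fitted = Xd @ beta_hat
    resid = y - fitted
    boot = np.empty((n_boot, Xd.shape[1]))
    for b in range(n_boot):
        # Multiply each residual by an independent random sign, so the
        # resampled error keeps the variance pattern of observation i.
        v = rng.choice([-1.0, 1.0], size=n)
        y_star = fitted + resid * v
        boot[b] = np.linalg.lstsq(Xd, y_star, rcond=None)[0]
    # Spread of the bootstrap coefficients estimates their standard errors.
    return beta_hat, boot.std(axis=0, ddof=1)
```

Because each residual stays attached to its own observation, the resampled responses preserve the observation-specific error variance, which is why this family of bootstraps is used when the errors are heteroscedastic.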

Item Type: Thesis (PhD)
Additional Information: Thesis (Ph.D (Matematik)) - Universiti Teknologi Malaysia, 2017; Supervisors: Assoc. Prof. Dr. Robiah Adnan, Dr. Seyed Ehsan Saffari
Subjects: Q Science > QA Mathematics
Divisions: Science
ID Code: 84005
Deposited By: Fazli Masari
Deposited On: 31 Oct 2019 10:10
Last Modified: 05 Nov 2019 04:33
