Universiti Teknologi Malaysia Institutional Repository

New approaches in estimating linear regression model parameters in the presence of multicollinearity and outliers

Al-Mash, Mohammad Sabry Abo (2017) New approaches in estimating linear regression model parameters in the presence of multicollinearity and outliers. Masters thesis, Universiti Teknologi Malaysia, Faculty of Science.

Official URL: http://dms.library.utm.my:8080/vital/access/manage...

Abstract

In multiple linear regression, ordinary least squares (OLS) has been the most popular technique for estimating model parameters because of its optimal properties and ease of calculation. The OLS estimator may fail, however, when the assumption of independence among the explanatory variables is violated, that is, when the explanatory variables are correlated with one another. In this situation the data are said to exhibit multicollinearity, which can make the inferential statistics misleading. The problem becomes more complicated when the data also contain abnormal observations, known as outliers, which pose a serious threat to models with multicollinearity. This research puts forward new procedures for improving parameter estimation in the presence of both multicollinearity and outliers. Principal Component Regression (PCR) and Ridge Regression (RR) are not, on their own, resistant to outliers: the results show that although PCR and RR perform well on models with multicollinearity, they may fail when outliers are present. The motivation for this research is to find procedures with a high breakdown point for estimating regression models when the data exhibit both multicollinearity and outliers. The proposed methods are Principal Component Regression with Least Trimmed Squares (LTS) based on Tukey bisquare weights (RWPCLTS) and Principal Component Regression with Least Median of Squares (LMS) based on Tukey bisquare weights (RWPCLMS). Cigarette data recording the weight, tar, nicotine, and carbon monoxide content of different brands of domestic cigarettes were used to compare RWPCLTS and RWPCLMS with the existing PCR and RR methods. A comprehensive simulation study evaluates the impact of multicollinearity and outliers on the proposed and existing methods.
The percentages of outliers considered in the simulation are 0%, 5%, 10%, 15% and 20%. A selection criterion is proposed that favours the model with the lowest bias and root mean square error for simulated data and the lowest standard error for real data. Results for both the real data and the simulation study suggest that the proposed criterion is effective for RWPCLTS and RWPCLMS under multicollinearity and outliers. Of the two proposed methods, RWPCLTS tends to perform best, followed by RWPCLMS, when multicollinearity and outliers are present. This research demonstrates the capability of these computationally intensive methods and the viability of combining weighting procedures, namely robust LTS or LMS estimation, with the multicollinearity diagnostic method of principal components (PC) to obtain an accurate regression model. In conclusion, the proposed methods improve the parameter estimation of linear regression by extending the existing methods to handle multicollinearity and outliers in the data set. This improvement will help analysts choose the estimation method that produces the most accurate regression model in the presence of multicollinearity and outliers.
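The abstract does not spell out the estimation steps, so the following is only a minimal sketch of the general idea behind an LTS-based weighted PCR, not the thesis's actual algorithm. It assumes an approximate LTS fit obtained from random starts and concentration steps, Tukey bisquare weights with the common tuning constant 4.685 and a MAD-based scale, and a weighted PCR that keeps two components. The function names (`tukey_bisquare_weights`, `lts_fit`, `weighted_pcr`) and the simulated data are hypothetical and chosen for illustration only.

```python
import numpy as np


def tukey_bisquare_weights(residuals, c=4.685):
    """Tukey bisquare weights from residuals scaled by a MAD-based sigma."""
    r = np.asarray(residuals, dtype=float)
    mad = np.median(np.abs(r - np.median(r)))
    sigma = mad / 0.6745 if mad > 0 else 1.0  # fallback of 1.0 is an assumption
    u = r / (c * sigma)
    return np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 2, 0.0)


def lts_fit(X, y, n_starts=50, n_csteps=20, seed=0):
    """Approximate LTS: random elemental starts refined by concentration steps."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept
    h = (n + p + 2) // 2                  # number of observations retained
    best_obj, best_coef = np.inf, None
    for _ in range(n_starts):
        idx = rng.choice(n, size=p + 1, replace=False)
        coef = np.linalg.lstsq(A[idx], y[idx], rcond=None)[0]
        for _ in range(n_csteps):
            # Refit on the h observations with the smallest squared residuals.
            keep = np.argsort((y - A @ coef) ** 2)[:h]
            coef = np.linalg.lstsq(A[keep], y[keep], rcond=None)[0]
        obj = np.sort((y - A @ coef) ** 2)[:h].sum()
        if obj < best_obj:
            best_obj, best_coef = obj, coef
    return best_coef  # [intercept, slopes...]


def weighted_pcr(X, y, w, n_components):
    """Principal component regression with observation weights w."""
    x_mean = np.average(X, axis=0, weights=w)
    y_mean = np.average(y, weights=w)
    sw = np.sqrt(w)
    Xw = (X - x_mean) * sw[:, None]
    yw = (y - y_mean) * sw
    _, _, Vt = np.linalg.svd(Xw, full_matrices=False)
    V = Vt[:n_components].T               # loadings of the retained components
    gamma = np.linalg.lstsq(Xw @ V, yw, rcond=None)[0]
    beta = V @ gamma                      # coefficients on the original predictors
    return y_mean - x_mean @ beta, beta


# Simulated data (illustrative only): x2 is nearly collinear with x1,
# and 10% of the responses are contaminated.
rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + X @ np.ones(3) + 0.1 * rng.normal(size=n)
outliers = np.argsort(x3)[-10:]
y[outliers] += 30.0

coef0 = lts_fit(X, y)
resid = y - (coef0[0] + X @ coef0[1:])
w = tukey_bisquare_weights(resid)
b0_robust, beta_robust = weighted_pcr(X, y, w, n_components=2)
b0_plain, beta_plain = weighted_pcr(X, y, np.ones(n), n_components=2)
```

Comparing `beta_robust` with `beta_plain` shows how down-weighting the contaminated observations changes the PCR coefficients on this simulated data. An LMS-based variant could, under the same assumptions, replace the trimmed sum of squared residuals in `lts_fit` with the median of the squared residuals.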

Item Type: Thesis (Masters)
Additional Information: Thesis (Sarjana Sains (Matematik)) - Universiti Teknologi Malaysia, 2017; Supervisor: Prof. Dr. Robiah Adnan
Uncontrolled Keywords: ordinary least squares (OLS), multicollinearity
Subjects: Q Science > QA Mathematics
Divisions: Science
ID Code: 78208
Deposited By: Widya Wahid
Deposited On: 28 Jul 2018 06:26
Last Modified: 28 Jul 2018 06:26
