Universiti Teknologi Malaysia Institutional Repository

Detection of multiple outliners in linear regression using nonparametric methods

Adnan, Robiah (2004) Detection of multiple outliners in linear regression using nonparametric methods. Project Report. Universiti Teknologi Malaysia. (Unpublished)

Abstract

There has been considerable interest in recent years in the detection and accommodation of multiple outliers in linear regression. However, most of the proposed methods are complicated and unappealing to users with no mathematical background. The clustering algorithm of Sebert et al. (1998) is discussed and used here because it is easy to understand, takes an interesting approach, and performs well in detecting the presence of outliers. In general, the method proposed by Sebert et al. (1998) uses the single linkage clustering algorithm with Euclidean distances to cluster the points in the plot of standardized predicted values versus residual values from a linear regression model. The predicted and residual values are obtained from an ordinary least squares fit of the data. The algorithm is described and is shown to perform well on classic multiple-outlier data sets. Sebert’s method is then modified by replacing the least squares (LS) fit with two robust estimators. Method 1 replaces the LS fit with the least median of squares (LMS) fit, while Method 2 replaces the LS fit with the least trimmed squares (LTS) fit. This research also compares the three procedures for detecting multiple outliers. A Monte Carlo simulation study was used to evaluate the effectiveness of the three procedures. All simulations and calculations were done using the statistical package S-PLUS 2000.
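The clustering idea described in the abstract can be illustrated with a minimal sketch. This is not the report's S-PLUS code: it is a pure-Python illustration restricted to a single predictor, and the function names (`ols_fit`, `standardize`, `single_linkage_clusters`, `flag_outliers`) are hypothetical. Two points are assumptions rather than statements from the abstract: the cut height is passed in as a fixed parameter, and observations outside the largest cluster are treated as candidate outliers. The LMS and LTS variants would swap a robust fit in for `ols_fit`; they are not shown.

```python
import math


def ols_fit(x, y):
    """Closed-form least squares fit of y = a + b*x; returns (predicted, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    pred = [a + b * xi for xi in x]
    resid = [yi - pi for yi, pi in zip(y, pred)]
    return pred, resid


def standardize(v):
    """Centre on the mean and scale by the sample standard deviation."""
    n = len(v)
    m = sum(v) / n
    s = math.sqrt(sum((vi - m) ** 2 for vi in v) / (n - 1))
    return [(vi - m) / s for vi in v]


def single_linkage_clusters(points, cut_height):
    """Clusters obtained by cutting a single linkage tree at cut_height.

    Cutting a single linkage tree at height h is equivalent to taking the
    connected components of the graph that joins every pair of points whose
    Euclidean distance is at most h, so a union-find structure is enough.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= cut_height:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def flag_outliers(x, y, cut_height=1.0):
    """Indices of observations that fall outside the largest cluster.

    Assumption: the largest cluster is treated as the clean subset and
    cut_height is chosen by the user.
    """
    pred, resid = ols_fit(x, y)
    points = list(zip(standardize(pred), standardize(resid)))
    clusters = single_linkage_clusters(points, cut_height)
    largest = max(clusters, key=len)
    return sorted(i for c in clusters if c is not largest for i in c)


# Toy data: points on y = 2x + 1, with the fifth observation shifted up by 20.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3, 5, 7, 9, 31, 13, 15, 17, 19, 21]
print(flag_outliers(x, y, cut_height=1.0))  # → [4]
```

In this toy example the shifted observation sits far from the rest in the standardized predicted-versus-residual plane, so it forms its own cluster. The result depends on the chosen cut height, which is why it is exposed as a parameter here.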

Item Type: Monograph (Project Report)
Uncontrolled Keywords: Detection of Multiple Outliners; Linear Regression; Nonparametric Methods
Subjects: Q Science > QA Mathematics
Divisions: Science
ID Code: 2997
Deposited By: Nor Azlin Nordin
Deposited On: 22 May 2007 07:08
Last Modified: 01 Aug 2017 01:06
