Universiti Teknologi Malaysia Institutional Repository

Hybrid PCA-ILGC clustering approach for high dimensional data

Musdholifah, Aina and Mohd. Hashim, Siti Zaiton and Ngah, Razali (2012) Hybrid PCA-ILGC clustering approach for high dimensional data. In: 2012 IEEE International Conference on Systems, Man & Cybernetics, 14-17 Oct, 2012, Seoul, South Korea.

Full text not available from this repository.

Official URL: https://ieeexplore.ieee.org/document/6377760

Abstract

The rapid growth of high-dimensional datasets has rendered conventional approaches insufficient for extracting hidden, useful information. As a result, researchers are challenged to develop new techniques for massive high-dimensional data, which is large not only in the number of records but also in the number of attributes. To improve the effectiveness and accuracy of mining tasks on high-dimensional data, an efficient dimensionality reduction method should be applied in the preprocessing stage, before clustering. Many clustering algorithms have been proposed and used to discover useful information from a dataset. Iterative Local Gaussian Clustering (ILGC) is a simple density-based clustering technique that has successfully discovered the number of clusters represented in a dataset. In this paper, we propose using Principal Component Analysis (PCA) to preprocess the data prior to ILGC clustering in order to simplify the analysis and visualization of multidimensional datasets. The proposed approach is validated on benchmark classification datasets. In addition, the performance of the proposed hybrid PCA-ILGC clustering approach is compared with that of the original ILGC, basic k-means, and hybridized k-means. The experimental results indicate that the proposed approach obtains clusters with higher accuracy and reduces processing time.
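
The preprocessing pipeline described in the abstract (reduce dimensionality with PCA, then cluster the reduced data) can be illustrated with a minimal sketch. The abstract does not describe the steps of ILGC, so ILGC is not implemented here; a basic k-means, which the abstract names as a comparison baseline, stands in as the clusterer. The function names `pca_reduce` and `kmeans`, the synthetic two-blob dataset, and all parameter values are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

def kmeans(X, k, n_iter=50, seed=0):
    """Basic Lloyd's k-means; a stand-in clusterer, NOT the paper's ILGC."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Hypothetical synthetic data: two well-separated blobs in 20 dimensions.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 20))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(50, 20))
X = np.vstack([blob_a, blob_b])

X_reduced = pca_reduce(X, n_components=2)  # 20 features -> 2 components
labels = kmeans(X_reduced, k=2)            # cluster in the reduced space
print(X_reduced.shape)
```

In this sketch the clusterer only ever sees the two-dimensional projection, which reflects the idea behind the reported reduction in processing time; the accuracy and timing figures in the paper itself are not reproduced here.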

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: dimensionality reduction, iterative local Gaussian clustering algorithm, principal component analysis
Divisions: Computer Science and Information System
ID Code: 34105
Deposited By: Liza Porijo
Deposited On: 17 Aug 2017 03:35
Last Modified: 09 Oct 2018 06:59
