Universiti Teknologi Malaysia Institutional Repository

Clustering chemical data set using particle swarm optimization based algorithm

Triyono, Triyono (2008) Clustering chemical data set using particle swarm optimization based algorithm. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computer Science and Information System.

Abstract

Clustering is the process of organizing similar objects into groups; its main objective is to arrange a collection of data items into meaningful groups. Clustering is generally well suited to very large data sets whose members closely resemble one another, such as chemical databases. Chemical data sets contain a huge number of compounds together with knowledge of their physicochemical properties. The biological activities of these compounds are highly significant in the process of designing and discovering new drugs. Many algorithms, such as Ward’s algorithm, have been applied to cluster chemical data sets. In this study, a Particle Swarm Optimization (PSO) based clustering algorithm is used to optimize the results of other clustering algorithms such as K-means. Two chemical data sets, downloaded from the MDDR (MDL Drug Data Report), were used. The main difference between the two data sets is the degree of similarity in bioactivity among their active compounds. The results are compared with those of Ward’s algorithm in terms of the percentage of active compounds in active clusters. We found that the PSO algorithm performs better than Ward’s algorithm on continuous data; on binary data, however, Ward’s algorithm clearly outperforms it.
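The idea of using PSO to refine a K-means result can be illustrated with a minimal sketch. It is not the thesis implementation: the fitness function (sum of squared distances to the nearest centroid), the seeding of a single particle with a short K-means run, the parameter values `w`, `c1`, `c2`, and the names `kmeans` and `pso_refine` are all assumptions chosen for illustration. Each particle encodes one full set of `k` centroids and is updated with the standard global-best PSO velocity rule.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(centroids, points):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def kmeans(points, k, iters, rng):
    """Plain Lloyd-style K-means, used here only to seed one particle."""
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def pso_refine(points, k, n_particles=10, iters=50,
               w=0.72, c1=1.49, c2=1.49, seed=0):
    """Global-best PSO over centroid sets (assumed parameter values)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Particle 0 starts from a K-means solution; the rest from random points.
    swarm = [kmeans(points, k, 5, rng)]
    swarm += [[list(p) for p in rng.sample(points, k)]
              for _ in range(n_particles - 1)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in part] for part in swarm]
    pbest_fit = [sse(part, points) for part in swarm]
    seed_fit = pbest_fit[0]  # fitness of the K-means seed
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    for _ in range(iters):
        for i, part in enumerate(swarm):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - part[j][d])
                                    + c2 * r2 * (gbest[j][d] - part[j][d]))
                    part[j][d] += vel[i][j][d]
            fit = sse(part, points)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in part], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in part], fit
    return gbest, gbest_fit, seed_fit
```

Calling `pso_refine(points, k)` on a list of equal-length numeric tuples returns the best centroid set found, its fitness, and the fitness of the K-means seed. Because the global best is only ever replaced by a strictly better solution, the returned fitness can never be worse than the seed's under this particular objective.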

Item Type: Thesis (Masters)
Additional Information: Thesis (Sarjana Sains (Komputer Sains)) - Universiti Teknologi Malaysia, 2008; Supervisor: Assoc. Prof. Dr. Naomie binti Salim
Uncontrolled Keywords: clustering, collection of data items, chemical database, physiochemical properties
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computer Science and Information System (Formerly known)
ID Code: 9867
Deposited By: Ms Zalinda Shuratman
Deposited On: 18 May 2010 09:42
Last Modified: 11 Sep 2012 09:14
