Universiti Teknologi Malaysia Institutional Repository

Optimized subtractive clustering for cluster-based compound selection

Kuik, Sok Ping (2006) Optimized subtractive clustering for cluster-based compound selection. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computer Science and Information System.


Abstract

Compound selection is important in drug discovery, especially in the lead identification process. Finding the best compound selection method has become a necessity in pharmaceutical chemistry because of the increasing number of chemical compounds to be screened. One of the best and most widely used approaches is cluster-based selection, in which the compound dataset is grouped into clusters and representative compounds are selected from each cluster. Among fuzzy clustering methods, fuzzy c-means with the Euclidean distance measure is the better choice for compound selection. Fuzzy c-means clustering gives the best result for intermolecular dissimilarity; however, it separates active and inactive structures poorly. This research focused on subtractive clustering: the effectiveness of the clusters it produces for compound selection is analyzed and compared with other conventional cluster-based compound selection methods. Subtractive clustering was chosen because it treats each data point as a potential cluster center and defines a measure of the potential of each data point, and because it resolves the problem of how many clusters to use for the data. Subtractive clustering produces the number of clusters automatically, together with the values of the cluster radius and squash factor. The results from subtractive clustering are compared with those of fuzzy c-means and K-means. The analysis shows that subtractive clustering gives the worst separation of active/inactive structures of the three methods, behind fuzzy c-means and K-means. K-means produced the highest proportion of active structures in this research. For subtractive clustering, good values of the squash factor lie between 0.375 and 0.45 and of the cluster radius between 0.35 and 0.45, because these values consistently yield the highest proportion of active structures.
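The abstract does not reproduce the algorithm itself, so the following is only a minimal sketch of subtractive clustering as it is commonly formulated, not the implementation used in the thesis. It illustrates how every data point is scored as a candidate center and how the number of clusters falls out of the procedure. The function name, the parameter names (`radius`, `squash`, `stop_ratio`), the default values, the use of a single stopping threshold, and the reading of the squash factor as a multiplier on the radius are all assumptions made for illustration.

```python
import numpy as np


def subtractive_clustering(X, radius=0.4, squash=0.4, stop_ratio=0.15):
    """Return indices of data points chosen as cluster centers.

    Hypothetical sketch: `radius` is the neighbourhood radius in
    normalized space, `squash` scales it for the subtraction step, and
    `stop_ratio` is a simplified single stopping threshold.
    """
    X = np.asarray(X, dtype=float)

    # Scale each feature to [0, 1] so one radius applies to all of them.
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    Z = (X - lo) / span

    # Pairwise squared Euclidean distances between all points.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)

    alpha = 4.0 / radius ** 2
    beta = 4.0 / (squash * radius) ** 2

    # Potential of each point: high when many neighbours are close by.
    potential = np.exp(-alpha * d2).sum(axis=1)
    first_peak = potential.max()

    centers = []
    while len(centers) < len(Z):
        idx = int(np.argmax(potential))
        peak = potential[idx]
        # Stop once the best remaining candidate is weak relative to
        # the first center (assumed simplification of the stop rule).
        if peak < stop_ratio * first_peak:
            break
        centers.append(idx)
        # Subtract the new center's influence from nearby points.
        potential = potential - peak * np.exp(-beta * d2[idx])
    return centers
```

The length of the returned list is the number of clusters; it is never passed in, which is the property the abstract highlights. In a cluster-based selection setting, the returned indices could serve directly as the representative compounds, although the abstract does not state how representatives were chosen.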

Item Type: Thesis (Masters)
Additional Information: Thesis (Master of Science (Computer Science)) - Universiti Teknologi Malaysia, 2006; Supervisor: Assoc. Prof. Dr. Naomie Binti Salim
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computer Science and Information System
ID Code: 4896
Deposited By: Widya Wahid
Deposited On: 29 Feb 2008 05:02
Last Modified: 28 Feb 2018 06:49
