Universiti Teknologi Malaysia Institutional Repository

Feasibility study of fuzzy clustering techniques in chemical database for compound classification

Dollah @ Md. Zain, Rozilawati and Bakri, Aryati and Bahari, Mahadi and Salim, Naomie (2006) Feasibility study of fuzzy clustering techniques in chemical database for compound classification. Project Report. Faculty of Computer Science and Information System, Skudai Johor. (Unpublished)


Compound selection methods are important in drug discovery, especially in the lead identification process. Because the number of chemical compounds to be screened keeps increasing, the pharmaceutical industry needs to find the best method for compound selection. One of the most effective and widely used approaches is cluster-based selection, in which a compound dataset is grouped into clusters and representative compounds are selected from each cluster. Non-overlapping methods have been widely used, and one of them, Ward’s clustering method, is widely accepted as the most efficient clustering method for compound selection. However, little attention has been given to overlapping methods in compound selection, or in the lead identification process more generally. This research focuses on fuzzy c-means clustering: the effectiveness of the clusters it produces for compound selection is analyzed and compared with that of other conventional cluster-based compound selection methods. Fuzzy c-means was chosen because it produces clusters by identifying the cluster centroids and each compound's degree of membership in every cluster, so a compound may belong to more than one cluster. The fuzzy c-means results are compared with those of Ward’s clustering method and with those obtained by fuzzifying Ward’s clusters. The analysis shows that fuzzy c-means clustering gives the best result in terms of intermolecular dissimilarity but separates active from inactive structures poorly.
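
To illustrate how fuzzy c-means assigns each item a degree of membership in every cluster, the following is a minimal sketch in plain Python. It is not the implementation used in the report: the fuzzifier m = 2, the squared Euclidean distance, the random initialisation, the stopping rule, and the toy two-dimensional points (standing in for compound descriptor vectors) are all assumptions made for illustration only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Return (centroids, memberships) for a textbook fuzzy c-means run.

    memberships[i][j] is the degree to which points[i] belongs to cluster j;
    every row sums to 1. All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Start from random memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centroids = []
    for _ in range(max_iter):
        # Centroid update: mean of the points weighted by membership ** m.
        centroids = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(n)]
            wsum = sum(weights)
            centroids.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / wsum
                 for d in range(dim)]
            )

        # Membership update: inversely related to distance from each centroid.
        new_u = []
        for p in points:
            d2 = [squared_distance(p, c) for c in centroids]
            if min(d2) == 0.0:
                # The point coincides with a centroid: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in d2]
                total = sum(hits)
                new_u.append([h / total for h in hits])
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                total = sum(inv)
                new_u.append([v / total for v in inv])

        # Stop once no membership changes by more than tol.
        shift = max(abs(a - b)
                    for old, new in zip(u, new_u)
                    for a, b in zip(old, new))
        u = new_u
        if shift < tol:
            break

    return centroids, u


# Hypothetical toy data: two well-separated groups of 2-D points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centroids, toy_memberships = fuzzy_c_means(toy_points, n_clusters=2)
for point, row in zip(toy_points, toy_memberships):
    print(point, [round(v, 3) for v in row])
```

Each printed row holds one membership value per cluster. Unlike a non-overlapping method such as Ward’s, no point is forced into a single cluster, so a selection step could, for example, keep every compound whose membership in a cluster exceeds a chosen threshold (that threshold is likewise an assumption, not something stated in the abstract).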

Item Type: Monograph (Project Report)
Uncontrolled Keywords: Fuzzy clustering techniques, chemical database, compound classification
Subjects: Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4050 Electronic information resources
Divisions: Computer Science and Information System
ID Code: 4402
Deposited By: Azrin Ariffin
Deposited On: 25 Jun 2008 03:07
Last Modified: 01 Jun 2010 03:17
