Universiti Teknologi Malaysia Institutional Repository

On the significance of topological-indices based non-binary molecular similarity measures

Salim, Naomie and Holliday, John and Willett, Peter (2004) On the significance of topological-indices based non-binary molecular similarity measures. Sains Malaysiana, 33 (2). pp. 157-172. ISSN 0126-6039

Full text not available from this repository.

Official URL: http://www.ukm.my/jsm/english_journals/vol33num2_2...

Abstract

This paper describes experiments to study how well the whole range of topological-indices-based non-binary similarity values represents the physicochemical similarities between compounds. Measured log P values have been compared with the log P values predicted from compounds at different ranges of similarity, calculated from various topological indices of the compounds. The analysis shows that the non-binary Cosine, Simpson and Pearson coefficients might give misleading results when certain compounds are compared. Similarity values involving the 1% most similar compounds, based on the non-binary Tanimoto or Euclidean coefficients, have been found to represent the physicochemical similarities between the molecules compared. Therefore, for searches requiring around the 1% most similar compounds, rational selection methods based on the non-binary Tanimoto or Euclidean coefficients are likely to produce better results than random selection. Similarity values involving the 5% most dissimilar compounds, based on the non-binary Tanimoto coefficient, have also been found to represent the physicochemical dissimilarities between the molecules compared. Therefore, for diverse selection requiring less than the 5% most dissimilar compounds, rational selection methods based on the non-binary Tanimoto coefficient are likely to produce better results than random selection. However, in both focused and diverse selection using the coefficients mentioned, as more compounds are selected, the selection increasingly resembles random selection in terms of physicochemical similarity and dissimilarity.
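
As an illustration of the kind of measure the abstract refers to, the following is a minimal sketch of three non-binary coefficients and a top-fraction selection step. It assumes each compound is described by a vector of real-valued topological indices and uses the commonly cited continuous forms of the Tanimoto, Cosine and Euclidean measures; the abstract does not give the formulas, descriptors or selection procedure actually used in the paper, so the function names, the `library` dictionary and the `fraction` parameter are hypothetical.

```python
import math


def tanimoto(x, y):
    """Non-binary Tanimoto: sum(xy) / (sum(x^2) + sum(y^2) - sum(xy)).

    Assumes at least one vector is non-zero (otherwise the denominator is 0).
    """
    xy = sum(a * b for a, b in zip(x, y))
    xx = sum(a * a for a in x)
    yy = sum(b * b for b in y)
    return xy / (xx + yy - xy)


def cosine(x, y):
    """Non-binary Cosine: sum(xy) / sqrt(sum(x^2) * sum(y^2)).

    Assumes both vectors are non-zero.
    """
    xy = sum(a * b for a, b in zip(x, y))
    xx = sum(a * a for a in x)
    yy = sum(b * b for b in y)
    return xy / math.sqrt(xx * yy)


def euclidean_distance(x, y):
    """Euclidean distance; smaller values mean more similar compounds."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def most_similar(query, library, fraction=0.01, coefficient=tanimoto):
    """Return the names of the top `fraction` of compounds ranked by similarity.

    `library` is a hypothetical mapping of compound name -> descriptor vector.
    At least one compound is always returned. Pass a similarity coefficient
    (larger = more similar); a distance would need the opposite sort order.
    """
    ranked = sorted(
        library,
        key=lambda name: coefficient(query, library[name]),
        reverse=True,
    )
    k = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:k]
```

One property visible in this sketch is that the Cosine form ignores vector magnitude: `cosine([1, 2], [2, 4])` is 1.0, while `tanimoto([1, 2], [2, 4])` is about 0.67. Whether this is the reason for the misleading results reported above is not stated in the abstract.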

Item Type: Article
Uncontrolled Keywords: non-binary Tanimoto, physicochemical, Euclidean coefficients
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computer Science and Information System
ID Code: 28187
Deposited By: Yanti Mohd Shah
Deposited On: 18 Sep 2012 06:14
Last Modified: 30 Nov 2018 07:07
