Universiti Teknologi Malaysia Institutional Repository

Remote protein homology detection and fold recognition using two-layer support vector machine classifiers

Muda, Hilmi M. and Saad, Puteh and Othman, Razib M. (2011) Remote protein homology detection and fold recognition using two-layer support vector machine classifiers. Computers in Biology and Medicine, 41 (8). pp. 687-699. ISSN 0010-4825

Full text not available from this repository.

Official URL: http://dx.doi.org/10.1016/j.compbiomed.2011.06.004

Abstract

Remote protein homology detection and fold recognition refer to the detection of structural homology in proteins that share little or no sequence similarity. To detect protein structural classes from primary sequence information, homology-based methods have been developed, which can be divided into three types: discriminative classifiers, generative models for protein families, and pairwise sequence comparisons. Support Vector Machines (SVM) and Neural Networks (NN) are two popular discriminative methods. Recent studies have shown that SVMs train quickly and are more accurate and efficient than NNs. We present a comprehensive method based on two-layer classifiers. The first layer is used for detection up to the superfamily and family levels of the SCOP hierarchy using optimized binary SVM classification rules. It uses a kernel function known as the Bio-kernel, which incorporates biological information into the classification process. The second layer uses a discriminative SVM algorithm with a string kernel for detection up to the protein fold level of the SCOP hierarchy. The results were evaluated using mean ROC and mean MRFP, and their significance was tested with a pairwise t-test. Experimental results show that our approach significantly improves the performance of remote protein homology detection and fold recognition on all three versions of the SCOP dataset (1.53, 1.67 and 1.73). We achieved improvements in mean ROC of 4.19% on SCOP 1.53, 4.75% on SCOP 1.67 and 4.03% on SCOP 1.73 compared with the results produced by well-known methods. The combination of the first and second layers of BioSVM-2L performs well in remote homology detection and fold recognition across all three versions of the dataset.
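
The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes that "ROC" means the area under the ROC curve for one target family and that "MRFP" means the median rate of false positives, that is, the fraction of negative test sequences scoring at or above the median positive score; the abstract does not define either term, so both readings are assumptions. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
from statistics import median


def roc_score(pos_scores, neg_scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def median_rfp(pos_scores, neg_scores):
    """Assumed MRFP definition: fraction of negatives that score
    at or above the median-scoring positive."""
    m = median(pos_scores)
    return sum(1 for n in neg_scores if n >= m) / len(neg_scores)


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Hypothetical classifier scores per target family:
# (scores of positive test sequences, scores of negative test sequences)
families = {
    "family_a": ([9, 8, 4], [7, 3, 2, 1]),
    "family_b": ([6, 5, 4], [5, 3, 2, 1]),
}

# Averaging over families gives the "mean ROC" and "mean MRFP" summaries.
mean_roc = mean(roc_score(p, n) for p, n in families.values())
mean_mrfp = mean(median_rfp(p, n) for p, n in families.values())
print(mean_roc, mean_mrfp)
```

A higher mean ROC and a lower mean MRFP both indicate better separation of true homologs from non-homologs under these assumed definitions.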

Item Type: Article
Uncontrolled Keywords: bio-inspired kernel, fold recognition, remote protein homology detection, support vector machines, two-layer classifiers
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computer Science and Information System
ID Code: 29630
Deposited By: Yanti Mohd Shah
Deposited On: 27 Mar 2013 00:27
Last Modified: 25 Apr 2019 01:18
