Universiti Teknologi Malaysia Institutional Repository

An improved deep neural network algorithm for the prediction of limited proteolysis in native protein

Hashim, Haslina and Hong, Lim Lip and A. Samah, Azurah and Abdul Majid, Hairudin and Ali Shah, Zuraini and Azman, Nuraina Syaza and Azmi, Nur Sabrina (2022) An improved deep neural network algorithm for the prediction of limited proteolysis in native protein. International Journal of Innovative Computing, 12 (1). pp. 35-42. ISSN 2180-4370

Official URL: http://dx.doi.org/10.11113/ijic.v12n1.351

Abstract

A protease is a proteolytic enzyme that hydrolyzes peptide bonds, with cleavage occurring only at specific sites of the protein substrate. Discovering these nick sites makes it possible to predict protease function and to control the hydrolysis of a protein by its corresponding protease. Controlling this process is important because it can help to control protein replication, especially of viral proteins. With the rise of computational methods, and in particular the successful application of deep learning in various domains, applying deep learning to biological data can improve predictions that support experimental methods. Computational predictions can support conventional techniques such as mass spectrometry and two-dimensional gel electrophoresis, reducing the cost and time needed to identify and predict proteolysis sites. This study improves the deep learning algorithm by proposing a hybrid model of Random Forest + Deep Neural Network (Hybrid RF+DNN) to classify proteolysis or nick sites. The proposed classifier is compared with other machine learning algorithms such as Random Forest (RF), Support Vector Machine (SVM), and Deep Neural Network (DNN). The proposed method enhances the classification results in identifying positive and negative nick sites. The RF acts as a feature selector that gathers the most important features before they enter the DNN classifier. This approach reduces the data dimensionality and speeds up the training process. Model performance was measured using the confusion matrix, specificity, sensitivity, and other metrics. However, the proposed method is not the best performer among the mentioned classifiers, as the classifiers obtained 0.64, 0.65, and 0.58 for Datasets A, B, and C, respectively. The proposed method may become the best performer if parameter tuning is done more precisely, even after feature selection by the RF algorithm. Thus, with this enhancement, the proposed method appears to be an alternative for researchers seeking to discover limited proteolysis or nick sites.
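
The abstract names the confusion matrix, sensitivity, and specificity as evaluation measures. The following is a minimal sketch of how these quantities are computed for a binary nick-site classifier. It is not the authors' code: the label encoding (1 = nick site, 0 = non-nick site), the function names, and the toy labels are all assumptions made for illustration, and the numbers have no relation to the results reported for Datasets A, B, and C.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # nick sites predicted as nick sites
    fp: int  # non-nick sites predicted as nick sites
    tn: int  # non-nick sites predicted as non-nick sites
    fn: int  # nick sites predicted as non-nick sites


def confusion_matrix(y_true, y_pred):
    """Count outcomes for binary labels (assumed: 1 = nick site, 0 = non-nick site)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionMatrix(tp, fp, tn, fn)


def sensitivity(cm):
    """True positive rate: fraction of real nick sites that were found."""
    positives = cm.tp + cm.fn
    return cm.tp / positives if positives else 0.0


def specificity(cm):
    """True negative rate: fraction of non-nick sites correctly rejected."""
    negatives = cm.tn + cm.fp
    return cm.tn / negatives if negatives else 0.0


def accuracy(cm):
    """Fraction of all sites classified correctly."""
    total = cm.tp + cm.fp + cm.tn + cm.fn
    return (cm.tp + cm.tn) / total if total else 0.0


# Toy labels, invented purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

cm = confusion_matrix(y_true, y_pred)
print(cm)
print(sensitivity(cm), specificity(cm), accuracy(cm))
```

Reporting sensitivity and specificity alongside accuracy matters when positive and negative nick sites are imbalanced, since a classifier can reach high accuracy while missing most of the minority class.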

Item Type: Article
Uncontrolled Keywords: Protease Nick Sites, Random Forest (RF), Support Vector Machine (SVM), Deep Neural Network (DNN), Hybrid model of Random Forest and Deep Neural Network (Hybrid RF DNN)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 108824
Deposited By: Widya Wahid
Deposited On: 09 Dec 2024 07:47
Last Modified: 09 Dec 2024 07:47
