Universiti Teknologi Malaysia Institutional Repository

Review of feature extraction approaches on biomedical text classification

Dollah, R. and Jafni, T. I. and Hashim, H. and Othman, M. S. and Rasib, A. W. (2020) Review of feature extraction approaches on biomedical text classification. International Journal of Advanced And Applied Sciences, 7 (4). pp. 1-8.

Full text not available from this repository.

Official URL: http://www.dx.doi.org/10.21833/ijaas.2020.04.001

Abstract

The overwhelming volume of online biomedical literature causes data congestion and makes it difficult to organize documents and to retrieve the required ones from databases, especially Medline. One solution to this document overload is classification. However, each document must first be represented by a set of terms or a feature vector. Identifying terms or features in biomedical literature is one of the most important and challenging tasks in text classification, because a large number of new features and entities appear in the biomedical domain. In addition, combining feature sets from different terminological resources leads to naming conflicts, such as homonymous use of names, and to terminological ambiguities. The purpose of this research is therefore to investigate and evaluate effective ways of extracting relevant and meaningful features in order to increase classification accuracy and improve the performance of web searches. To this end, we conduct several classification experiments to evaluate and compare the effectiveness of feature extraction approaches for extracting relevant and informative features from the biomedical literature. Our experiments use two different feature sets: one extracted using the Genia tagger tool and one extracted by medical experts from Pusat Perubatan Universiti Kebangsaan Malaysia (PPUKM). The results show that, when a feature selection method is applied, classification using the features extracted by medical experts outperforms classification using the features extracted with the Genia tagger tool.
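
As an illustration only, and not the authors' pipeline, the idea of representing documents as feature vectors over a chosen term set and then applying feature selection can be sketched in Python. Everything here is an assumption made for the example: the toy documents, the labels, the candidate terms and the function names are invented, and chi-square is used only because the abstract does not name the feature selection method.

```python
def tokens(text):
    """Lowercased whitespace tokens of a document, as a set."""
    return set(text.lower().split())


def vectorize(text, vocabulary):
    """Binary feature vector: 1 if the term occurs in the text, else 0."""
    present = tokens(text)
    return [1 if term in present else 0 for term in vocabulary]


def chi_square(docs, labels, term, positive):
    """Chi-square statistic of one term against one class (2x2 table)."""
    a = b = c = d = 0
    for text, label in zip(docs, labels):
        has_term = term in tokens(text)
        if has_term and label == positive:
            a += 1  # term present, document in the class
        elif has_term:
            b += 1  # term present, document outside the class
        elif label == positive:
            c += 1  # term absent, document in the class
        else:
            d += 1  # term absent, document outside the class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_features(docs, labels, vocabulary, positive, k):
    """Keep the k terms with the highest chi-square score."""
    ranked = sorted(
        vocabulary,
        key=lambda term: chi_square(docs, labels, term, positive),
        reverse=True,
    )
    return ranked[:k]


# Invented toy corpus; not data from the study.
docs = [
    "insulin regulates glucose in diabetes patients",
    "glucose levels and insulin therapy",
    "tumor growth and chemotherapy response",
    "chemotherapy reduces tumor size in patients",
]
labels = ["diabetes", "diabetes", "oncology", "oncology"]

# Hypothetical candidate terms, standing in for a tool- or expert-derived set.
candidate_terms = ["insulin", "glucose", "tumor", "patients", "chemotherapy"]

selected = select_features(docs, labels, candidate_terms, "diabetes", k=4)
vectors = [vectorize(doc, selected) for doc in docs]
```

In this toy corpus the term "patients" occurs equally often in both classes, so it scores zero and is dropped, while the class-specific terms are kept. The resulting vectors could then be passed to any classifier in order to compare alternative term sets.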

Item Type: Article
Uncontrolled Keywords: biomedical literature, feature extraction, feature selection
Subjects: Q Science > QA Mathematics
Divisions: Computing
ID Code: 87028
Deposited By: Narimah Nawil
Deposited On: 31 Oct 2020 12:16
Last Modified: 31 Oct 2020 12:16
