Universiti Teknologi Malaysia Institutional Repository

Supervised, unsupervised, and semi-supervised feature selection: a review on gene selection

Ang, J. C. and Mirzal, A. and Haron, H. and Hamed, H. N. A. (2016) Supervised, unsupervised, and semi-supervised feature selection: a review on gene selection. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 13 (5). pp. 971-989. ISSN 1545-5963

Full text not available from this repository.

Official URL: https://www.scopus.com/inward/record.uri?eid=2-s2....

Abstract

Recently, feature selection and dimensionality reduction have become fundamental tools for many data mining tasks, especially for processing high-dimensional data such as gene expression microarray data. Gene expression microarray data comprise up to hundreds of thousands of features with a relatively small sample size. Because learning algorithms usually do not work well with this kind of data, the challenge of reducing the data dimensionality arises. A large number of gene selection methods have been applied to select a subset of relevant features for model construction and to seek better cancer classification performance. This paper presents the basic taxonomy of feature selection and reviews the state-of-the-art gene selection methods by grouping the literature into three categories: supervised, unsupervised, and semi-supervised. The comparison of experimental results on the top 5 representative gene expression datasets indicates that the classification accuracy of unsupervised and semi-supervised feature selection is competitive with supervised feature selection.
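
As a rough illustration of the supervised and unsupervised categories named in the abstract, the sketch below ranks features on synthetic data with two simple filter criteria: a Fisher-style score that uses class labels, and a plain variance score that does not. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the toy data generator, the sample and feature counts, and the choice of these two criteria. It is not the authors' method or any of the reviewed methods' reference implementations.

```python
import numpy as np


def fisher_scores(X, y):
    """Supervised filter (illustrative): between-class scatter divided by
    within-class scatter, computed independently for each feature."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += Xc.shape[0] * (Xc.mean(axis=0) - overall_mean) ** 2
        within += Xc.shape[0] * Xc.var(axis=0)
    # Small constant guards against division by zero for constant features.
    return between / (within + 1e-12)


def variance_scores(X):
    """Unsupervised filter (illustrative): per-feature variance, no labels."""
    return X.var(axis=0)


def top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return np.argsort(scores)[::-1][:k]


def make_toy_data(n_samples=40, n_features=2000, n_informative=10,
                  shift=5.0, seed=0):
    """Hypothetical stand-in for expression data: few samples, many features.
    Only the first n_informative features differ between the two classes."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_samples // 2)
    X = rng.normal(size=(n_samples, n_features))
    X[y == 1, :n_informative] += shift
    return X, y


X, y = make_toy_data()
supervised_idx = top_k(fisher_scores(X, y), 10)
unsupervised_idx = top_k(variance_scores(X), 10)
print("supervised picks:  ", sorted(supervised_idx.tolist()))
print("unsupervised picks:", sorted(unsupervised_idx.tolist()))
```

On this toy data the class shift also inflates the variance of the informative features, so the two rankings tend to agree. That is a property of how the data were generated and says nothing about real gene expression datasets, where high-variance genes need not be the class-relevant ones.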

Item Type: Article
Uncontrolled Keywords: Classification (of information), Clustering algorithms, Data handling, Data mining, Gene expression, Genes, Cancer classification, Classification accuracy, Dimensionality reduction, Gene expression datasets, Gene expression microarray, Semi-supervised, Supervised, Unsupervised, Feature extraction, genetic marker, tumor protein, algorithm, automated pattern recognition, comparative study, data mining, evaluation study, genetic database, genetic marker, genetic predisposition, genetics, human, machine learning, Neoplasms, procedures, reproducibility, sensitivity and specificity, tumor gene, validation study, Algorithms, Data Mining, Databases, Genetic, Genes, Neoplasm, Genetic Markers, Genetic Predisposition to Disease, Humans, Machine Learning, Neoplasm Proteins, Neoplasms, Pattern Recognition, Automated, Reproducibility of Results, Sensitivity and Specificity
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 72142
Deposited By: Fazli Masari
Deposited On: 23 Nov 2017 06:19
Last Modified: 23 Nov 2017 06:19
