Universiti Teknologi Malaysia Institutional Repository

An intelligent feature selection approach based on a novel improve binary sparrow search algorithm for COVID-19 classification.

Mahdi, Amir Yasseen and Yuhaniz, Siti Sophiayati (2023) An intelligent feature selection approach based on a novel improve binary sparrow search algorithm for COVID-19 classification. International Journal of Intelligent Engineering and Systems, 16 (4). pp. 39-59. ISSN 2185-310X

Official URL: http://dx.doi.org/10.22266/ijies2023.0831.04

Abstract

This paper proposes an improved binary sparrow search algorithm (IBSSA) as a search strategy for feature selection (FS). Its main objective is to improve COVID-19 patient classification from clinical texts. The constant need for an efficient FS system and the favorable results of swarm behavior in numerous optimization problems motivated the development of a novel FS strategy. Clinical text data are frequently high-dimensional and contain uninformative features, which strongly affect classifier accuracy; this makes FS a key machine-learning pre-processing step for reducing data dimensionality. This work uses a two-stage FS approach to select the features. In the first stage, a term weighting scheme (TWS) assigns a weighted score to each feature obtained from the pre-processing model, measuring its significance with a new weight calculation method called root term frequency-core-inverse exponential frequency (RTF-C-IEF). In the second stage, the most relevant, near-optimal feature subset for COVID-19 diagnosis is found using a newly developed method inspired by the behavior of sparrows. The proposed modification of the sparrow search algorithm consists of several stages of improvement, whose main objectives are to promote exploration of the search space and to increase the algorithm's variability. To evaluate the proposed model, various classifiers were applied to two datasets containing 1446 and 3053 cases, respectively. The experimental and statistical results show that the proposed IBSSA significantly outperforms the other optimization algorithms in the comparison and that it successfully addresses the shortcomings of the original SSA.
Moreover, with the SVM classifier the IBSSA achieves the highest accuracy among the compared algorithms: the average percentages of removed features are 77.99% and 83.5%, with improvement percentages by F1-score of 84.95% and 95.94%, for the two datasets respectively.
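
The second stage described in the abstract searches over binary feature subsets. The sketch below illustrates how a binary wrapper search of this kind is commonly set up; it is not the authors' IBSSA. The sigmoid transfer function, the weighted fitness, the weight alpha, and all function names are assumptions introduced for illustration, since the abstract does not specify them.

```python
import math
import random


def sigmoid_transfer(position):
    """Map a continuous position to a selection probability.

    Assumption: an S-shaped transfer function, a common choice in binary
    swarm optimizers. The abstract does not say which one IBSSA uses.
    """
    return 1.0 / (1.0 + math.exp(-position))


def binarize(positions, rng):
    """Turn a continuous position vector into a 0/1 feature mask."""
    return [1 if rng.random() < sigmoid_transfer(p) else 0 for p in positions]


def fitness(mask, error_rate, alpha=0.99):
    """Weighted sum of classifier error and fraction of features kept.

    Lower is better. alpha is a hypothetical trade-off weight; error_rate
    would come from a classifier (for example an SVM) trained on the
    features where mask == 1.
    """
    kept_fraction = sum(mask) / len(mask)
    return alpha * error_rate + (1.0 - alpha) * kept_fraction


def removed_percentage(mask):
    """Percentage of features discarded by a mask."""
    return 100.0 * (1.0 - sum(mask) / len(mask))


if __name__ == "__main__":
    rng = random.Random(0)
    mask = binarize([2.0, -1.5, 0.3, -3.0], rng)
    print(mask, fitness(mask, error_rate=0.1), removed_percentage(mask))
```

In a full wrapper method, a swarm optimizer would update the continuous positions each iteration and keep the mask with the lowest fitness; the removed-feature percentages reported in the abstract correspond to the quantity computed by a function like `removed_percentage`.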

Item Type: Article
Uncontrolled Keywords: Binary sparrow search algorithm; Clinical text classification; COVID-19; Feature selection; Natural language processing; Optimization.
Subjects: H Social Sciences > H Social Sciences (General)
H Social Sciences > HA Statistics
Divisions: Razak School of Engineering and Advanced Technology
ID Code: 105735
Deposited By: Muhamad Idham Sulong
Deposited On: 13 May 2024 07:26
Last Modified: 13 May 2024 07:26
