Universiti Teknologi Malaysia Institutional Repository

An evaluation on the efficiency of hybrid feature selection in spam email classification

Mohamad, M. and Selamat, A. (2015) An evaluation on the efficiency of hybrid feature selection in spam email classification. In: 2nd International Conference on Computer, Communications, and Control Technology, I4CT 2015, 21-23 Apr 2015, Kuching, Sarawak.

Official URL: http://www.dx.doi.org/10.1109/I4CT.2015.7219571

Abstract

This paper discusses a spam filtering technique that combines two types of feature selection methods in its classification task. Spam, also known as unwanted messages, continues to flood our electronic mailboxes despite the spam filtering systems provided by email service providers. The issue is regularly highlighted by Internet users and has attracted many researchers to work on fighting spam. A number of frameworks, algorithms, toolkits, systems and applications have been proposed, developed and applied by researchers and developers to protect users from spam. The classification task involves several steps, such as data pre-processing, feature selection, feature extraction, training and testing. One of the main processes is feature selection, which is used to reduce the dimensionality of the word-frequency features without affecting the performance of the classification task. We therefore conducted an experiment to test the efficiency of the proposed Hybrid Feature Selection, which combines Term Frequency Inverse Document Frequency (TFIDF) with rough set theory, on the spam email classification problem. The results show that the proposed Hybrid Feature Selection returns a good result.
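
To illustrate the TFIDF half of the approach described in the abstract, the following is a minimal sketch of ranking terms by TF-IDF and keeping the top k as features. It is not the authors' implementation: the function names `tfidf_scores` and `select_top_k`, the summed-score ranking, the unsmoothed `log(N / df)` weighting and the toy emails are all assumptions made for illustration, and the rough set stage is not shown.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return {term: TF-IDF summed over all documents} for tokenized docs.

    Assumption: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        counts = Counter(doc)
        for term, count in counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[term])
            scores[term] += tf * idf
    return dict(scores)


def select_top_k(docs, k):
    """Keep the k terms with the highest summed TF-IDF (hypothetical criterion)."""
    scores = tfidf_scores(docs)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy corpus, invented for this sketch only.
emails = [
    "win free money now".split(),
    "free prize claim now".split(),
    "meeting agenda for monday".split(),
    "project meeting notes".split(),
]
selected = select_top_k(emails, 5)
```

Under this weighting a term that occurs in every document gets a score of zero, so it is the first to be dropped when the feature set is reduced.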

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: algorithm, feature selection, filtering
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 59143
Deposited By: Haliza Zainal
Deposited On: 18 Jan 2017 01:50
Last Modified: 30 Sep 2021 05:55
