Ali Albshayreh, Ali Otman (2015) Spam detection in email body using hybrid of artificial neural network and evolutionary algorithms. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computing.
Official URL: http://dms.library.utm.my:8080/vital/access/manage...
Abstract
Spam detection is a significant problem that many researchers have addressed with a variety of strategies. Building a single model that covers the wide range of spam categories is difficult because spam types change constantly. Low accuracy and a high false positive rate are substantial problems in spam detection. A global optimization algorithm is an appropriate way to address these problems because it can generate new candidate solutions and avoid becoming trapped in local optima. In this study, a hybrid machine learning approach based on an Artificial Neural Network (ANN) and Differential Evolution (DE) is designed to detect spam effectively. ANN-DE with Genetic Algorithm (GA) feature selection is compared with ANN-DE with InfoGain feature selection to show which approach performs best in spam detection. The Spambase dataset of 4601 e-mails, of which 1813 are spam (39.40%) and 2788 are non-spam (60.60%), was used to train and test these algorithms. Performance was evaluated using false positives, false negatives, accuracy, precision, and recall. The experimental results show that the proposed hybrid of ANN-DE with GA gives a better result, with 93.81% accuracy, than ANN-DE with InfoGain, with 93.28% accuracy. The results suggest that the proposed ANN-DE with GA is promising, and this study provides a new method to train an ANN for spam detection in practice.
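The abstract does not give the network architecture, the DE variant, or any parameter values, so the following is only a minimal sketch of the general idea of training ANN weights with Differential Evolution. Everything in it is an assumption made for illustration: the one-hidden-layer network, the DE/rand/1/bin scheme, the values of `F` and `CR`, the mean squared error fitness, and the synthetic two-feature data standing in for Spambase. Feature selection with GA or InfoGain is not shown. The last function computes the evaluation measures named in the abstract from predicted and true labels.

```python
import math
import random


def forward(w, x, n_hidden):
    """One-hidden-layer network (tanh hidden, sigmoid output); w is a flat list."""
    n_in, idx, hidden = len(x), 0, []
    for _ in range(n_hidden):
        s = w[idx + n_in]  # bias of this hidden unit
        for j in range(n_in):
            s += w[idx + j] * x[j]
        hidden.append(math.tanh(s))
        idx += n_in + 1
    out = w[idx + n_hidden]  # output bias
    for h in range(n_hidden):
        out += w[idx + h] * hidden[h]
    out = max(-60.0, min(60.0, out))  # clamp to keep exp() in range
    return 1.0 / (1.0 + math.exp(-out))


def mse(w, X, y, n_hidden):
    """Fitness to minimise: mean squared error over the training set."""
    return sum((forward(w, x, n_hidden) - t) ** 2 for x, t in zip(X, y)) / len(X)


def de_train(X, y, n_hidden=3, pop_size=20, gens=40, F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin over the flat weight vector (assumed variant and settings)."""
    rng = random.Random(seed)
    dim = n_hidden * (len(X[0]) + 1) + n_hidden + 1
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    fit = [mse(w, X, y, n_hidden) for w in pop]
    history = [min(fit)]  # best fitness per generation
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated gene
            trial = [
                pop[a][j] + F * (pop[b][j] - pop[c][j])
                if (rng.random() < CR or j == j_rand) else pop[i][j]
                for j in range(dim)
            ]
            f_trial = mse(trial, X, y, n_hidden)
            if f_trial <= fit[i]:  # greedy selection: keep the better vector
                pop[i], fit[i] = trial, f_trial
        history.append(min(fit))
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], history


def confusion_metrics(y_true, y_pred):
    """False positives/negatives, accuracy, precision and recall (1 = spam)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "fp": fp,
        "fn": fn,
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def make_toy_data(n=40, seed=1):
    """Synthetic stand-in for real features: label is 1 when x0 + x1 > 1."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [1 if a + b > 1.0 else 0 for a, b in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    weights, history = de_train(X, y)
    preds = [1 if forward(weights, x, 3) >= 0.5 else 0 for x in X]
    print(confusion_metrics(y, preds))
```

Because selection is greedy, the best fitness recorded in `history` can never get worse from one generation to the next, which is a convenient sanity check on any DE implementation. The figures printed for the toy data say nothing about the accuracies reported in the thesis.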
Item Type: | Thesis (Masters) |
---|---|
Additional Information: | Thesis (Sarjana Sains Komputer (Keselamatan Maklumat)) - Universiti Teknologi Malaysia, 2015; Supervisor : Dr. Maheyzah Md. Siraj |
Uncontrolled Keywords: | genetic algorithm (GA), artificial neural network (ANN) |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Computing |
ID Code: | 53536 |
Deposited By: | Fazli Masari |
Deposited On: | 16 Mar 2016 01:07 |
Last Modified: | 19 Jul 2020 07:40 |