Mokaramian, Shahram (2013) Evaluation of machine learning techniques for imbalanced data in IDS. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computing.
Official URL: http://dms.library.utm.my:8080/vital/access/manage...
Abstract
A network Intrusion Detection System (IDS) is an automated system that detects malicious traffic, and it plays a critical role in a network. In recent years, machine learning algorithms have been developed and used to detect network intrusions. Most standard machine learning algorithms achieve high overall accuracy, but they favor the majority class when dealing with imbalanced data. IDS data are highly imbalanced, and most machine learning algorithms detect the Remote to Local (R2L) and User to Root (U2R) classes, which contain malicious attacks, poorly. A resampling technique is therefore required to balance the data. The purpose of this study is to investigate the performance of three machine learning algorithms, namely Support Vector Machine (SVM), Decision Tree (DT) and Fuzzy Classifier (FC), on imbalanced IDS data and on the same data after rebalancing with the Synthetic Minority Over-sampling TEchnique (SMOTE). The three algorithms were then evaluated on the rebalanced data. The benchmark DARPA KDDCup 1999 IDS dataset was used. SMOTE was applied with two imbalance ratios, 1:4 and 1:1. Comparing the results before and after resampling showed that FC performed better with an imbalance ratio of 1:1. The accuracy of FC with balanced data was 99.19% for Normal traffic, 99.35% for Denial of Service attacks, 99.51% for Probe attacks, 99.67% for Remote to Local attacks and 99.41% for User to Root attacks. In addition, the data with an imbalance ratio of 1:1 gave better results on all classes for all three machine learning algorithms.
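The rebalancing step described in the abstract can be illustrated with a minimal sketch of the general SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the thesis's implementation. The function names `smote` and `samples_needed`, the neighbour count `k`, and the toy two-dimensional data are assumptions made for illustration, and the ratios are read here as minority:majority.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point, excluding itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def samples_needed(n_minority, n_majority, ratio):
    """Synthetic samples needed so minority:majority reaches `ratio` (e.g. 1/4 or 1/1)."""
    target = math.ceil(n_majority * ratio)
    return max(0, target - n_minority)


# Toy data, not KDD Cup 1999 records: 4 minority points against 40 majority points.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
n_majority = 40

for ratio in (1 / 4, 1 / 1):
    extra = smote(minority, samples_needed(len(minority), n_majority, ratio), k=3)
    total = len(minority) + len(extra)
    print(f"ratio {ratio}: {total} minority vs {n_majority} majority")
```

Because every synthetic point is an interpolation between two existing minority samples, the new samples stay inside the region the minority class already occupies; only the class proportions change.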
Item Type: | Thesis (Masters) |
---|---|
Additional Information: | Thesis (Sarjana Sains Komputer (Keselamatan Maklumat)) - Universiti Teknologi Malaysia, 2013; Supervisor: Dr. Anazida Zainal |
Uncontrolled Keywords: | machine learning, computational learning theory |
Subjects: | T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK7885-7895 Computer engineering. Computer hardware |
Divisions: | Computing |
ID Code: | 37080 |
Deposited By: | Fazli Masari |
Deposited On: | 31 Mar 2014 01:45 |
Last Modified: | 29 Jun 2017 07:03 |