Universiti Teknologi Malaysia Institutional Repository

A malicious URL detection framework using priority coefficient and feature evaluation

Rafsanjani, Ahmad Sahban (2023) A malicious URL detection framework using priority coefficient and feature evaluation. PhD thesis, Universiti Teknologi Malaysia.

Official URL: http://dms.library.utm.my:8080/vital/access/manage...

Abstract

Malicious Uniform Resource Locators (URLs) are one of the major threats in cybersecurity. Cyber attackers spread malicious URLs to carry out attacks such as phishing and malware distribution, which lead unsuspecting visitors into scams and result in monetary loss, information theft, and other threats to website users. At present, malicious URLs are detected using blacklist and heuristic methods, but these methods cannot detect new and obfuscated URLs. Machine learning and deep learning have become popular approaches for improving on these earlier methods. However, they are entirely data-dependent: training an effective detector requires a large, up-to-date dataset, and detection accuracy depends heavily on the quality of the training data. This research developed a framework that detects malicious URLs based on predefined static feature classification, using priority coefficient allocation and feature evaluation methods. The feature classification employed 39 classes of blacklist, lexical, host-based, and content-based features. A dataset containing 2000 real-world URLs was gathered from two popular malware and phishing URL repositories, URLhaus and PhishTank. In the experiment, the proposed framework was evaluated against three supervised machine learning methods: Support Vector Machine (SVM), Random Forest (RF), and Bayesian Network (BN). The results showed that the proposed framework outperformed these methods. In addition, the proposed framework was benchmarked against three comprehensive malicious URL detection methods, namely Precise Phishing Detection with Recurrent Convolutional Neural Networks, Li, and URLNet, in terms of accuracy and precision. The results showed that the proposed framework achieved a detection accuracy of 98.95% and a precision of 98.60%. In sum, the developed malicious URL detection framework significantly improves detection accuracy.
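
The abstract does not give the framework's actual features, coefficients, or decision rule, so the following Python sketch only illustrates the general idea of scoring a URL from static features weighted by priority coefficients. The four lexical features, the coefficient values in PRIORITY, the 75-character length cutoff, and the 0.4 threshold are all hypothetical choices made for illustration; they are not taken from the thesis, which uses 39 feature classes across blacklist, lexical, host-based, and content-based categories.

```python
from urllib.parse import urlparse
import ipaddress

# Hypothetical priority coefficients (assumed values, not from the thesis).
PRIORITY = {
    "has_ip_host": 3.0,      # host is a raw IP address
    "has_at_symbol": 2.0,    # '@' appears anywhere in the URL
    "long_url": 1.0,         # URL longer than an assumed 75 characters
    "many_subdomains": 1.5,  # three or more dots in a non-IP host name
}


def lexical_features(url: str) -> dict:
    """Extract a few illustrative static lexical features from a URL."""
    host = urlparse(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        has_ip = True
    except ValueError:
        has_ip = False
    return {
        "has_ip_host": has_ip,
        "has_at_symbol": "@" in url,
        "long_url": len(url) > 75,
        "many_subdomains": (not has_ip) and host.count(".") >= 3,
    }


def priority_score(url: str) -> float:
    """Sum the coefficients of triggered features, normalised to [0, 1]."""
    features = lexical_features(url)
    triggered = sum(PRIORITY[name] for name, on in features.items() if on)
    return triggered / sum(PRIORITY.values())


def classify(url: str, threshold: float = 0.4) -> str:
    """Label a URL using an assumed score threshold."""
    return "malicious" if priority_score(url) >= threshold else "benign"


if __name__ == "__main__":
    for candidate in ("https://example.com/", "http://192.168.0.1/login@secure"):
        print(candidate, round(priority_score(candidate), 3), classify(candidate))
```

A rule-based score of this kind needs no training data, which is the property the abstract contrasts with data-dependent machine learning methods; how the thesis actually assigns and evaluates its coefficients would have to be taken from the full text.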

Item Type: Thesis (PhD)
Uncontrolled Keywords: Uniform Resource Locators (URLs), Support Vector Machine (SVM), Bayesian Network (BN)
Subjects: T Technology > T Technology (General)
Divisions: Razak School of Engineering and Advanced Technology
ID Code: 102826
Deposited By: Widya Wahid
Deposited On: 24 Sep 2023 03:20
Last Modified: 24 Sep 2023 03:20
