Universiti Teknologi Malaysia Institutional Repository

An adaptive behavioral-based incremental batch learning malware variants detection model using concept drift detection and sequential deep learning

Darem, A. A. and Ghaleb, F. A. and Al-Hashmi, A. A. and Abawajy, J. H. and Alanazi, S. M. and Rezami, A. Y. A. (2021) An adaptive behavioral-based incremental batch learning malware variants detection model using concept drift detection and sequential deep learning. IEEE Access, 9 . ISSN 2169-3536

[img]
Preview
PDF
2MB

Official URL: http://dx.doi.org/10.1109/ACCESS.2021.3093366

Abstract

Malware variants are the major emerging threats that face cybersecurity due to the potential damage to computer systems. Many solutions have been proposed for detecting malware variants. However, accurate detection is challenging due to the constantly evolving nature of the malware variants that cause concept drift. Existing malware detection solutions assume that the mapping learned from historical malware features will be valid for new and future malware. The relationship between input features and the class label has been considered stationary, which doesn't hold for the ever-evolving nature of malware variants. Malware features change dynamically due to code obfuscations, mutations, and the modification made by malware authors to change the features' distribution and thus evade the detection rendering the detection model obsolete and ineffective. This study presents an Adaptive behavioral-based Incremental Batch Learning Malware Variants Detection model using concept drift detection and sequential deep learning (AIBL-MVD) to accommodate the new malware variants. Malware behaviors were extracted using dynamic analysis by running the malware files in a sandbox environment and collecting their Application Programming Interface (API) traces. According to the malware first-time appearance, the malware samples were sorted to capture the malware variants' change characteristics. The base classifier was then trained based on a subset of historical malware samples using a sequential deep learning model. The new malware samples were mixed with a subset of old data and gradually introduced to the learning model in an adaptive batch size incremental learning manner to address the catastrophic forgetting dilemma of incremental learning. The statistical process control technique has been used to detect the concept drift as an indication for incrementally updating the model as well as reducing the frequency of model updates. Results from extensive experiments show that the proposed model is superior in terms of detection rate and efficiency compared with the static model, periodic retraining approaches, and the fixed batch size incremental learning approach. The model maintains an average of 99.41% detection accuracy of new and variants malware with a low updating frequency of 1.35 times per month.

Item Type:Article
Uncontrolled Keywords:adaptive incremental batch learning, concept drift detection, deep learning
Subjects:Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions:Computing
ID Code:94861
Deposited By: Narimah Nawil
Deposited On:29 Apr 2022 21:54
Last Modified:29 Apr 2022 21:54

Repository Staff Only: item control page