Reinforcement learning for an efficient and effective malware investigation during cyber incident response

Dipo Dunsin , Mohamed Chahine Ghanem , Karim Ouazzane , Vassil Vassilev

High-Confidence Computing ›› 2025, Vol. 5 ›› Issue (3) : 100299

DOI: 10.1016/j.hcc.2025.100299
Research Articles

Abstract

The ever-escalating prevalence of malware is a serious cybersecurity threat, often requiring advanced post-incident forensic investigation techniques. This paper proposes a framework to enhance malware forensics by leveraging reinforcement learning (RL). The approach combines heuristic and signature-based methods, supported by RL through a unified Markov decision process (MDP) model, which breaks down malware analysis into distinct states and actions. This formulation improves the identification and classification of malware variants. The framework employs Q-learning and other techniques to boost the speed and accuracy of detecting new and unknown malware, outperforming traditional methods. We tested the experimental framework across multiple virtual environments infected with various malware types. The RL agent collected forensic evidence and improved its performance through Q-tables and temporal difference learning. The epsilon-greedy exploration strategy, in conjunction with Q-learning updates, effectively facilitated state transitions. The learning rate depended on the complexity of the MDP environment: higher in simpler environments for quicker convergence and lower in more complex ones for stability. This RL-enhanced model significantly reduced the time required for post-incident malware investigations, achieving an accuracy of 94% in identifying malware. These results indicate RL's potential to revolutionise post-incident forensic investigations in cybersecurity. Future work will incorporate more advanced RL algorithms and large language models (LLMs) to further enhance the effectiveness of malware forensic analysis.
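The abstract names tabular Q-learning, temporal difference updates and epsilon-greedy exploration but does not spell out the update itself. The following is a minimal sketch of that combination on a toy problem. The states, actions, rewards and hyperparameter values are hypothetical placeholders chosen for illustration; they are not the MDP, reward scheme or settings used in the paper.

```python
import random

# Hypothetical investigation stages and actions (illustrative only, not the paper's MDP).
STATES = ["triage", "memory_analysis", "signature_match", "report"]
ACTIONS = ["advance", "repeat"]
TERMINAL = len(STATES) - 1  # index of the final "report" state


def step(state, action):
    """Toy deterministic transition: 'advance' moves one stage forward, 'repeat' stays."""
    if action == 0:
        nxt = state + 1
        reward = 10.0 if nxt == TERMINAL else -1.0  # assumed reward values
    else:
        nxt, reward = state, -1.0
    return nxt, reward, nxt == TERMINAL


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise take the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q[state][a])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0, max_steps=50):
    """Tabular Q-learning; alpha is the learning rate, gamma the discount factor."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in STATES]  # the Q-table
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            nxt, reward, done = step(state, action)
            # Temporal difference target: reward plus discounted best next-state value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action name for every non-terminal state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda a: row[a])] for row in q[:TERMINAL]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch `alpha` plays the role of the learning rate discussed in the abstract: a larger value makes each update move the Q-table entry further towards the new target, which speeds up convergence in a small deterministic problem like this one, while a smaller value damps the updates, which is the usual choice when transitions or rewards are noisy.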

Keywords

Cyber incident / Digital forensics / Artificial intelligence / Reinforcement learning / Markov Chain / MDP / DFIR / Malware / Incident response

Cite this article

Dipo Dunsin, Mohamed Chahine Ghanem, Karim Ouazzane, Vassil Vassilev. Reinforcement learning for an efficient and effective malware investigation during cyber incident response. High-Confidence Computing, 2025, 5(3): 100299 DOI:10.1016/j.hcc.2025.100299

CRediT authorship contribution statement

Dipo Dunsin: Writing - review & editing, Writing - original draft, Validation, Project administration, Methodology, Investigation, Data curation, Conceptualization. Mohamed Chahine Ghanem: Supervision. Karim Ouazzane: Supervision. Vassil Vassilev: Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

