Abstract
A large number of network security breaches in IoT networks have demonstrated the unreliability of current Network Intrusion Detection Systems (NIDSs). The resulting network interruptions and losses of sensitive data have made the improvement of NIDS technologies an active research area. An analysis of related works shows that most researchers aim to obtain better classification results by applying untried combinations of Feature Reduction (FR) and Machine Learning (ML) techniques to NIDS datasets. However, these datasets differ in feature sets, attack types, and network design. Therefore, this paper aims to discover whether these techniques can be generalised across various datasets. Six ML models are utilised: Deep Feed Forward (DFF), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Decision Tree (DT), Logistic Regression (LR), and Naive Bayes (NB). Three Feature Extraction (FE) algorithms, Principal Component Analysis (PCA), Auto-encoder (AE), and Linear Discriminant Analysis (LDA), are evaluated using three benchmark datasets: UNSW-NB15, ToN-IoT and CSE-CIC-IDS2018. Although PCA and AE have been widely used, the determination of their optimal number of extracted dimensions has been overlooked. The results indicate that no single FE method or ML model achieves the best scores on all datasets. The optimal number of extracted dimensions is identified for each dataset, and LDA is found to degrade the performance of the ML models on two datasets. Variance is used to analyse the extracted dimensions of LDA and PCA. Finally, this paper concludes that the choice of dataset significantly alters the performance of the applied techniques. We believe that a universal (benchmark) feature set is needed to facilitate further progress in this field.
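As an illustration of the variance analysis mentioned above, the following minimal sketch computes the share of total variance captured by each PCA dimension and reports how many dimensions are needed to reach a chosen cumulative share. It is not the paper's implementation: the synthetic "flow" features, the function names, and the 99% threshold are all assumptions made for this example, and the threshold rule is a common heuristic rather than the selection criterion used in the paper. Only the Python standard library is used, with power iteration and deflation standing in for a full eigendecomposition.

```python
import random


def make_synthetic_flows(n=400, seed=0):
    """Hypothetical data: 5 features driven by 2 latent factors plus small noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a, b = rng.gauss(0, 3), rng.gauss(0, 1)
        rows.append([a, a + rng.gauss(0, 0.1), b,
                     a - b + rng.gauss(0, 0.1), rng.gauss(0, 0.1)])
    return rows


def covariance_matrix(rows):
    """Sample covariance matrix of row-wise observations."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(m, iters=200, seed=1):
    """Dominant eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    d = len(m)
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm < 1e-12:  # matrix is numerically zero
            return 0.0, v
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(v[i] * mv[i] for i in range(d)), v  # Rayleigh quotient


def explained_variance_ratios(rows, k):
    """Share of total variance captured by each of the first k PCA dimensions."""
    cov = covariance_matrix(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))
    ratios = []
    for _ in range(k):
        lam, v = top_eigenpair(cov)
        ratios.append(lam / total)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return ratios


def smallest_k(ratios, threshold):
    """Smallest number of leading dimensions whose cumulative share meets threshold."""
    running = 0.0
    for k, r in enumerate(ratios, start=1):
        running += r
        if running >= threshold:
            return k
    return len(ratios)


ratios = explained_variance_ratios(make_synthetic_flows(), 5)
print([round(r, 4) for r in ratios])
print("dimensions needed for 99% of variance:", smallest_k(ratios, 0.99))
```

In practice, a variance profile like this would be read alongside classifier scores for each candidate dimension count, since a dimension that carries little variance can still matter for separating attack classes.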
Keywords
Feature extraction / Machine learning / Network intrusion detection system / IoT
Cite this article
Mohanad Sarhan, Siamak Layeghy, Nour Moustafa, Marcus Gallagher, Marius Portmann.
Feature extraction for machine learning-based intrusion detection in IoT networks.
2024, 10(1): 205-216. DOI: 10.1016/j.dcan.2022.08.012