Enhanced Intrusion Detection with Data Stream Classification and Concept Drift Guided by the Incremental Learning Genetic Programming Combiner

  1. Shyaa, Methaq A. 2
  2. Zainol, Zurinahni 2
  3. Abdullah, Rosni 2
  4. Anbar, Mohammed 1
  5. Alzubaidi, Laith 3,4
  6. Santamaría, José 5
  1. National Advanced IPv6 Centre (NAv6), Universiti Sains Malaysia, USM, Gelugor 11800, Pulau Penang, Malaysia
  2. School of Computer Sciences, Universiti Sains Malaysia, USM, Gelugor 11800, Pulau Penang, Malaysia
  3. School of Mechanical, Medical, and Process Engineering, Queensland University of Technology, Brisbane, QLD 4000, Australia
  4. Centre for Data Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
  5. Department of Computer Science, University of Jaén, 23071 Jaén, Spain
Journal:
Sensors

ISSN: 1424-8220

Year of publication: 2023

Volume: 23

Issue: 7

Pages: 3736

Type: Article

DOI: 10.3390/S23073736 (open access)


Abstract

Concept drift (CD) in data streaming scenarios such as network intrusion detection systems (IDS) refers to a change in the statistical distribution of the data over time. There are five principal variants of CD: incremental, gradual, recurrent, sudden, and blip. Genetic programming combiner (GPC) classification is an effective core candidate for data stream classification in IDS. However, its basic structure relies on traditional static machine learning models that are trained only once, which limits its ability to handle CD. To address this issue, we propose an extended variant of the GPC built on three main changes. First, we replace the existing classifiers with alternatives: online sequential extreme learning machine (OSELM), feature adaptive OSELM (FA-OSELM), and knowledge preservation OSELM (KP-OSELM). Second, we add two new components to the GPC: a data balancing component and a classifier update component. Third, the coordination between the sub-models produces three novel variants of the GPC: GPC-KOS for KP-OSELM, GPC-FOS for FA-OSELM, and GPC-OS for OSELM. This article presents the first data stream-based classification framework that provides novel strategies for handling CD variants. The experimental results demonstrate that both GPC-KOS and GPC-FOS outperform the traditional GPC and other state-of-the-art methods, and that the transfer learning and memory features contribute to the effective handling of most types of CD. Moreover, applying our incremental variants to real-world datasets (KDD Cup ’99, CICIDS-2017, CSE-CIC-IDS-2018, and ISCX 2012) demonstrates improved performance (GPC-FOS on CSE-CIC-IDS-2018 and CICIDS-2017; GPC-KOS on ISCX 2012 and KDD Cup ’99), with maximum accuracy rates of 100% and 98% for GPC-KOS and GPC-FOS, respectively. However, our GPC variants do not show superior performance in handling blip drift.
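
To illustrate the difference between a model trained once and an online sequential learner of the kind the abstract names, the following is a minimal sketch of a generic OSELM update in Python with NumPy. It is not the authors' implementation and does not cover the GPC, FA-OSELM, or KP-OSELM. The class name `OSELM`, its method names, the sigmoid activation, the `ridge` term, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


class OSELM:
    """Illustrative online sequential ELM (hypothetical class, not from the paper)."""

    def __init__(self, n_inputs, n_hidden, n_outputs, ridge=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Random hidden layer that stays fixed after construction.
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        # Assumption: a small ridge term keeps the initial inverse well conditioned.
        self.ridge = ridge
        self.P = None
        self.beta = np.zeros((n_hidden, n_outputs))

    def hidden(self, X):
        # Sigmoid activations of the random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit_initial(self, X, T):
        # Batch (ridge) least-squares solution on the first chunk.
        H = self.hidden(X)
        self.P = np.linalg.inv(H.T @ H + self.ridge * np.eye(H.shape[1]))
        self.beta = self.P @ H.T @ T

    def partial_fit(self, X, T):
        # Recursive least-squares update: earlier chunks are not revisited.
        H = self.hidden(X)
        K = np.linalg.inv(np.eye(H.shape[0]) + H @ self.P @ H.T)
        self.P = self.P - self.P @ H.T @ K @ H @ self.P
        self.beta = self.beta + self.P @ H.T @ (T - H @ self.beta)

    def predict(self, X):
        return self.hidden(X) @ self.beta


# Synthetic stream (assumed data): 300 samples, 4 features, one binary target.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
T = (X[:, :1] > 0).astype(float)

model = OSELM(n_inputs=4, n_hidden=20, n_outputs=1)
model.fit_initial(X[:100], T[:100])
for start in range(100, 300, 50):
    # Each new chunk updates the output weights without retraining from scratch.
    model.partial_fit(X[start:start + 50], T[start:start + 50])
```

After all chunks are processed, the output weights match, up to numerical error, those of a single ridge regression on the full stream using the same hidden layer. This shows that the sequential update loses no information relative to batch training. Handling drift additionally requires some way to forget or re-weight old data, which this sketch does not attempt.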
