Cyberattack detection in wireless sensor networks using a hybrid feature reduction technique with AI and machine learning methods

Behiry, Mohamed H.; Aly, Mohammed

doi:10.1186/s40537-023-00870-w

Research
Open access
Published: 13 January 2024

Journal of Big Data volume 11, Article number: 16 (2024)

Abstract

This paper proposes an intelligent hybrid model that leverages machine learning and artificial intelligence to enhance the security of Wireless Sensor Networks (WSNs) by identifying and preventing cyberattacks. The study employs feature reduction techniques, including Singular Value Decomposition (SVD) and Principal Component Analysis (PCA), along with a K-means clustering model enhanced with information gain (KMC-IG) for feature extraction. The Synthetic Minority Over-sampling Technique (SMOTE) is introduced for data balancing, followed by intrusion detection and network traffic categorization. The research evaluates the accuracy, precision, recall, and F-measure of a deep learning-based feed-forward neural network algorithm across three vital datasets: NSL-KDD, UNSW-NB 15, and CICIDS 2017, considering both full and reduced feature sets. Comparative analysis against benchmark machine learning approaches is also conducted. The proposed algorithm demonstrates exceptional performance, achieving high accuracy and reliability in intrusion detection for WSNs. The study outlines the system configuration and parameter settings, contributing to the advancement of WSN security.

Introduction

The use of artificial intelligence (AI) for cyberattack detection in wireless sensor networks (WSNs) with a hybrid feature reduction technique involves developing a system that can effectively detect and classify cyberattacks in WSN environments. The system combines machine learning and deep learning techniques to reduce the high-dimensional feature space while improving intrusion detection performance. This is achieved with a hybrid feature reduction technique that incorporates K-means clustering and entropy-based mutual information feature ranking to extract and rank the most relevant features. The system is then trained using a feed-forward deep neural network to accurately categorize network traffic. Overall, the aim is to provide early detection and learning systems with high-performance features for efficient cyberattack detection and prevention in WSN environments. WSNs are increasingly being damaged by cyberattacks. We developed a WSN security approach that employs cybersecurity technologies such as machine learning (ML) to recognize and counter WSN-related risks. Artificial intelligence models need specialized cybersecurity defense and protection solutions. At a minimum, information systems, computers, networks, servers, and data must be protected from WSN-related threats with respect to integrity, availability, and confidentiality. Cybersecurity measures must be maintained to safeguard sensitive information from online thieves. Cybersecurity protects virtual computers, cloud services, and network topologies, and it also helps to stop cybercrime and aids forensic investigations. Because a DNS server lacks adequate security of its own, it requires outside protection to stop hackers from stealing its data; implementing cybersecurity measures can prevent such unauthorized access by cybercriminals.

Cybersecurity is the practice of protecting computer and mobile networks, software, servers, and electronic systems against viruses and malware. The menace of global cybercrime continues to grow, with more than 10 billion additional records affected. In the US, the National Institute of Standards and Technology (NIST) developed a cybersecurity framework. Machine learning (ML), a subset of artificial intelligence, is used in cybersecurity applications including prediction systems and the detection of zero-day attacks. The four types of ML methodologies are supervised, unsupervised, semi-supervised, and reinforcement learning. ML models are designed to operate under stable conditions, so cyberattacks can create an unstable situation for them. Deep learning (DL) may be thought of as a group of machine learning algorithms that process data through several stages and are trained on various datasets. In light of the growth of cybercrime, cybersecurity in WSNs focuses on detecting attacks to safeguard shared and stored information and data. Many machine learning methods can render simulated attacks ineffective in intrusion detection systems for SCADA and VANET environments. Core machine learning methods and their subcategories are also used in cybersecurity to identify malware, spam, and rejection attacks, and for biometric identification. ML methods have also been recommended for categorizing attacks on the MQTT protocol, using a newly created dataset.

The goals of WSN security that we are going to discuss here are data secrecy, data availability, data authenticity and integrity, data freshness, self-organization, time synchronization, and secure localization.

Threats and attacks in WSN: Attacks can be classified by performer, objective, and layer-wise features.

I.
Attacks having a particular objective, which fall under either the active attack or passive attack categories.
II.
Performer-oriented attacks, which fall under either the inside attacks or outside attacks categories.
III.
Layer-oriented attacks, which target the data link, physical, transport, or network levels.

Motivated by the goals of WSN security, a deep feed-forward neural network (DFFNN) model with k-means clustering (KMC) and information gain (IG) methods is proposed for attack detection. The main contributions are described below:

1.
The data is over-sampled and cleaned using the SMOTE-based ENN method, which also produces balanced data for further processing.
2.
Using the optimum features retrieved from the dataset, a DLFFNN approach is proposed to evaluate the validity of the models.
3.
The KMC-IG approach is created to retrieve the best features from datasets including UNSW-NB15, NSL-KDD, and CICIDS2017.

In this work, three widely used datasets (NSL-KDD, UNSW-NB 15, and CICIDS 2017) are taken into consideration for evaluating the proposed work. For each dataset, the recommended approach's accuracy, precision, recall, and F-measure are evaluated under the full features and reduced features conditions. The outcomes of the proposed DFFNN-KMC-IG are also contrasted with those of benchmark machine learning methodologies. This approach incorporates deep learning and machine learning in three stages: feature reduction, extraction of features, and categorization. These stages enable early attack detection, which is required to halt the reduction in resource availability that attacks cause.

The structure of this paper is organized as follows. Section "Related work" focuses on the related work. Section "Knowledge and background" presents knowledge and background in four parts: part 1 explains types of cyberattacks such as malware, phishing, man-in-the-middle attacks, SQL injection, and DNS tunnelling; part 2 includes a few instances of cyberattacks in 2022, namely the theft at Crypto.com, the breach of data at the Red Cross, and the Cash App data breach; part 3 discusses the significance of cybersecurity; and part 4 covers the types of cybersecurity, such as cloud security, mobile security, security with zero trust, network security, application security, IoT security [1, 2], and end-point security. Section "Research methodology" focuses on the research methodology, including the proposed architecture workflow and algorithms: the "Data pre-processing stage", which includes encoding features based on labels and feature normalization using the logarithmic technique; the "Data splitting" stage; "Feature extraction and selection using KMC-IG-based FES"; the "Data balancing using SMOTE and ENN stage"; and the "Training and validation stage", which explains the DFNN and some traditional machine learning (ML) models. Section "Experiments and results" presents the experiments and results, including dataset description and modelling, binary and multi-class classification with the full and reduced feature sets, and comparisons with current related work. Section "Conclusion" is devoted to the conclusion of this study.

Related work

In their work, Kaur Saini et al. [3] conducted an evaluation of cyberattacks, while Chelli [4] investigated security issues and challenges in wireless sensor networks, including attacks and countermeasures. Daojing He et al. [5] focused on the cybersecurity defense of wireless sensor networks for smart grid monitoring. Padmavathi and Shanmugapriya [6] surveyed attacks in wireless sensor networks, covering security mechanisms and challenges. Al-Sakib Khan Pathan et al. [7] investigated security issues and challenges in wireless sensor networks, while Perrig et al. [8] discussed security in wireless sensor networks. Jian-hua Li [9] conducted a survey on the intersection of cybersecurity and artificial intelligence. Handa et al. [10] reviewed machine learning in cybersecurity, and Thomas et al. [11] investigated machine learning approaches for cybersecurity analytics. Gaganjot et al. [12] discussed secure cyber-physical systems for smart cities, while Boussi and Gupta [13, 14] developed a framework for combating cybercrime. Kumar [15] researched artificial intelligence-based approaches for intrusion detection. Shahnaz Saleem et al. [16] focused on network security threats in wireless body area networks, and Kalpana Sharma [17] outlined security issues in wireless sensor networks. Martins and Guyennet [18] provided a brief overview of wireless sensor network attacks and security procedures, while Anitha S. Sastry [19] examined security threats at every layer of wireless sensor networks. Kaplantzis [20] investigated security approaches for wireless sensor networks, and Chris and Wagner [21] explored secured routing and countermeasures. Yanli Yu et al. [22] investigated trust algorithms in wireless sensor networks, including hazard analysis. Xu et al. [23] explored the feasibility of launching and detecting jamming attacks in wireless networks, while Xu [24] investigated safeguarding wireless sensor networks from interference through channel surfing. 
Sohrabi [25] explored protocols for self-organizing wireless sensor networks. David and Scott [26] investigated denial-of-service attacks and defenses in wireless sensor networks. Parno and Gligor [27] explored the detection of node replication attacks in sensor networks. Xiao et al. [28] reviewed key management systems in wireless sensor networks. Abhishek Jain et al. [29] investigated cryptographic protocols for wireless sensor networks. Daniel E. Burgner and Luay Wahsheh [30] investigated wireless sensor network cybersecurity. Zhu et al. [31] investigated effective security solutions for large-scale wireless sensing networks. Culler and Hong [32] conducted research on wireless sensor networks. Makhija et al. [33] used machine learning techniques to classify attacks on MQTT-based IoT systems. Wang [34] explored an ensemble technique based on hybrid spectral segmentation in sensor networks. Zhang [35], on the other hand, used adversarial feature extraction to defend against evasion attacks. Regarding related works on the same datasets, Tavallaee et al. [36] studied the NSL-KDD dataset and the KDD CUP 99 data set in detail. Sonule et al. [37] focused on the UNSW-NB15 dataset and ML. Sharafaldin et al. [38] focused on generating a new intrusion detection dataset, CICIDS2017, and on intrusion traffic characterization. Aly and Alotaibi studied the modified gedunin using ML [39]. The referenced literature covers a broad spectrum of machine learning applications in security domains. Johri et al. [40] provide an overarching view of machine learning algorithms for intelligent systems, setting the stage for diverse applications. Rikabi and Hazim [41] propose an innovative fusion of encryption and steganography to enhance communication system security. Ahmad et al. [42] offer a comprehensive perspective on challenges in securing wireless sensor networks using machine learning. Ismail et al. [43] conduct a comparative analysis of machine learning models for cyber-attack detection in wireless sensor networks, while Khoei et al. [44] explore dynamic techniques against GPS spoofing attacks on UAVs. Karatas [45] focuses on refining machine learning-based intrusion detection systems, specifically addressing dataset challenges. Together, these studies underscore the vital role of machine learning in fortifying security measures across various technological domains, providing diverse strategies to tackle evolving threats.

Traditional approaches to WSN security

Traditional methods have laid the groundwork for securing Wireless Sensor Networks (WSNs). Cryptographic techniques, as discussed by Dong et al. [46], play a vital role in ensuring data confidentiality and integrity. Access control mechanisms, as explored by Zhang et al. [47], contribute to regulating network access, preventing unauthorized intrusions. While effective, traditional methods may face challenges in adapting to the dynamic nature of cyber threats.

Machine learning-based intrusion detection in WSNs

Machine learning (ML) techniques have been extensively explored for intrusion detection in WSNs. Recent studies, such as the work by Li et al. [48], utilize decision trees, support vector machines, and ensemble methods to leverage features extracted from network traffic data. Despite their effectiveness, ML-based methods may encounter challenges in adapting to new and evolving attack patterns.

Deep learning in WSN security

Deep learning techniques have gained attention for enhancing WSN security. Research by Wang et al. [49] explores the use of deep neural networks and attention mechanisms to capture intricate patterns in network data. Despite promising results, challenges related to interpretability and the need for substantial labeled data persist in deep learning approaches, as discussed by Chen et al. [50].

Clustering techniques for anomaly detection

Clustering algorithms, particularly K-means clustering, continue to be applied for anomaly detection in WSNs. The study by Kim et al. [51] demonstrates the use of clustering to group similar network behaviors, aiding in anomaly detection by identifying deviations from established norms. While effective, the dynamic nature of WSNs may influence the performance of clustering methods.

Feature reduction methods in WSN security

Feature reduction remains critical for enhancing the efficiency of intrusion detection systems. Recent studies, such as the work by Jingjing et al. [52], explore techniques like Singular Value Decomposition (SVD) and Principal Component Analysis (PCA) for reducing the dimensionality of data. These methods contribute to the identification of key features associated with specific attack categories.

Comparative studies and benchmarking

Comparative studies, such as the one conducted by Zhao et al. [53], benchmark various intrusion detection approaches in WSNs. These studies assess the strengths and weaknesses of different methods in terms of accuracy, precision, recall, and F-measure. Benchmarking provides insights into the relative performance of different techniques, guiding the selection of optimal models for specific WSN scenarios.

Challenges and open issues

Challenges persist in WSN security, as highlighted by recent research. Adapting to dynamic network conditions, ensuring scalability, and addressing the limitations of existing approaches remain open issues. The trade-off between detection accuracy and resource consumption is a constant challenge, as discussed by Liu et al. [54].

Summary and positioning

In the dynamic landscape of WSN security, recent literature reflects a continuous evolution from traditional methods to sophisticated machine learning and deep learning approaches. The proposed Deep Forward Neural Network (DFNN) classification model, as outlined in our study, seeks to address challenges observed in previous works by integrating feature reduction, clustering, and deep learning for robust intrusion detection and classification in WSNs.

This "Related work" section includes recent references and provides a detailed analysis of existing literature, establishing the context for the proposed DFNN classification model in the rapidly advancing field of WSN security research.

The following topics have not previously been studied and represent the research gap in current related works:

Identifying cyberattacks in wireless sensor networks using a hybrid feature reduction technique and machine learning has not been investigated.
The DLFFNN methodology has not been combined with the SMOTE-based ENN method.
The K-means clustering-based information gain (KMC-IG) technique has not been employed to extract the best features from datasets like UNSW-NB15, NSL-KDD, and CICIDS2017.

Knowledge and background

i. Types of cyber attacks

A cyberattack is a malicious and unlawful attempt to steal valuable information and data from a specific person without that person's knowledge. Hackers profit from the sensitive data of valuable firms as cyberattacks rise every year. Cybercrime has cost more than 500,000 dollars over the last few years. The most typical forms of cyberattacks are as follows:

a)
Malware: The word "malware" refers to unapproved programmes, applications, viruses, and worms. Malware is installed when a user clicks email or message links and downloads unapproved programmes. Once installed, the malware can do the following:
1.
  Block internal security modules.
2.
  Introduce dangerous software into the system.
3.
  Continuously transmit data from the computer's hard disc.
b)
Phishing: Phishing is a generic term for the fraudulent practice of sending emails that appear to come from a trusted source in order to obtain personal information. It is frequently used to get financial information, such as credit card details. The hacker infects computers and mobile devices with malware through a link in the email in order to steal crucial data.
c)
Man-in-the-middle attack: The man-in-the-middle attack, commonly referred to as an eavesdropping attack, typically involves hackers who intercept network traffic. After gaining access to the network, the hacker plants a bug in the system that enables access to information from all of the victim's machines. When a user authenticates to public WiFi, the hacker exploits weaknesses in the network to intercept the traffic.
d)
SQL injection: A structured query language (SQL) injection attack occurs when hackers insert malicious code, such as a virus or access-control code, into a server that uses SQL. When the malicious code is run, the hacker gains access through this gateway and can steal personal information.
e)
DNS tunnelling: DNS tunnelling carries HTTP or other protocol traffic inside DNS messages in order to communicate, over a specific port, with network-connected devices that are not otherwise reachable through the DNS server. Once connected, the hacker can use the DNS protocol to steal information online.

ii. Instances of cyberattacks in 2022

1)
Theft of Crypto.com: This assault took place on January 17 and targeted the bitcoin wallets of 500 users. The hacker stole approximately 18 million dollars in bitcoins, 15 million dollars in Ethereum, and other cryptocurrencies.
2)
Breach of data at the Red Cross: In January, hackers attacked the servers containing the personal data of almost 500,000 people who received assistance from the Red Cross movement. The compromised server contained information about the organisation as well as the victims' personal and family information.
3)
Cash app data breach: Cash App acknowledged that a hacker with broad access to the business had gained access to the cash servers. In addition, this breach included hacking of client information, company data, account numbers, inventory data, portfolio values, and other confidential financial data.

iii. Significance of cyber security

Cybersecurity needs to be a top priority for every nation's military, government, commercial, private, medical, and financial organisations, since they store a lot of data on servers, the cloud, and other devices. Whether or not the data is sensitive, it can still cause problems for the business if intellectual, economic, financial, or any other type of data is open to illegal access or public inspection. Poor security in any application or website puts both personal and organisational futures at risk. Many firms are creating their own protection software to shield their sensitive data from security risks and attacks. Cybersecurity is crucial because it guards against viruses and malware and safeguards information as well as computer systems. Cybercrime is on the rise, and businesses and organisations, particularly those in the health, economic, and national safety sectors, need to take extra precautions to secure their data because the future of any nation depends on it. Every firm needs cybersecurity to safeguard its critical data from hackers. In April 2013, the nation's top intelligence officials warned that cyberattacks and online surveillance posed a threat to national security. Every person should be concerned about cybersecurity. Security should be maintained whenever a system or its files are connected to the internet, to prevent cybercrime and decrease the chance of cyberattacks.

iv. Types of cyber security

Various forms of cybersecurity exist, including cloud security, mobile security, zero trust, network security, application security, IoT security, and end-point security. Each is explained below:

1)
Cloud security: Cloud security is also called cloud computing security. Many businesses nowadays are implementing cloud computing for their operations, so ensuring cloud security is a primary concern. Cloud security consists of solutions, policies, and services that safeguard the whole organization's cloud communications and architecture. Cloud security companies frequently provide a third-party solution to safeguard an organization's cloud data.
2)
Mobile security: Malicious software, phishing scams, and instant messaging assaults must be prevented even on locked mobile phones, computers, and other tiny electronic devices. These hacks are stopped by mobile security systems, which also protect user data. When connected to the assets of the company, mobile device management (MDM) solutions will provide or guarantee access to the specific application.
3)
Security with zero trust: Zero-trust security is also known as zero-trust architecture (ZTA). The conventional security model places an emphasis on the perimeter and calls for the construction of fortified walls around the organization's most important assets. However, this strategy has several severe problems, including possible risks. Zero-trust security is a strategic approach to cybersecurity that aims to continuously validate digital interactions.
4)
Network security: Most attacks occur in this area. Network security comprises technologies and programmes that stop hackers from breaking into networks and that safeguard data integrity and usability on personal and computer networks. Strategies used to avoid data theft include data loss prevention (DLP), identity and access management (IAM), network access control (NAC), and next-generation firewall restrictions.
5)
Application security: Application security refers to security at the application level. Because they are directly connected to the internet, web apps are vulnerable to data theft. Weaknesses in online applications include cross-site scripting, broken authentication, and injection. Application security prevents unauthorized interaction with apps and APIs.
6)
IoT: IoT security is a procedure used to protect IoT systems from dangers. The effectiveness of IoT devices boosts productivity in today's environment where the Internet of Things plays a significant role in all facets of the enterprise. Tools for Internet of Things security aid in defending against dangers and breaches. Device identification, device authentication, and data encryption can all help to safeguard IoT systems.
7)
End-point security: Remote computer access occurs in every company. Controlling an organization's end or entrance points, such as computers, laptops, and electrical controllers, is known as end-point security.

Research methodology

This study proposes a K-means clustering model enhanced with information gain (KMC-IG) for feature reduction/extraction and ranking. Additionally, the Synthetic Minority Over-sampling Technique (SMOTE) is suggested for data balancing. The final critical stage involves the classification of network traffic for intrusion detection. The network traffic feature datasets undergo several stages in succession, and for each dataset, the accuracy, precision, recall, and F-measure of the proposed approach are evaluated under the full features and reduced features scenarios. Furthermore, the performance of the proposed DFFNN-KMC-IG is compared to that of benchmark machine learning algorithms. By combining the strengths of DL and ML, the proposed hybrid model adapts the reduced attributes to improve their quality.

Wireless Sensor Network intrusion detection systems (WSN-IDS) are crucial for ensuring the security of networked computer systems, but many WSN-IDS still struggle with efficiency. As the feature space grows, the accuracy of existing ML-based WSN-IDS techniques decreases. In this work, feature extraction and optimization are performed using the proposed K-means clustering with information gain approach.

As shown in Fig. 1, features are first extracted from captured network traffic packets (PCAP). In the pre-processing step, the network traffic feature datasets are represented using label encoding and normalized using logarithmic or min–max approaches. The data splitting step then produces the training, validation, and testing sets. KMC-IG feature reduction and selection is applied to all three, producing reduced-feature training, validation, and testing sets. Data balancing with SMOTE and ENN is applied to the reduced-feature training set only; the reduced-feature validation set is used without balancing. In the training and validation stage, the proposed Deep Forward Neural Network (DFNN) classification model and some conventional machine learning (ML) models are developed and trained. The next stage is evaluation, which tests the trained DFNN model and the other trained ML models; the classification report includes confusion matrices, accuracy, F1-score, recall, and precision. Lastly, the comparison stage compares the findings with some current related results.

Each of these elements performs a crucial role and significantly affects the effectiveness of the WSN-IDS model. The design of the planned work for developing WSN-IDS is shown in Fig. 1.

The following overview describes how the proposed Deep Forward Neural Network (DFNN) classification model works, including the layers used in its architecture.

Proposed method overview:

1.
Input layer:
- The DFNN classification model takes as input features extracted from network traffic data in the context of Wireless Sensor Networks (WSNs).
- Features could include information related to packet headers, traffic patterns, and other relevant attributes obtained from the monitored WSN.
2.
Feature reduction:
- The input features undergo a feature reduction process. This may involve techniques such as Singular Value Decomposition (SVD) and Principal Component Analysis (PCA), as suggested in the paper. The goal is to reduce the dimensionality of the feature space while retaining critical information.
3.
K-Means Clustering Model with Information Gain (KMC-IG):
- A K-Means Clustering Model enhanced with Information Gain (KMC-IG) is applied to further refine and cluster the reduced features. This step aims to identify patterns and group similar behaviors within the dataset.
4.
Synthetic Minority Over-sampling Technique (SMOTE):
- The Synthetic Minority Over-sampling Technique (SMOTE) is introduced, likely during or after the clustering stage, to address imbalances in the dataset. This technique involves generating synthetic instances of minority class samples to balance the distribution.
5.
Deep Forward Neural Network (DFNN):
- The core of the proposed method is the Deep Forward Neural Network (DFNN). This neural network architecture is designed specifically for intrusion detection and classification in WSNs.
- The DFNN likely consists of multiple layers, including input, hidden, and output layers. The activation functions, such as ReLU (Rectified Linear Unit) or others, are applied between the layers to introduce non-linearity and capture complex relationships in the data.
6.
Evaluation metrics:
- The performance of the DFNN is evaluated using standard metrics such as accuracy, precision, recall, and F-measure. These metrics provide a comprehensive assessment of the model's ability to accurately classify instances, especially in the context of intrusion detection.
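
The clustering part of step 3 can be illustrated with a minimal K-means sketch (Lloyd's algorithm) in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `kmeans`, the deterministic initialisation from the first `k` points, the fixed iteration count, and the toy points are all hypothetical.

```python
import math

def kmeans(points, k, n_iter=20):
    """Plain Lloyd's algorithm; returns (centroids, assignments)."""
    # Assumption: initialise centroids from the first k points so the run is deterministic.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments

# Hypothetical toy data: two well-separated groups of 2-D feature vectors.
points = [[0, 0], [10, 10], [0, 1], [1, 0], [10, 11], [11, 10]]
centroids, assignments = kmeans(points, k=2)
```

In the proposed pipeline the resulting cluster assignments would feed the information gain ranking; how the authors choose `k` is not stated in this section.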

Hypothetical DFNN architecture:

Let's outline a hypothetical architecture for the DFNN:

Input layer: number of neurons equal to the number of features after feature reduction.
Hidden layers: multiple hidden layers with varying numbers of neurons. The architecture may include fully connected layers to capture intricate relationships.
Activation function: ReLU (Rectified Linear Unit) or another suitable non-linear activation function to introduce non-linearity.
Output layer: number of neurons equal to the number of classes (types of intrusions) in the dataset, typically using a softmax activation function for classification.
Loss function: cross-entropy loss, commonly used for classification tasks.
Optimization algorithm: Adam or another suitable optimization algorithm for updating weights during training.
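
The hypothetical architecture above can be sketched as a forward pass in plain Python. This is a minimal sketch, not the authors' network: the hidden-layer sizes, the He-style weight initialisation, the seed, and the five output classes are assumptions; only the 16 input features echo the reduced NSL-KDD feature count reported in this paper.

```python
import math
import random

def relu(v):
    return [max(0.0, x) for x in v]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, weights, biases):
    # weights holds one row of input weights per output neuron
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def build_network(layer_sizes, seed=0):
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        scale = math.sqrt(2.0 / n_in)  # assumption: He-style initialisation
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        network.append((weights, [0.0] * n_out))
    return network

def forward(network, x):
    # ReLU on every hidden layer, softmax on the output layer
    for weights, biases in network[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = network[-1]
    return softmax(dense(x, weights, biases))

def cross_entropy(probs, true_class):
    return -math.log(max(probs[true_class], 1e-12))

# 16 inputs (reduced NSL-KDD features); hidden sizes and 5 classes are assumed.
net = build_network([16, 32, 16, 5])
probs = forward(net, [0.5] * 16)
loss = cross_entropy(probs, true_class=2)
```

The output `probs` is a probability distribution over the assumed classes, and `loss` is the cross-entropy for one labelled sample. A deep learning framework would normally replace these hand-written layers and supply the optimizer.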

Training process:

The DFNN is trained using the labeled dataset, considering both reduced features and clustering results.
Backpropagation is employed to update the weights of the network, optimizing its ability to classify instances accurately.
The model undergoes training iterations until convergence, minimizing the chosen loss function.
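
The update rule behind this training process can be shown on a deliberately simplified model. The sketch below trains a single softmax layer with stochastic gradient descent, so it illustrates the cross-entropy gradient and the convergence check rather than the full multi-layer DFNN. The learning rate, tolerance, epoch limit, and toy samples are assumptions.

```python
import math

def softmax(v):
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(weights, biases, x):
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)

def mean_loss(weights, biases, samples):
    losses = [-math.log(max(predict_proba(weights, biases, x)[y], 1e-12))
              for x, y in samples]
    return sum(losses) / len(losses)

def train(samples, n_features, n_classes, lr=0.5, max_epochs=500, tol=1e-6):
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    history = [mean_loss(weights, biases, samples)]
    for _ in range(max_epochs):
        for x, y in samples:
            probs = predict_proba(weights, biases, x)
            for c in range(n_classes):
                # Gradient of cross-entropy w.r.t. logit c is p_c - 1[c == y].
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(n_features):
                    weights[c][j] -= lr * grad * x[j]
                biases[c] -= lr * grad
        history.append(mean_loss(weights, biases, samples))
        # Stop once the loss no longer changes meaningfully.
        if abs(history[-2] - history[-1]) < tol:
            break
    return weights, biases, history

# Hypothetical, linearly separable toy samples: (features, class label).
toy = [([0.0, 0.1], 0), ([0.2, 0.0], 0), ([1.0, 0.9], 1), ([0.8, 1.0], 1)]
weights, biases, history = train(toy, n_features=2, n_classes=2)
```

In a multi-layer network, backpropagation carries the same output-layer gradient backwards through each hidden layer by the chain rule.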

Evaluation:

The performance of the trained DFNN is evaluated on separate test datasets, considering both full and reduced feature sets.
Evaluation metrics such as accuracy, precision, recall, and F-measure are computed to assess the model's effectiveness in intrusion detection and classification.
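
These four metrics follow directly from the confusion-matrix counts. The sketch below computes them for the binary case in plain Python; the function names and the toy label vectors are hypothetical, and the zero-division fallbacks are an assumption.

```python
def confusion_counts(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive=1):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    # Assumption: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Hypothetical labels: 1 = attack, 0 = normal traffic.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

For multi-class evaluation, the same counts can be computed once per class by treating that class as `positive` and then averaging.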

Proposed architecture workflow and algorithms

Data pre-processing stage

In this stage, the network traffic feature datasets are represented using label encoding, and the features are normalized using a logarithmic technique.

The IDS model's detection abilities and efficiency can be improved by data preparation. According to the suggested paradigm, there are two main steps in data preprocessing:

Encoding features based on labels

Feature encoding is the process of converting non-numeric (symbolic or text) attributes to numeric values. Because datasets used in intrusion detection frequently include discrete, symbolic, and continuous data, all symbolic attributes must be converted into numeric values. The two most common techniques are label encoding and one-hot encoding. The indicator variables that one-hot encoding produces for each class greatly increase the dimensionality of the dataset, which has a substantial influence on the performance of deep learning algorithms. scikit-learn's label encoding is therefore employed. Features are then normalized, which keeps values in the same range for the best processing.
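
The difference between the two encodings can be seen in a small plain-Python sketch. The helper names and the protocol values are illustrative assumptions; the paper itself uses scikit-learn's label encoder, which likewise assigns integer codes to the sorted class labels.

```python
def fit_label_encoder(values):
    # Map each distinct symbol to an integer, in sorted order.
    return {label: idx for idx, label in enumerate(sorted(set(values)))}

def label_encode(values, mapping):
    return [mapping[v] for v in values]

def one_hot_encode(values, mapping):
    width = len(mapping)
    return [[1 if mapping[v] == i else 0 for i in range(width)]
            for v in values]

# Hypothetical symbolic feature column.
protocols = ["tcp", "udp", "icmp", "tcp"]
mapping = fit_label_encoder(protocols)
encoded = label_encode(protocols, mapping)    # one integer column
one_hot = one_hot_encode(protocols, mapping)  # one column per distinct value
```

Label encoding keeps a single column however many distinct values the feature has, whereas one-hot encoding adds one column per value.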

Feature normalization using logarithmic technique

In this study, normalization is done in two steps. First, as mentioned in Eq. (1), the logarithmic standardization is carried out to bring all the characteristics into an acceptable range, and then the values are proportionately limited to the range [0,5] in Eq. (2).

$$f{r}_{norm}={\text{log}}\left(f{r}_{i}+1\right)$$

(1)

$$f{r}_{norm}=j+(k-j)\frac{f{r}_{i}-{\text{min}}(f{r}_{i})}{{\text{max}}\left(f{r}_{i}\right)-{\text{min}}(f{r}_{i})}$$

(2)

where j = 0 and k = 5.
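
The two normalization steps can be sketched as follows. The sketch assumes that the scaling step is applied to the output of the logarithmic step and that the target range is [0, 5] as stated in the text; the function names and the sample column are hypothetical.

```python
import math

def log_normalize(column):
    """Step 1: compress large values with log(x + 1)."""
    return [math.log(x + 1.0) for x in column]

def scale_to_range(column, low=0.0, high=5.0):
    """Step 2: min-max scale the values into [low, high]."""
    smallest, largest = min(column), max(column)
    if largest == smallest:
        # A constant column carries no information; map it to the lower bound.
        return [low for _ in column]
    span = largest - smallest
    return [low + (high - low) * (x - smallest) / span for x in column]

# Hypothetical raw feature values spanning several orders of magnitude.
duration = [0.0, 9.0, 99.0, 999.0]
scaled = scale_to_range(log_normalize(duration))
```

The logarithm is applied first so that a few very large values do not squeeze the remaining values towards zero after scaling.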

Data splitting stage

The data splitting model comprises a Training Set, Validation Set, and Test Set, which are described in detail in this section, along with the feature set modeling using the KMC-IG feature reduction technique. When applied to a dataset, KMC-IG reduces the feature set, resulting in the selection of 39 CICIDS2017 features, 13 UNSW-NB15 features, and 16 NSL-KDD features. The accuracy of binary and multi-class classification is evaluated using both the entire and reduced datasets. Data modeling involves three steps, namely feature extraction and selection (FES), data balancing, and categorization, to reduce the high-dimensional feature space and enhance intrusion detection performance.

Feature extraction and selection (FES) using KMC-IG

To overcome the issue of duplication and redundancy when using the high dimensionality feature sets of NSL-KDD, UNSW-NB15, and CICIDS2017, a DLFFNN model is developed that utilizes clustering and the FES concept of entropy-based mutual information. The study recommends using the data mining-based K-means clustering method as the feature extractor to address this problem.

Feature reduction is applied to the training set, followed by data balancing using SMOTE and ENN. This leads to the training and validation stage, where the proposed Deep Forward Neural Network (DFNN) classification model and several traditional machine learning (ML) models are built and trained.

KMC-IG-based FES

Feature extraction is carried out with K-means clustering, which groups the dataset according to the classification category. Following clustering, an entropy-based information gain (IG) feature ranking technique determines the score, or rank, of each feature for each cluster. High-scoring features are selected because they help to increase classification accuracy, while low-scoring features are discarded. When ${\mathbbm{x}}$ and ${\mathbbm{y}}$ are two random variables, the IG of each feature with respect to each cluster category is computed as follows:

$$IG\left(F{\mathbbm{x}}|F{\mathbbm{y}}\right)=E\left(F{\mathbbm{x}}\right)-CE\left(F{\mathbbm{x}}|F{\mathbbm{y}}\right)$$

(3)

$E\left(F{\mathbbm{x}}\right)$ and $CE\left(F{\mathbbm{x}}|F{\mathbbm{y}}\right)$ are the entropy and the conditional entropy, which measure uncertainty and are calculated from:

$$E\left(F{\mathbbm{x}}\right)=-{\sum }_{x\in F{\mathbbm{x}}}^{n}Prb\left(x\right){log}_{2}\left(Prb\left(x\right)\right)$$

(4)

$$CE\left(F{\mathbbm{x}}|F{\mathbbm{y}}\right)=-{\sum }_{y\in F{\mathbbm{y}}}^{n}Prb\left(y\right){\sum }_{x\in F{\mathbbm{x}}}^{n}Prb\left(x\mid y\right){log}_{2}\left(Prb\left(x\mid y\right)\right)$$

(5)

where Prb denotes probability, and a higher information gain indicates a stronger correlation. Therefore, if $IG\left(F{\mathbbm{x}}|F{\mathbbm{y}}\right)$ > IG (F|$F{\mathbbm{y}}$), the feature $F{\mathbbm{y}}$ is more strongly related to $F{\mathbbm{x}}$ than to F.
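
A small worked example may help to make the ranking concrete. The sketch below uses the standard definitions of entropy and conditional entropy, with the conditional entropy weighted by the probability of each conditioning value, and estimates probabilities from relative frequencies. The cluster labels and the two discrete features are hypothetical, and the K-means step itself is not shown.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def conditional_entropy(x, y):
    """Entropy of x given y, weighted by the frequency of each y value."""
    n = len(x)
    total = 0.0
    for y_value, y_count in Counter(y).items():
        x_given_y = [xi for xi, yi in zip(x, y) if yi == y_value]
        total += (y_count / n) * entropy(x_given_y)
    return total

def information_gain(x, y):
    """Reduction in the uncertainty of x once y is known."""
    return entropy(x) - conditional_entropy(x, y)

# Hypothetical cluster categories and two candidate features.
cluster = ["attack", "attack", "normal", "normal"]
feature_a = ["high", "high", "low", "low"]  # separates the clusters perfectly
feature_b = ["on", "off", "on", "off"]      # carries no cluster information

scores = {
    "feature_a": information_gain(cluster, feature_a),
    "feature_b": information_gain(cluster, feature_b),
}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here `feature_a` receives the highest score and would be kept, while `feature_b` would be discarded.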

Data balancing using SMOTE and ENN stage

Balancing the data strengthens the classifier's performance on imbalanced datasets such as NSL-KDD, CICIDS2017, and UNSW-NB15. Under-sampling and over-sampling are the usual techniques for addressing the problem of imbalanced datasets. In the suggested model, SMOTE and ENN are combined to balance the three datasets: SMOTE performs the oversampling, and ENN performs data cleaning and noise reduction. SMOTE is applied to the set M of minority instances. For every instance $f{x}_{i}$ of M, the following formula generates n artificial instances:

$$f{x}_{syn}=\eta f{x}_{ri}+f{x}_{i}\left(1-\eta \right)$$

(6)

where $f{x}_{ri}$ is an instance selected at random from the neighbours of $f{x}_{i}$, which are found with the K-nearest neighbours (KNN) technique, and $\eta$ is a random variable taking values in the interval [0, 1]. ENN is then applied to the set N of all instances: any instance $f{x}_{i}\in N$ whose neighbours mostly belong to a different class is eliminated.
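
The generation of one synthetic instance can be sketched as below. The sketch assumes the standard SMOTE interpolation, in which the synthetic point lies on the line segment between a minority instance and one of its minority neighbours; the helper names, the sample points, and the value of k are hypothetical.

```python
import random

def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest_neighbours(x_i, minority, k):
    """Return the k minority instances closest to x_i (excluding x_i itself)."""
    others = [p for p in minority if p is not x_i]
    others.sort(key=lambda p: squared_distance(x_i, p))
    return others[:k]

def smote_sample(x_i, x_ri, eta):
    """Interpolate between x_i and its neighbour x_ri with weight eta in [0, 1]."""
    return [a + eta * (b - a) for a, b in zip(x_i, x_ri)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [5.0, 5.0]]
x_i = minority[0]
x_ri = rng.choice(nearest_neighbours(x_i, minority, k=2))
synthetic = smote_sample(x_i, x_ri, rng.random())
```

Repeating the last two lines n times for each minority instance yields the n artificial instances per instance mentioned in the text.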

The following stages describe how the ENN operates:

1. Calculate the K nearest neighbours of $f{x}_{i}\in N$ using KNN.
2. If most of its nearest neighbours belong to a different class, delete the instance $f{x}_{i}$.
3. Repeat this procedure for all majority-class instances in N.
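
The stages above can be sketched as a single pass over the data. This is a simplified illustration rather than the authors' implementation: it examines every instance regardless of class, uses squared Euclidean distance, and fixes K at 3, all of which are assumptions, as are the sample points.

```python
from collections import Counter

def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def edited_nearest_neighbours(points, labels, k=3):
    """Return the indices of instances whose k nearest neighbours agree with their label."""
    keep = []
    for i, (point, label) in enumerate(zip(points, labels)):
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: squared_distance(point, points[j]),
        )[:k]
        votes = Counter(labels[j] for j in neighbours)
        majority_label = votes.most_common(1)[0][0]
        if majority_label == label:
            keep.append(i)  # neighbours agree, so the instance is retained
    return keep

# Hypothetical data: index 4 is an "attack" point sitting inside the "normal" group.
points = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
          [5, 5], [5, 6], [6, 5], [6, 6]]
labels = ["normal"] * 4 + ["attack"] * 5
kept = edited_nearest_neighbours(points, labels, k=3)
```

The instance at index 4 is removed because all of its nearest neighbours carry a different label, which is the noise-cleaning effect attributed to ENN.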

For the full feature set, the features are only encoded and normalized using Algorithm 1. For the reduced feature set, Algorithm 1 is applied first, and Algorithm 2 is then applied for feature extraction and selection. SMOTE and ENN are subsequently used to balance the reduced dataset with respect to the minority instances.

ENN, or Edited Nearest Neighbors, is a method often employed for cleaning datasets and reducing noise. In this study, ENN is used in conjunction with SMOTE (Synthetic Minority Over-sampling Technique) to address imbalances in datasets such as NSL-KDD, CICIDS2017, and UNSW-NB15.

The procedure can be broken down as follows:

1. Imbalanced datasets: The starting point is the challenge of imbalanced datasets, where certain classes have significantly fewer instances than others.
2. SMOTE for oversampling: SMOTE is introduced as a solution for oversampling the minority class. It generates synthetic instances in the feature space to balance the dataset, focusing on the minority class.
3. SMOTE applied to minority instances: SMOTE is applied specifically to the M set (the set of minority instances) to create synthetic instances and balance the class distribution.
4. ENN for data cleaning and noise reduction: ENN examines the instances and removes those that are misclassified by their nearest neighbors. This refines the dataset and eliminates noisy samples.
5. Utilizing SMOTE and ENN together: SMOTE and ENN are used in tandem to obtain a balanced and cleaned dataset. SMOTE addresses the imbalance by creating synthetic instances, and ENN improves the data quality by identifying and eliminating noisy samples.

In short, SMOTE oversamples the minority class and ENN cleans the dataset by removing instances that may introduce noise. The combination of these techniques aims to enhance the performance of a classifier on imbalanced datasets.

Training and validation stage

In this stage, the proposed Deep Forward Neural Network (DFNN) classification model is built and trained alongside several traditional machine learning (ML) models.

The use of a validation set in machine learning, including for the proposed Deep Forward Neural Network (DFNN) classification model, is justified for several reasons:

1. Model generalization: The primary goal of any machine learning model, including neural networks, is to generalize well to unseen data. The validation set provides a means to assess how well the DFNN performs on data it has not encountered during training.
2. Hyperparameter tuning: During training, hyperparameters such as the learning rate, the batch size, or the number of hidden layers are optimized to enhance the model's performance. The validation set supports this tuning by providing an independent dataset for evaluating different configurations.
3. Preventing overfitting: Overfitting occurs when a model learns the training data too well, capturing noise and specificities that do not generalize. The validation set acts as a safeguard against overfitting by offering an evaluation of the model's performance on data it has not seen before.
4. Early stopping: The validation set is often used in conjunction with early stopping. If performance on the validation set starts to degrade while training accuracy keeps improving, this indicates potential overfitting, and training is halted before the model becomes too specific to the training data.
5. Model selection: Where multiple models or architectures are being considered, the validation set aids in comparing their performance. It helps in selecting the best-performing model before that model is evaluated on a separate test set.
6. Avoiding data leakage: Tuning on the validation set keeps the test set out of the training process, so the model does not inadvertently learn patterns specific to the test set. This avoids data leakage, which would artificially inflate the model's performance on the test set.
7. Fine-tuning and iterative development: As the model evolves through iterative development, the validation set allows for fine-tuning. Adjustments to the model architecture or training process can be made based on validation set performance.
8. Ensuring robustness: By evaluating the model on a validation set, researchers can gauge its robustness across different subsets of the data. This is especially important when the dataset exhibits variability or heterogeneity.
9. Building confidence in results: Including a validation set adds rigor to the model evaluation process. It builds confidence in the reported performance metrics, as they are not based solely on the model's performance on the training data.

The validation set is therefore an integral part of the machine learning pipeline. It serves as a critical tool for model selection, hyperparameter tuning, and ensuring that the trained model generalizes well to new, unseen data, which is essential for the reliable deployment of the proposed DFNN classification model.
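
The early stopping rule mentioned in the list above can be sketched as follows. The patience value and the sequence of validation losses are hypothetical; the paper does not state which stopping criterion, if any, was used.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Hypothetical validation losses: improvement stalls after epoch 3.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.53, 0.56, 0.60]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
```

In practice the weights saved at `best_epoch` would be restored, so the final model is the one that performed best on the validation set rather than the last one trained.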

DFNN

Deep neural networks (DNNs) have emerged as the preferred technique for addressing complicated problems. A DNN is built from artificial neurons (ANs), which are modelled after the biological neurons in the brain. Each neuron sums the weighted data arriving at its inputs and passes the result on. Each DNN layer applies an activation function to its outputs to increase learnability and approximation power, which improves the model's ability to represent the non-linear nature of the real world. The activation function can take one of three forms: the sigmoid (sig), the rectified linear unit (ReLU), or the hyperbolic tangent (tanh(x)). The mathematical model of each activation function is given below:

$$\sigma_{{{\text{sig}}}} = \left( {{1} + {\text{e}}^{{ - {\text{x}}}} } \right)^{{ - {1}}}$$

(7)

$$Rf\left(x\right)=Max\left(0, x\right)$$

(8)

$${\text{tanh}}\left(x\right)=\frac{{e}^{2x}-1}{{e}^{2x}+1}$$

(9)

The DLFFNN is trained with the back-propagation learning technique, and the weights $(Wt)$ and biases are updated using the stochastic gradient descent (SGD) approach. The difference between the desired and the actual output is measured by the cost function, which is represented by the following expression:

$$Cost\left(Wt, bs;m,n\right)=0.5\parallel n-op\parallel^{2}$$

(10)
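
The three activation functions and the cost function can be written out directly, as in the sketch below. It assumes that n in Eq. (10) is the desired output vector and op is the actual network output; the function names and test values are hypothetical.

```python
import math

def sigmoid(x):
    """Eq. (7): squashes x into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Eq. (8): passes positive values and zeroes out negative ones."""
    return max(0.0, x)

def tanh(x):
    """Eq. (9): squashes x into the interval (-1, 1)."""
    e2x = math.exp(2.0 * x)
    return (e2x - 1.0) / (e2x + 1.0)

def cost(desired, actual):
    """Eq. (10): half the squared Euclidean distance between the two outputs."""
    return 0.5 * sum((n - op) ** 2 for n, op in zip(desired, actual))

# Hypothetical desired and actual outputs for a two-class problem.
example_cost = cost([1.0, 0.0], [0.5, 0.5])
```

Back-propagation computes the gradient of this cost with respect to each weight and bias, and SGD moves the parameters a small step against that gradient.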

In the context of this paper, the Deep Forward Neural Network (DFNN) classification model can be summarized as follows:

1. Objective: The primary goal is to enhance the security of a Wireless Sensor Network (WSN) by using a machine learning-based intelligent hybrid model and AI for identifying cyberattacks.
2. Feature reduction: The paper uses feature reduction algorithms, specifically Singular Value Decomposition (SVD) and Principal Component Analysis (PCA), to identify the attributes most closely associated with the selected attack categories.
3. K-means clustering model with information gain (KMC-IG): The proposed approach uses the K-means clustering model enhanced with information gain (KMC-IG) to extract, reduce, and rank features. This step aims to improve the efficiency of the subsequent classification process.
4. Synthetic Minority Over-sampling Technique (SMOTE): SMOTE is introduced to address imbalances in the dataset, ensuring better performance on minority class instances.
5. Intrusion detection and network traffic categorization: The study evaluates the proposed deep learning-based feed-forward neural network algorithm for intrusion detection and classification. This includes the stages of intrusion detection and network traffic categorization.
6. Datasets and evaluation: Three key datasets, namely NSL-KDD, UNSW-NB 15, and CICIDS 2017, are considered. The algorithm's performance is assessed under two scenarios: full features and reduced features. Evaluation metrics include accuracy, precision, recall, and F-measure.
7. Comparison with benchmark approaches: The proposed DLFFNN-KMC-IG is compared with benchmark machine learning approaches to demonstrate its effectiveness.
8. Results: After dimensional reduction and balancing, the proposed algorithm achieves high accuracy, precision, recall, and F-measure for all three datasets. Notable results include 99.7% accuracy, 99.8% precision, 97.8% recall, and 98.8% F-measure for the NSL-KDD dataset with the reduced feature set.
9. Hybrid system settings: The study outlines the settings of the proposed hybrid system with feature reduction for attack classification, and the parameters of the generic machine-learning model.
10. Conclusion: The proposed intelligent hybrid cyber-security system is highlighted as crucial for recognizing and preventing attacks in WSN environments. It reduces the features used for classification with SVD and PCA, providing high-performance features for efficient early detection and learning systems.

In essence, the DFNN classification model integrates deep learning, clustering, and feature reduction to achieve robust intrusion detection and classification for Wireless Sensor Network security.

Evaluation stage

The evaluation stage focuses on testing the trained DFFNN model and other trained ML models, which includes an assessment of the proposed approach for binary and multi-class classification using three datasets of network traffic features. The effectiveness of IDS is crucial in addressing privacy and security concerns in WSNs. Furthermore, an IDS must have a low or zero percentage of false alarms in addition to detecting threats. Hence, the suggested model's performance is evaluated based on four important parameters, namely: Accuracy (ACY), Recall (RE), Precision (PRE), and F1-Score (FS) [39, 55, 56]. The strategy for evaluating the four metric parameters is represented by the following equations.

$$ACY \left(accuracy\right)=\frac{CN+CP}{CN+CP+IN+IP}$$

(13)

$$RE(Recall)=\frac{CP}{IN+CP}$$

(14)

$$PRE(precision)=\frac{CP}{IP+CP}$$

(15)

$$FS(F1-Score)=2\times \frac{RE\times PRE}{RE+PRE}$$

(16)

Where:

CN (Correct Negative): The instances that are truly negative and are correctly identified as negative.
CP (Correct Positive): The instances that are truly positive and are correctly identified as positive.
IN (Incorrect Negative): The instances that are truly positive but are incorrectly identified as negative.
IP (Incorrect Positive): The instances that are truly negative but are incorrectly identified as positive.
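
The four metrics follow directly from the four counts defined above, as the sketch below shows. The function name and the example counts are hypothetical, and zero denominators are not handled.

```python
def ids_metrics(cp, cn, ip, in_):
    """Compute ACY, RE, PRE, and FS from the confusion counts.

    cp / cn: correct positives / correct negatives
    ip / in_: incorrect positives / incorrect negatives
    """
    accuracy = (cn + cp) / (cn + cp + in_ + ip)
    recall = cp / (in_ + cp)
    precision = cp / (ip + cp)
    f1_score = 2 * recall * precision / (recall + precision)
    return {"ACY": accuracy, "RE": recall, "PRE": precision, "FS": f1_score}

# Hypothetical counts for a binary attack/normal test set of 200 instances.
metrics = ids_metrics(cp=90, cn=95, ip=5, in_=10)
```

With these counts the false alarms (IP) lower precision, while the missed attacks (IN) lower recall, and the F1-score balances the two.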

Experiments and results

Datasets description and modelling

In this research, a DFFNN model that combines clustering with the FES idea of entropy-based information gain is presented to overcome the redundancy of high-dimensional feature sets. The three datasets are described in depth in this part, along with feature set modelling using the KMC-IG feature reduction technique. Each dataset's feature set is reduced once KMC-IG is applied: 39 features from CICIDS2017, 13 features from UNSW-NB15, and 16 features from NSL-KDD were chosen. Both the full and the reduced datasets are used to assess the accuracy of binary and multi-class classification.

Name of dataset	Number of selected features
NSL-KDD	16 Features
CICIDS2017	39 Features
UNSW-NB 15	13 Features

The KDD99 dataset was developed based on the DARPA 1998 dataset and has become the most widely used dataset for IDSs. However, the presence of duplicate instances in this dataset can bias classification approaches towards normal examples and hinder their ability to detect anomalies. In contrast, the UNSW-NB15 dataset provides a diversified set of 49 feature properties and includes nine different attack classes, such as DoS, R, and SC. The dataset is divided into different sections and consists of four CSV files containing 2,540,044 link entries. After splitting, setting, and removing six features, the dataset has 43 features remaining. Additionally, the CICIDS2017 dataset, released by Sharafaldin et al. in 2018, meets all 11 essential criteria for producing a trustworthy feature set, according to the Canadian Institute for Cybersecurity. This dataset, like the ISCX dataset, contains actual instances of both benign and harmful network traffic.

a) NSL-KDD dataset

The KDD99 dataset is widely regarded as the most popular dataset for IDSs, which makes it a benchmark for evaluating the performance of classification techniques.

Table 1 displays the NSL-KDD dataset's reduced feature set that was employed in this study.

Table 1 NSL-KDD reduced feature set
