Table 1 Summary of related work

From: Hybrid wrapper feature selection method based on genetic algorithm and extreme learning machine for intrusion detection

| Reference | Objective | Feature selection method | Dataset | Advantage | Disadvantage |
|---|---|---|---|---|---|
| [6] | Solving the "nesting effect" problem found in the original SFS | Wrapper | KDD Cup 99 | High detection rate of anomaly intrusions with reduced features | Focused only on anomaly detection |
| [2] | Designing a new technique to binarize a continuous pigeon-inspired optimizer | Wrapper | KDD Cup 99, NSL-KDD and UNSW-NB15 | The model had a better learning rate and outperformed other models in TPR, FPR, accuracy, and F-score | The model was evaluated using outdated datasets |
| [7] | Developing an IDS for a fog environment | Wrapper | NSL-KDD | Excellent detection rate of 99.73% | Lower F-score compared to SVM, Random Forest, and Decision Tree algorithms |
| [8] | Developing a model to diagnose different cancer diseases from big data | Filter + Wrapper | Four cancerous microarray datasets (leukemia, ovarian cancer, small round blue cell tumor, and lung cancer) | The model selected few relevant genes with high accuracy | The model was tested on microarray datasets of smaller sizes |
| [9] | Proposing an ensemble-filter-based hybrid feature selection model for disease detection | Filter + Wrapper | Twenty benchmark medical datasets | The model was evaluated using four classifiers: Naïve Bayes, SVM with radial basis function, Random Forest, and k-Nearest Neighbor | Only two performance metrics were used (accuracy and AUROC); the authors propose adding other metrics in future work |
| [10] | Investigating various feature selection techniques | Wrapper | Aegean Wi-Fi Intrusion Dataset (AWID) | Reported a high detection accuracy of up to 99.95% | Longer model-building time |
| [11] | Developing semi-distributed and distributed IDS | Wrapper | AWID | Using a multi-layer perceptron (MLP) classifier, the distributed IDS had the lowest CPU running time (73.52 s) and the best detection accuracy (97.80%) | The authors noted the need for up-to-date datasets for further evaluation of the model |
| [12] | Selecting the best features for exact classification of smart IoT anomaly and intrusion traffic | Wrapper | Bot-IoT | Reduced the 39 original features to 7 without affecting the model's accuracy | Focused only on Bot-IoT attacks |
| [5] | Developing a feature selection technique based on a differential evolution algorithm | Wrapper | NSL-KDD | The selected features improved the accuracy and running time of the model | The results are still not optimal |
| [13] | Developing a wrapper-based feature selection method using a modified whale optimization algorithm (WOA) | Wrapper | CICIDS2017 and ADFA-LD | The improved WOA outperformed the traditional WOA in detection rate and accuracy | Further feature reduction is left to future work |
| [14] | Improving IDS performance through a two-phase framework that increases the detection rate while reducing the false alarm rate | Wrapper | NSL-KDD, ISCX2012, UNSW-NB15, KDD Cup 99, and CICIDS2017 | Introduced a new metaheuristic algorithm (MOBBAT), a binary version of the BAT algorithm | Computational cost was not considered as a metric |
| [1] | Selecting key features using an evolutionary algorithm | Filter + Wrapper | Wine, Ada, Sonar, Sylva, Madelon and Gina | The model was evaluated on several datasets to avoid bias | The model was evaluated using only one metric |
| [15] | Developing a novel feature selection algorithm named hybrid improved dragonfly algorithm (HIDA) | Filter + Wrapper | 10 gene expression datasets and 8 UCI datasets | HIDA performs excellently on imbalanced classification problems | High computational complexity compared to the wrapper algorithm |
| [16] | Combining genetic algorithms (GA) and particle swarm optimization (PSO) for best feature selection | Filter + Wrapper | Lung, Hill-Valley, Gas 6, Musk 1, Madelon, and Isolet 5 | Superior performance in both feature reduction and classification accuracy | The computational requirements were not compared with other models |
| [17] | Implementing a binary version of the hybrid grey wolf optimization (GWO) and particle swarm optimization (PSO) for feature selection | Wrapper | 18 standard benchmark datasets from UCI | Outperformed other models in accuracy, best feature selection, and computational time | The model used only one classification algorithm (KNN) |
| [18] | Implementing a new feature selection technique named GAWA | Wrapper | Twitter datasets | Reduced the feature subsets by up to 61.95% without affecting the model's accuracy | Evaluated on only one dataset; as the authors propose, the model could be validated on other datasets in future work |
| [19] | Implementing a novel feature selection technique based on GA and PI | Wrapper | UNSW-NB15 | Better performance in terms of accuracy and execution time | Focused on the detection of only two types of attacks |
| [20] | Combining embedded and wrapper methods for feature selection | Embedded + Wrapper | NSL-KDD | Used two classification algorithms to test the selected feature subset | Used the NSL-KDD dataset, a classical dataset that does not capture current intrusion threats |
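Most entries in the table follow the same wrapper pattern: a metaheuristic searches over binary feature masks, and each candidate mask is scored by training a classifier on only the selected features. As a minimal illustrative sketch (not any specific method from the table), the following assumes synthetic data and a simple nearest-centroid classifier as the fitness function; all names and parameters are hypothetical.

```python
import random

random.seed(0)

# Toy data: 4 informative features (mean = class label) + 4 pure-noise features,
# standing in for e.g. NSL-KDD records with redundant attributes.
def make_sample(label):
    informative = [label + random.gauss(0, 0.3) for _ in range(4)]
    noise = [random.gauss(0, 1.0) for _ in range(4)]
    return informative + noise, label

data = [make_sample(lbl) for lbl in (0, 1) for _ in range(30)]
train, test = data[::2], data[1::2]

def accuracy(mask):
    """Wrapper fitness: nearest-centroid accuracy using only features where mask==1."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    cent = {}
    for lbl in (0, 1):
        rows = [[x[i] for i in idx] for x, l in train if l == lbl]
        cent[lbl] = [sum(col) / len(col) for col in zip(*rows)]
    correct = 0
    for x, l in test:
        v = [x[i] for i in idx]
        pred = min((0, 1),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(v, cent[c])))
        correct += pred == l
    return correct / len(test)

def ga_select(n_feats=8, pop_size=20, gens=15, p_mut=0.1):
    """GA over binary feature masks: truncation selection, one-point crossover,
    bit-flip mutation."""
    pop = [[random.randint(0, 1) for _ in range(n_feats)] for _ in range(pop_size)]
    for _ in range(gens):
        elite = sorted(pop, key=accuracy, reverse=True)[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            cut = random.randrange(1, n_feats)         # one-point crossover
            child = [bit ^ (random.random() < p_mut)   # bit-flip mutation
                     for bit in a[:cut] + b[cut:]]
            children.append(child)
        pop = elite + children
    return max(pop, key=accuracy)

best = ga_select()
print("selected mask:", best, "accuracy:", accuracy(best))
```

Filter + wrapper hybrids, as in [8], [9], [15] and [16], typically prepend a cheap relevance ranking (e.g. a correlation score) to shrink the search space before this GA loop runs.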