Table 1 Overview of major feature selection algorithm approaches and their characteristics

From: A survey of dimension reduction and classification methods for RNA-Seq data on malaria vector

| Feature selection method | Algorithms | Characteristics | Benefits and limitations | Assessments |
| --- | --- | --- | --- | --- |
| Filter-based approaches | Correlation-based feature selection (CBFS) | Evaluates a feature subset by considering the individual predictive ability of each of its features together with their degree of redundancy (correlation) | Feature-dependent, but slower than univariate techniques | Heuristic merit |
|  | Mutual information | Evaluates the dependencies between features and classes; has been used to identify the most probable cancer-associated genes and to improve classification accuracy | Selected features can contribute redundancy to the classification [43] | Symmetric relationship |
|  | Analysis of variance (ANOVA) [44] | The dependent variable is continuous, the grouping variable is categorical (nominal or ordinal), and the data are assumed to be normally distributed | Gives an overall test of the equality of group means; tests against a specific hypothesis | Hypothesis test |
|  | Information gain [45] | Measures how much information a feature contributes about the predicted class, so features that frequently occur in positive samples can be identified | Its evaluation is based on entropy, which involves a substantial amount of mathematical theory and complex formulas | Ranking |
|  | Chi-square [46] | Evaluates the correlation between two variables and determines whether they are independent or correlated |  |  |
| Wrapper-based approaches | Genetic algorithm [43] | Mimics evolution by encoding candidate solutions as a population of strings and recombining them to produce fitter ones | Performs a randomized population search, but has a lower training time | Crossover and mutation |
|  | Recursive feature elimination (RFE) [47] | Backward selection of predictors: fits a model and repeatedly removes the weakest features | Partitioning the predictors is essential; ranks features by the order of their elimination and by multicollinearity | Greedy optimization |
| Embedded approaches | Info Gain-SVM [48] | Selects attributes and improves their correlation | Reduces the bias of plain information gain by adjusting each attribute to allow for the breadth and uniformity of its values | Wavelength |
|  | SVM-RFE [49] | Makes implicit orthogonality assumptions; considers a combination of univariate classifiers | The decision function is based only on support vectors that are "borderline" cases, rather than on all examples in an attempt to characterize the "typical" cases; lower risk of overfitting | Ranking criterion |
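
The filter scores in the table above (mutual information, the ANOVA F-test, and chi-square) are all available in scikit-learn. The sketch below is illustrative only: the synthetic matrix stands in for an RNA-Seq expression dataset and the choice of k = 10 is arbitrary, neither taken from the survey.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif, f_classif, chi2

# Synthetic stand-in for an expression matrix: samples x genes.
X, y = make_classification(n_samples=100, n_features=50, n_informative=5,
                           random_state=0)
X = np.abs(X)  # chi-square requires non-negative inputs, like raw read counts

for name, score_func in [("mutual information", mutual_info_classif),
                         ("ANOVA F-test", f_classif),
                         ("chi-square", chi2)]:
    selector = SelectKBest(score_func=score_func, k=10)  # keep the 10 best genes
    selector.fit(X, y)
    print(name, "->", np.flatnonzero(selector.get_support()))
```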
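Recursive feature elimination, the wrapper approach in the table, can be sketched the same way: fit a model, drop the weakest predictor, and refit until the desired number of features remains. The logistic-regression estimator and the step size here are illustrative assumptions, not choices made in the survey.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=50, n_informative=5,
                           random_state=0)

# Backward selection: refit and drop the weakest feature until 10 remain.
rfe = RFE(estimator=LogisticRegression(max_iter=1000),
          n_features_to_select=10, step=1)
rfe.fit(X, y)
print("ranking (1 = retained):", rfe.ranking_)
```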
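SVM-RFE differs only in the ranking criterion: a linear SVM is the estimator, and features are eliminated in order of the magnitude of their weights in the decision function. Again, the data and parameters below are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=50, n_informative=5,
                           random_state=0)

# A linear SVM exposes per-feature weights (coef_); RFE removes the
# feature with the smallest weight at each round.
svm_rfe = RFE(estimator=SVC(kernel="linear", C=1.0),
              n_features_to_select=10, step=1)
svm_rfe.fit(X, y)
print("selected features:", svm_rfe.support_.nonzero()[0])
```

Setting step to a value greater than 1 removes several features per round, trading some ranking resolution for far fewer model fits, which matters on high-dimensional expression data.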