A survey of dimension reduction and classification methods for RNA-Seq data on malaria vector

Journal of Big Data

Table 2 Overview of major feature extraction algorithms and their characteristics

Learning approach	Algorithm	Characteristics	Benefits and limitations
Unsupervised Learning Approach	Principal Component Analysis (PCA) [50]	Highlights the most important genes and identifies transcriptional programs by extracting groups of genes that covary across a set of samples	The values taken by each variable do not all carry the same importance, and results can be distorted when the data are contaminated with noise or contain outliers
	Independent Component Analysis (ICA) [51, 52]	The new variables are contained in the rows of the matrix S; that is, the observed variables are modeled as linear combinations of independent components	Enables blind separation of independent sources from their linear combination
Supervised Learning Approach	Partial Least Squares (PLS) [53]	Describes the data with a small number of latent components. It seeks uncorrelated linear transformations of the original predictor variables that have high covariance with the response variables	Using the latent components, PLS predicts the response variable y (the regression task) and reconstructs the original matrix X (the data modelling task). It maximizes the covariance between the response variable y and the original predictor variables
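
The contrast between the unsupervised and supervised rows of Table 2 can be illustrated with a minimal sketch. It is not the implementation used in the cited works: the toy expression matrix `X`, the response `y`, and all function names are hypothetical, and only the first component of each method is computed, using the Python standard library so that the example stays self-contained. PCA looks for the direction of largest variance in X and ignores y, whereas the first PLS weight vector is proportional to X^T y and therefore maximizes covariance with the response.

```python
import math

def center_columns(X):
    """Subtract each column's mean so every variable has mean zero."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def pca_first_direction(X, iters=200):
    """Leading eigenvector of Xc^T Xc (Xc = centered X), via power iteration."""
    Xc = center_columns(X)
    n, p = len(Xc), len(Xc[0])
    w = normalize([1.0] * p)
    for _ in range(iters):
        scores = [sum(row[j] * w[j] for j in range(p)) for row in Xc]  # t = Xc w
        w = normalize([sum(Xc[i][j] * scores[i] for i in range(n))     # w ∝ Xc^T t
                       for j in range(p)])
    return w

def pls_first_direction(X, y):
    """First PLS weight vector: w ∝ Xc^T yc, the unit vector maximizing cov(Xc w, yc)."""
    Xc = center_columns(X)
    n, p = len(Xc), len(Xc[0])
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    return normalize([sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)])

# Hypothetical toy data: 6 samples (rows) x 3 genes (columns).
# Gene 0 varies strongly but is unrelated to y; gene 1 varies little but tracks y.
X = [
    [10.0, 6.0, 0.1],
    [-10.0, 6.0, -0.1],
    [0.0, 6.0, 0.1],
    [10.0, 4.0, -0.1],
    [-10.0, 4.0, 0.1],
    [0.0, 4.0, -0.1],
]
y = [3.0, 3.0, 3.0, 1.0, 1.0, 1.0]  # hypothetical response, e.g. a phenotype score

pca_w = pca_first_direction(X)
pls_w = pls_first_direction(X, y)
print("PCA first direction:", [round(a, 3) for a in pca_w])
print("PLS first direction:", [round(a, 3) for a in pls_w])
```

On this toy matrix the PCA direction loads almost entirely on gene 0, the high-variance gene, while the PLS direction loads almost entirely on gene 1, the gene that covaries with y. In practice a numerical library would replace these hand-written loops, and further components would be obtained by deflation.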