The stability of different aggregation techniques in ensemble feature selection

Journal of Big Data

Table 2 Details of aggregation techniques

Aggregation Technique	Formula	Description
Arithmetic Mean	\(\dfrac{a_1+a_2+\ldots +a_m}{m}\)	Calculates the average across importance scores and uses it to determine final aggregated score [36]
Geometric Mean	\(\sqrt[m]{a_1 a_2 \ldots a_m}\)	Calculates the geometric average across importance scores and uses it to determine final aggregated score [36]
L2 Norm	\(\sqrt{a_1^2 +a_2^2 +\ldots +a_m^2}\)	Views the importance scores as an m-dimensional vector and calculates the Euclidean norm for that vector [36]
Stuart	\(\begin{array}{c} {\text {Pr}}[X \le \rho ]=\\ 1- {\text {Pr}}[\hat{r}_{(1)} \le 1-\beta _{m, m}^{-1}(\rho )\\ , \ldots , \hat{r}_{(m)} \le 1-\beta _{m, 1}^{-1}(\rho )] \end{array}\)	Compares obtained rank vectors to a baseline of randomly ranked features then assigns the features significance scores using the beta distribution [26]
RRA	\(\begin{array}{c} \rho (r)=\min _{k=1,\ldots ,m} \beta _{k, m}(r), \\ \beta _{k,m}(x):=\\ \sum _{\ell =k}^{m}\binom{m}{\ell } x^{\ell }(1-x)^{m-\ell } \end{array}\)	Similar to Stuart, but trades off efficiency against precision by applying a Bonferroni correction [27]
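
The formulas in Table 2 can be illustrated with a minimal Python sketch for a single feature that has m scores. The function names are hypothetical and are not taken from the paper. The sketch assumes non-negative importance scores for the geometric mean, and for RRA it assumes the common reading in which \(\beta _{k,m}\) is evaluated at the k-th smallest normalized rank (ranks scaled to [0, 1]). The Stuart method is omitted.

```python
import math


def arithmetic_mean(scores):
    """(a_1 + ... + a_m) / m."""
    return math.fsum(scores) / len(scores)


def geometric_mean(scores):
    """m-th root of a_1 * ... * a_m (assumes non-negative scores)."""
    return math.prod(scores) ** (1.0 / len(scores))


def l2_norm(scores):
    """Euclidean norm of the m-dimensional score vector."""
    return math.sqrt(math.fsum(a * a for a in scores))


def beta_tail(k, m, x):
    """beta_{k,m}(x): sum over l = k..m of C(m, l) x^l (1 - x)^(m - l)."""
    return math.fsum(
        math.comb(m, l) * x**l * (1 - x) ** (m - l) for l in range(k, m + 1)
    )


def rra_rho(normalized_ranks):
    """rho(r) = min over k = 1..m of beta_{k,m}.

    Assumption: beta_{k,m} is evaluated at the k-th smallest
    normalized rank, which the table writes compactly as beta_{k,m}(r).
    """
    r = sorted(normalized_ranks)
    m = len(r)
    return min(beta_tail(k, m, r[k - 1]) for k in range(1, m + 1))


if __name__ == "__main__":
    scores = [0.2, 0.4, 0.8]  # illustrative values, not data from the paper
    print(arithmetic_mean(scores), geometric_mean(scores), l2_norm(scores))
    print(rra_rho([0.1, 0.9]))
```

A Bonferroni-style correction would, under the same assumptions, multiply the returned rho by m and cap the result at 1. That step is left out here because the table does not spell it out.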