
Table 3 Challenges of anomaly detection in the context of the high-dimensionality problem

From: A comprehensive survey of anomaly detection techniques for high dimensional big data

Characteristic Features

Description

1. Relevant attribute identification

The difficulty of identifying which attributes are relevant for quantifying the locality (neighborhood) of data instances in high-dimensional space

2. Distance concentration

Because data become sparse in high-dimensional space, data points become nearly equidistant from one another, to a degree that depends on the distance measure used [59, 97, 98, 99]

3. Subspace selection

The number of candidate feature subspaces grows exponentially with the dimensionality of the input data, resulting in an exponentially large search space

4. Hubness

The tendency of high-dimensional data to contain instances, known as hubs, that appear unusually often among the nearest neighbors of other instances
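
Two of the challenges in the table, distance concentration and hubness, can be observed numerically. The following is a minimal sketch and is not taken from the survey. It assumes i.i.d. standard normal data and Euclidean distance, and the helper names `relative_contrast` and `k_occurrences` are illustrative choices. It reports, for increasing dimensionality, the relative contrast (d_max - d_min) / d_min of distances from a query point, and the largest number of times any single point appears in the k-nearest-neighbor lists of the other points.

```python
import numpy as np


def relative_contrast(points, query):
    """Return (d_max - d_min) / d_min for Euclidean distances from query to points.

    Values near zero mean the points are almost equidistant from the query.
    """
    d = np.linalg.norm(points - query, axis=1)
    return (d.max() - d.min()) / d.min()


def k_occurrences(points, k):
    """Count how often each point appears among the k nearest neighbors of other points."""
    sq = np.sum(points ** 2, axis=1)
    # Squared Euclidean distances between all pairs of points.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (points @ points.T)
    # A point is never counted as its own neighbor.
    np.fill_diagonal(d2, np.inf)
    # Indices of the k smallest distances in each row (order within the k is irrelevant).
    knn = np.argpartition(d2, k, axis=1)[:, :k]
    return np.bincount(knn.ravel(), minlength=len(points))


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # fixed seed, chosen only for repeatability
    n, k = 1000, 10                 # sample size and neighborhood size (assumptions)
    for dim in (2, 10, 100, 1000):
        X = rng.standard_normal((n, dim))
        q = rng.standard_normal(dim)
        occ = k_occurrences(X, k)
        print(
            f"dim={dim:5d}  "
            f"relative contrast={relative_contrast(X, q):8.3f}  "
            f"max k-occurrence={occ.max():4d}  (mean is always {k})"
        )
```

Under these assumptions, the relative contrast should shrink as the dimensionality grows, while the largest k-occurrence count should move further above the mean of k, which is the signature of hubs. Other data distributions or distance measures may behave differently, in line with the table's note that concentration depends on the distance measure used.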