A comprehensive survey of anomaly detection techniques for high dimensional big data

Journal of Big Data

Table 3 Challenges of anomaly detection in the context of the high-dimensionality problem

Characteristic Features		Description
1. Relevant attribute identification		The difficulty of identifying which attributes are relevant for quantitatively describing the locality of data instances in high-dimensional space
2. Distance concentration		Because the data are sparse, data points become nearly equidistant in high-dimensional space, to a degree that depends on the distance measure used [59, 97, 98, 99]
3. Subspace selection		The number of candidate feature subspaces grows exponentially with the dimensionality of the input data, which results in an exponentially large search space
4. Hubness		The tendency of high-dimensional data to contain instances, known as hubs, that appear unusually often in the nearest-neighbor lists of other instances
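
Rows 2 to 4 of the table can be made concrete with a small numerical sketch. The Python code below is illustrative only and is not taken from the survey. It assumes synthetic random data, Euclidean distance, and axis-parallel subspaces (feature subsets). The function names relative_contrast, count_subspaces, and k_occurrence are hypothetical helpers introduced here. The first measures how far apart the nearest and farthest neighbors of a query are, relative to the nearest one. The second counts the non-empty feature subsets. The third counts how often each point appears in the k-nearest-neighbor lists of the others.

```python
import numpy as np


def relative_contrast(n_points, dim, rng):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query to n_points uniform random points in the unit cube.
    Values near zero mean the points are nearly equidistant."""
    data = rng.random((n_points, dim))
    query = rng.random(dim)
    dists = np.linalg.norm(data - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()


def count_subspaces(dim):
    """Number of non-empty axis-parallel subspaces (feature subsets)."""
    return 2 ** dim - 1


def k_occurrence(data, k):
    """For each point, count how many other points list it among their
    k nearest neighbors (Euclidean). Hubs have unusually high counts."""
    sq = (data ** 2).sum(axis=1)
    # Squared pairwise distances via ||a||^2 + ||b||^2 - 2 a.b
    d2 = sq[:, None] + sq[None, :] - 2.0 * (data @ data.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    knn = np.argpartition(d2, k, axis=1)[:, :k]
    return np.bincount(knn.ravel(), minlength=len(data))


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

    for dim in (2, 10, 100, 1000):
        print("dim", dim,
              "relative contrast", relative_contrast(1000, dim, rng),
              "subspaces", count_subspaces(dim))

    for dim in (3, 100):
        counts = k_occurrence(rng.standard_normal((500, dim)), k=10)
        print("dim", dim, "largest k-occurrence", counts.max())
```

Under these assumptions, the relative contrast should shrink as the dimensionality grows, while the subspace count doubles with every added feature. The k-occurrence counts always sum to the number of points times k, so a hub in the sense of row 4 shows up as a count far above k.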