Table 3 Challenges of anomaly detection in the context of the high-dimensionality problem

From: A comprehensive survey of anomaly detection techniques for high dimensional big data

Characteristic feature | Description
1. Relevant attribute identification | The difficulty of describing the relevant quantitative locality of data instances in high-dimensional space
2. Distance concentration | Because the data are sparse, data points become nearly equidistant in high-dimensional space, to a degree that depends on the distance measure used [59, 97, 98, 99]
3. Subspace selection | The number of candidate feature subspaces grows exponentially with the dimensionality of the input data, which results in an exponential search space
4. Hubness | The tendency of high-dimensional data to contain instances, known as hubs, that appear frequently among the nearest neighbors of other instances
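
Rows 2 to 4 of the table can be illustrated numerically. The following is a minimal sketch, not taken from the survey. It assumes points drawn uniformly from the unit hypercube and the Euclidean distance, and all function names (sample_points, relative_contrast, k_occurrences, num_subspaces) are hypothetical helpers introduced only for this illustration. It reports the relative contrast (d_max - d_min) / d_min of distances to a query point, the largest k-occurrence count (how often the most popular point appears in other points' k-nearest-neighbor lists), and the number of non-empty feature subsets.

```python
import math
import random
from collections import Counter


def sample_points(n, dim, rng):
    """Draw n points uniformly from the unit hypercube [0, 1]^dim (an assumption)."""
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


def relative_contrast(points, query):
    """Return (d_max - d_min) / d_min for Euclidean distances from query to points."""
    dists = [math.dist(query, p) for p in points]
    return (max(dists) - min(dists)) / min(dists)


def k_occurrences(points, k):
    """Count how often each point appears in the k-nearest-neighbor lists of the others."""
    counts = Counter()
    for i, p in enumerate(points):
        neighbors = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )[:k]
        counts.update(neighbors)
    return [counts[j] for j in range(len(points))]


def num_subspaces(dim):
    """Number of non-empty axis-parallel subspaces (feature subsets) of dim features."""
    return 2 ** dim - 1


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
for dim in (2, 20, 200):
    pts = sample_points(300, dim, rng)
    query = [rng.random() for _ in range(dim)]
    occ = k_occurrences(pts, k=5)
    print(
        f"dim={dim:3d}  contrast={relative_contrast(pts, query):.3f}  "
        f"max 5-occurrence={max(occ)}  subspaces={num_subspaces(dim)}"
    )
```

Under these assumptions the relative contrast typically shrinks as the dimensionality grows, which is the distance concentration effect, while the subspace count doubles with every added feature. The k-occurrence counts always sum to n times k, so a large maximum indicates that a few hub points absorb a disproportionate share of the neighbor lists. How pronounced this is depends on the data distribution and the distance measure.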