From: A comprehensive survey of anomaly detection techniques for high dimensional big data
Characteristic Features | Description
---|---
1. Relevant attribute identification | The difficulty of describing the relevant quantitative locality of data instances in high-dimensional space
2. Distance concentration | Because the data are sparse, data points become nearly equidistant in high-dimensional space, to a degree that depends on the distance measure used [59, 97, 98, 99]
3. Subspace selection | The number of candidate feature subspaces grows exponentially with the dimensionality of the input data, which results in an exponential search space
4. Hubness | The tendency of high-dimensional data to contain instances, known as hubs, that appear frequently among the nearest neighbors of other instances
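
Two of the characteristics in the table, distance concentration and hubness, can be observed numerically. The following is a minimal sketch and is not taken from the survey: it assumes points drawn uniformly from the unit hypercube and Euclidean distance, and the helper names `random_points`, `relative_contrast`, and `k_occurrences`, along with the sample sizes and seeds, are hypothetical choices made for illustration. The relative contrast (d_max - d_min) / d_min from a query point is expected to shrink as the dimension grows, and the k-occurrence counts record how often each point appears in the k-nearest-neighbor lists of the others.

```python
import math
import random


def random_points(n, dim, rng):
    """Draw n points uniformly from the unit hypercube [0, 1]^dim (an assumption)."""
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


def relative_contrast(dim, n=200, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [math.dist(query, p) for p in random_points(n, dim, rng)]
    return (max(dists) - min(dists)) / min(dists)


def k_occurrences(points, k=5):
    """Count how often each point appears in the k-NN lists of the other points."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Rank every other point by its distance to point i, then keep the k closest.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in others[:k]:
            counts[j] += 1
    return counts


if __name__ == "__main__":
    # Distance concentration: contrast between nearest and farthest neighbor.
    for dim in (2, 20, 500):
        print(f"dim={dim:4d}  relative contrast={relative_contrast(dim):.3f}")

    # Hubness: the largest k-occurrence count in a low- and a high-dimensional sample.
    for dim in (3, 100):
        pts = random_points(200, dim, random.Random(1))
        counts = k_occurrences(pts, k=5)
        print(f"dim={dim:4d}  max k-occurrence={max(counts)}")
```

The k-occurrence counts always sum to n·k, so their mean is fixed at k; hubness shows up as a skewed distribution in which a few points have counts far above that mean. The brute-force neighbor search here is quadratic in the number of points and is only meant for small illustrative samples.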