Table 4 Existing de-identification privacy-preserving measures and their limitations in big data

From: Big data privacy: a technological perspective and review

| S.No | Privacy measure | Definitions | Limitations | Computational complexity |
| --- | --- | --- | --- | --- |
| 1 | K-anonymity | A framework for constructing and evaluating algorithms and systems that release information such that the released information limits what can be revealed about the properties of the entities to be protected | Homogeneity attack, background knowledge attack | O(k log k) [35, 73] |
| 2 | L-diversity | An equivalence class is said to have L-diversity if there are at least L “well-represented” values for the sensitive attribute. A table is said to have L-diversity if every equivalence class of the table has L-diversity | L-diversity may be difficult and unnecessary to achieve, and it is insufficient to prevent attribute disclosure | O(n²/k) |
| 3 | T-closeness | An equivalence class is said to have T-closeness if the distance between the distribution of a sensitive attribute in this class and the distribution of the attribute in the whole table is no more than a threshold t. A table is said to have T-closeness if all equivalence classes have T-closeness | T-closeness requires that the distribution of a sensitive attribute in any equivalence class is close to the distribution of that attribute in the overall table | 2^(O(n)O(m)) [36] |

  1. Presents existing de-identification privacy-preserving measures and their limitations in big data, along with their computational complexities
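
The three definitions in the table can be checked mechanically on a small table. The following is a minimal sketch, not the method of the cited works: the function names, the `records` toy data, and the column names are all hypothetical. It assumes "well-represented" is read in its simplest form (distinct L-diversity, i.e. counting distinct sensitive values) and that the T-closeness distance for a categorical attribute is the total variation distance, which is what the Earth Mover's Distance reduces to when all values are equally far apart.

```python
from collections import Counter, defaultdict


def equivalence_classes(rows, quasi_ids):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[q] for q in quasi_ids)].append(row)
    return list(groups.values())


def k_anonymity(rows, quasi_ids):
    """Largest k for which the table is k-anonymous (smallest class size)."""
    return min(len(g) for g in equivalence_classes(rows, quasi_ids))


def distinct_l_diversity(rows, quasi_ids, sensitive):
    """Largest L under the 'distinct values' reading of well-represented."""
    return min(
        len({r[sensitive] for r in g})
        for g in equivalence_classes(rows, quasi_ids)
    )


def distribution(rows, sensitive):
    """Relative frequency of each sensitive value in the given records."""
    counts = Counter(r[sensitive] for r in rows)
    return {value: n / len(rows) for value, n in counts.items()}


def t_closeness(rows, quasi_ids, sensitive):
    """Smallest t the table satisfies: worst class-to-table distance.

    Assumption: total variation distance (half the L1 distance) between
    the class distribution and the whole-table distribution.
    """
    overall = distribution(rows, sensitive)
    worst = 0.0
    for g in equivalence_classes(rows, quasi_ids):
        local = distribution(g, sensitive)
        dist = 0.5 * sum(abs(local.get(v, 0.0) - p) for v, p in overall.items())
        worst = max(worst, dist)
    return worst


# Hypothetical, already-generalized toy table (invented for illustration).
records = [
    {"zip": "476**", "age": "2*", "disease": "flu"},
    {"zip": "476**", "age": "2*", "disease": "flu"},
    {"zip": "479**", "age": "3*", "disease": "cancer"},
    {"zip": "479**", "age": "3*", "disease": "flu"},
]

k = k_anonymity(records, ["zip", "age"])
l = distinct_l_diversity(records, ["zip", "age"], "disease")
t = t_closeness(records, ["zip", "age"], "disease")
```

On this toy table the sketch gives k = 2 but L = 1: the first equivalence class is 2-anonymous yet every record in it has the same disease, which is the homogeneity attack listed as a limitation of K-anonymity. The T-closeness value is 0.25, so the table satisfies T-closeness only for thresholds t of at least 0.25.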