Big data privacy: a technological perspective and review

Journal of Big Data

Table 4 Existing de-identification privacy-preserving measures and their limitations in big data

S.No	Privacy measure	Definitions	Limitations	Computational complexity
1	K-anonymity	A framework for constructing and evaluating algorithms and systems that release information such that the released information limits what can be revealed about the properties of the entities to be protected	Homogeneity attack, background knowledge attack	O(k log k) [35, 73]
2	L-diversity	An equivalence class is said to have L-diversity if there are at least L “well-represented” values for the sensitive attribute. A table is said to have L-diversity if every equivalence class of the table has L-diversity	L-diversity may be difficult and unnecessary to achieve, and it is insufficient to prevent attribute disclosure	O(n²/k)
3	T-closeness	An equivalence class is said to have T-closeness if the distance between the distribution of a sensitive attribute in this class and the distribution of the attribute in the whole table is no more than a threshold t. A table is said to have T-closeness if all equivalence classes have T-closeness	T-closeness requires that the distribution of a sensitive attribute in every equivalence class be close to the distribution of that attribute in the overall table	2^O(n) O(m) [36]

Table 4 presents the existing de-identification privacy-preserving measures and their limitations in big data, along with their computational complexities.
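
The three definitions in Table 4 can be made concrete with a minimal Python sketch that checks a small, already generalized table against each measure. Everything here is an illustrative assumption rather than part of the reviewed work: the sample records, the column names, the helper function names, the use of the "distinct" reading of L-diversity (at least L different sensitive values per equivalence class), and the use of total variation distance as the distance in T-closeness, since the table does not fix a particular distance measure.

```python
from collections import Counter, defaultdict


def equivalence_classes(rows, quasi_ids):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_ids)
        classes[key].append(row)
    return classes


def is_k_anonymous(rows, quasi_ids, k):
    """Every equivalence class must contain at least k records."""
    classes = equivalence_classes(rows, quasi_ids)
    return all(len(group) >= k for group in classes.values())


def is_distinct_l_diverse(rows, quasi_ids, sensitive, l_value):
    """Every class must hold at least l_value distinct sensitive values
    (assumption: the 'distinct' interpretation of 'well-represented')."""
    classes = equivalence_classes(rows, quasi_ids)
    return all(
        len({row[sensitive] for row in group}) >= l_value
        for group in classes.values()
    )


def distribution(rows, sensitive, domain):
    """Relative frequency of each sensitive value, in domain order."""
    counts = Counter(row[sensitive] for row in rows)
    return [counts[value] / len(rows) for value in domain]


def is_t_close(rows, quasi_ids, sensitive, t):
    """Each class distribution must lie within distance t of the table's.
    Assumption: total variation distance stands in for the distance."""
    domain = sorted({row[sensitive] for row in rows})
    overall = distribution(rows, sensitive, domain)
    for group in equivalence_classes(rows, quasi_ids).values():
        local = distribution(group, sensitive, domain)
        distance = 0.5 * sum(abs(p - q) for p, q in zip(local, overall))
        if distance > t:
            return False
    return True


# Hypothetical generalized records: two equivalence classes of size 2.
QI = ["zip", "age"]
rows = [
    {"zip": "130**", "age": "<30", "disease": "flu"},
    {"zip": "130**", "age": "<30", "disease": "flu"},
    {"zip": "148**", "age": ">=40", "disease": "cancer"},
    {"zip": "148**", "age": ">=40", "disease": "flu"},
]

print(is_k_anonymous(rows, QI, 2))                    # True
print(is_distinct_l_diverse(rows, QI, "disease", 2))  # False
print(is_t_close(rows, QI, "disease", 0.3))           # True
```

The sample table is 2-anonymous yet fails 2-diversity because the first equivalence class contains only "flu", which is the homogeneity attack listed as a limitation of K-anonymity: anyone known to be in that class is revealed to have flu even though no individual record is singled out.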