Table 3 Proposed solutions

From: Machine learning concepts for correlated Big Data privacy

| S. no | Privacy measure | Definition | Limitations |
|---|---|---|---|
| 1 | Pufferfish Mechanism [40] | Formed a strong foundation for other similar research | Could not satisfy the differential privacy guarantee |
| 2 | Coupled Behaviour Analysis (CBA) [35] | Good experimental results were obtained on real datasets | Does not furnish the expected results for high-dimensional data; other challenges of CBA are yet to be explored |
| 3 | CBA-HG and CBA-HC [36] | Experimental comparison showed that CBA-HG outperformed the mechanisms of [35] | Applicability to other datasets with different couplings is uncertain |
| 4 | Coupled Item Similarity (CIS) [39] | Proposed an effective mechanism to measure non-IIDness | No solution to deal with non-IIDness was proposed, and the adverse effect of non-IID data on data privacy was not addressed |
| 5 | Modified Sensitivity Calculation [8] | Multiplies global sensitivity by the number of correlated records for correlated datasets | Data utility is highly degraded |
| 6 | Correlated Sensitivity [9] | Noise reduced by an enormous amount and greater data utility as compared to [8] | A few parameters held a trade-off with utility |
| 7 | Bayesian Differential Privacy [41] | Provided privacy for correlated data against an adversary with partial background knowledge | Prior knowledge of the probabilistic relationships is not always possible |
| 8 | Dependent Differential Privacy [10] | High accuracy achieved | Estimating the value of \(\rho _{ij}\) is the key challenge |
| 9 | Pufferfish Wasserstein Distance Mechanism [6] | Mathematically proved that the correlation of distant nodes need not be considered | Performed slightly worse than [44] for a particular range of values |
| 10 | Identity Differential Privacy [12] | Concluded that concepts of Information Theory are well suited to model the problems of dependent data in Differential Privacy | Practical implementation is not suggested, as other privacy leakages were not studied |
| 11 | Bayesian Network Perturbation Mechanism [7] | Proposed a perturbation mechanism with a decreased privacy budget and increased data utility | The requirement of modelling the Bayesian network in advance may not be practically feasible |
| 12 | Statistical Correlation Analysis [42] | Enhanced accuracy by using correlation analysis techniques | The correlation analysis and feature selection techniques used could not capture complex relationships |
| 13 | Correlated differential privacy of big data publication [13] | Proposed a divide-and-conquer approach along with machine learning, applied to correlated big datasets | The traditional correlation analysis technique used could not handle high-dimensional data |
| 14 | Dependent Differential Privacy [10] | Proposed DDP and proved mathematically how it can be derived from DP | Lacks practical implementation |
| 15 | Temporal Privacy Leakage [43] | Studied temporal correlation along with the relationship between data privacy and data utility | Other correlation models were not studied for temporal leakages |
| 16 | Weighted Hierarchical Graph Mechanism [14] | Offers a privacy guarantee in the case of negative correlation as well | Not applicable to nonlinear queries |
| 17 | Temporal Correlation Mechanism [45] | Proposed w-event privacy using DP for location statistics and provided results regarding data utility | Correlation between other values was not studied |
| 18 | Bayesian DP with Gaussian Correlation Model [49] | Proposed a Bayesian DP model that used a Gaussian correlation model to study data correlation | Approximating accurate probabilistic values is a challenge |
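To make rows 5 and 6 concrete, below is a minimal sketch of the modified sensitivity idea of [8]: for a correlated dataset, the Laplace noise is calibrated to the global sensitivity multiplied by the number of correlated records, which inflates the noise scale and explains the utility degradation noted in the table. The count-query setting, function names, and parameters are illustrative assumptions, not code from the paper.

```python
import math
import random

def laplace_noise(scale, rng):
    # Draw Laplace(0, scale) noise via inverse-CDF sampling.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(true_count, epsilon, global_sensitivity=1.0, seed=0):
    # Standard Laplace mechanism for a count query, assuming
    # independent records: noise scale = GS / epsilon.
    rng = random.Random(seed)
    return true_count + laplace_noise(global_sensitivity / epsilon, rng)

def correlated_dp_count(true_count, epsilon, n_correlated,
                        global_sensitivity=1.0, seed=0):
    # Modified sensitivity of [8] (row 5): multiply GS by the number
    # of correlated records, so noise scale = (n_correlated * GS) / epsilon.
    # More correlation means proportionally more noise, hence lower utility.
    rng = random.Random(seed)
    scale = (n_correlated * global_sensitivity) / epsilon
    return true_count + laplace_noise(scale, rng)
```

With the same random seed, the noise added by `correlated_dp_count` is exactly `n_correlated` times that of `dp_count`, which is the trade-off that Correlated Sensitivity [9] (row 6) improves on by bounding sensitivity more tightly.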