From: Enhancing correlated big data privacy using differential privacy and machine learning
S. no. | Privacy measure | Definition | Limitations |
---|---|---|---|
1 | Pufferfish mechanism [21] | Formed a strong foundation for other similar research | Could not satisfy the differential privacy guarantee |
2 | Coupled behavior analysis (CBA) [25] | Good experimental results were obtained on real datasets | Does not furnish the expected results for high-dimensional data; other challenges of CBA are yet to be explored |
3 | CBA-HG and CBA-HC [31] | Experimental comparison showed that CBA-HG outperformed the mechanisms of [25] | Applicability to other datasets with different couplings is uncertain |
4 | Coupled item similarity (CIS) [26] | Proposed an effective mechanism to measure non-IIDness | No solution to handle non-IIDness was proposed, and the adverse effect of non-IIDness on data privacy was not addressed |
5 | Modified sensitivity calculation [27] | Multiplied the global sensitivity by the number of correlated records for correlated datasets | Data utility was severely degraded |
6 | Correlated sensitivity [12] | Noise was reduced substantially, yielding greater data utility compared to [27] | A few parameters involve a trade-off with utility |
7 | Bayesian differential privacy [22] | Mechanism provided privacy for correlated data and against an adversary with partial background knowledge | Prior knowledge of the probabilistic relationships may not be available in practice |
8 | Dependent differential privacy [15] | High accuracy was achieved | Estimating the value of \(\rho _{ij}\) is the key challenge |
9 | Pufferfish Wasserstein distance mechanism [6] | Proved mathematically that the correlation of distant nodes need not be considered | Performed slightly worse than [17] for a particular range of values |
10 | Identity differential privacy [7] | Concluded that concepts of information theory are well suited to modeling the problems of dependent data in differential privacy | Practical implementation was not suggested, as other privacy leakages were not studied |
11 | Bayesian network perturbation mechanism [8] | Proposed perturbation mechanism provided a decreased privacy budget and increased data utility | The requirement of modeling the Bayesian network in advance may not be practically feasible |
12 | Statistical correlation analysis [9] | Enhanced accuracy by using correlation analysis techniques | The correlation analysis and feature selection techniques used were insufficient to capture complex relationships |
13 | Correlated differential privacy of big data publication [10] | Proposed a divide-and-conquer approach combined with machine learning, applied to correlated big datasets | The traditional correlation analysis technique used could not handle high-dimensional data |
14 | Dependent differential privacy [13] | Proposed DDP and proved mathematically how it can be derived from DP | Lacks a practical implementation |
15 | Temporal privacy leakage [14] | Studied temporal correlation together with the relationship between data privacy and data utility | Other correlation models were not studied for temporal leakage |
16 | Weighted hierarchical graph mechanism [16] | Mechanism offers a privacy guarantee in the case of negative correlation as well | Not applicable to nonlinear queries |
17 | Temporal correlation mechanism [28] | Proposed w-event privacy using DP for location statistics and provided results on data utility | Correlation between other values was not studied |
18 | Bayesian DP with Gaussian correlation model [29] | Proposed a Bayesian DP model that used a Gaussian correlation model to study data correlation | Approximating accurate probabilistic values is a challenge |
19 | Social network data privacy analysis [30] | Analysed data privacy leakages in social network data using single and cross social extrapolation | No solution framework was provided for the problem |
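Rows 5 and 6 contrast two ways of calibrating Laplace noise for correlated records: [27] multiplies the global sensitivity by the number of correlated records, which inflates the noise and degrades utility, while [12] weights each correlated record by its degree of correlation so the sensitivity (and hence the noise) falls between the two extremes. The following is a minimal Python sketch of that contrast, not the cited papers' actual algorithms: the unit-sensitivity count query, the group size `k`, and the uniform correlation weights of 0.3 are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Standard DP Laplace mechanism: noise scale = sensitivity / epsilon."""
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Toy count query over a dataset of 1000 records.
true_count = 1000.0
epsilon = 1.0
global_sensitivity = 1.0  # a count query changes by at most 1 per record

# Approach of [27] (row 5): multiply the global sensitivity by the number
# of correlated records (assume a group of k = 20 correlated records).
k = 20
modified_sensitivity = k * global_sensitivity  # = 20.0, so 20x more noise

# Idea behind [12] (row 6): weight each correlated record by its correlation
# degree in [0, 1] instead of counting it fully, so the correlated
# sensitivity lies between the global and the modified values.
# (Uniform illustrative weights; real weights are derived from the data.)
correlation_weights = np.full(k, 0.3)
correlated_sensitivity = global_sensitivity * correlation_weights.sum()  # = 6.0

noisy_modified = laplace_mechanism(true_count, modified_sensitivity, epsilon, rng)
noisy_correlated = laplace_mechanism(true_count, correlated_sensitivity, epsilon, rng)
```

Because the correlated sensitivity is smaller than the modified sensitivity (6.0 vs 20.0 here), the noise added under [12] has a proportionally smaller scale, which is the utility gain the table attributes to it.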