Table 3 Proposed solutions

From: Machine learning concepts for correlated Big Data privacy

| S. no | Privacy measure | Merits | Demerits |
|---|---|---|---|
| 1 | Pufferfish Mechanism [40] | Formed a strong foundation for other similar research | Could not satisfy the differential privacy guarantee |
| 2 | Coupled Behaviour Analysis (CBA) [35] | Good experimental results were obtained on real datasets | Does not furnish the expected results for high-dimensional data; other challenges of CBA are yet to be explored |
| 3 | CBA-HG and CBA-HC [36] | Experimental comparison showed that CBA-HG outperformed the mechanisms of [35] | Applicability to other datasets with different couplings is uncertain |
| 4 | Coupled Item Similarity (CIS) [39] | Proposed an effective mechanism to measure non-IIDness | No solution to deal with non-IIDness was proposed, and the adverse effect of non-IIDness on data privacy was not discussed |
| 5 | Modified Sensitivity Calculation [8] | Multiplies global sensitivity by the number of correlated records for correlated datasets | Data utility is highly degraded |
| 6 | Correlated Sensitivity [9] | Noise reduced by an enormous amount and greater data utility compared to [8] | A few parameters hold a trade-off with utility |
| 7 | Bayesian Differential Privacy [41] | Mechanism provided privacy for correlated data and against an adversary with partial background knowledge | Prior knowledge of probabilistic relationships may not be available in practice |
| 8 | Dependent Differential Privacy [10] | High accuracy achieved | Estimating the value of \(\rho _{ij}\) is the key challenge |
| 9 | Pufferfish Wasserstein Distance Mechanism [6] | Mathematically proved that the correlation of distant nodes need not be considered | Performed slightly worse than [44] for a particular range of values |
| 10 | Identity Differential Privacy [12] | Concluded that concepts of Information Theory are well suited to model the problems of dependent data in Differential Privacy | Practical implementation is not suggested, as other privacy leakages were not studied |
| 11 | Bayesian Network Perturbation Mechanism [7] | Proposed perturbation mechanism provided a decreased privacy budget and increased data utility | The requirement of modeling the Bayesian network in advance may not be practically feasible |
| 12 | Statistical Correlation Analysis [42] | Enhanced accuracy by using correlation analysis techniques | The correlation analysis and feature selection techniques used could not capture complex relationships |
| 13 | Correlated differential privacy of big data publication [13] | Proposed a divide-and-conquer approach along with machine learning; used correlated big datasets | The traditional correlation analysis technique used could not handle high-dimensional data |
| 14 | Dependent Differential Privacy [10] | Proposed DDP and proved mathematically how it can be derived from DP | Lacks practical implementation |
| 15 | Temporal Privacy Leakage [43] | Studied temporal correlation along with the relationship between data privacy and data utility | Other correlation models were not studied for temporal leakages |
| 16 | Weighted Hierarchical Graph Mechanism [14] | Mechanism offers a privacy guarantee in the case of negative correlation as well | Not applicable to nonlinear queries |
| 17 | Temporal Correlation Mechanism [45] | Proposed w-event privacy using DP for location statistics and provided results regarding data utility | Correlation between other values was not studied |
| 18 | Bayesian DP with Gaussian Correlation Model [49] | Proposed a Bayesian DP model that used a Gaussian correlation model to study data correlation | Approximating accurate probabilistic values is a challenge |
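The modified-sensitivity approach of [8] can be illustrated with a minimal sketch: for a count query (global sensitivity 1), the sensitivity is multiplied by the number of correlated records before calibrating Laplace noise, which is why the table notes that data utility degrades sharply as correlation grows. The function names and the inverse-CDF sampler below are illustrative assumptions, not code from the surveyed papers.

```python
import math
import random


def laplace_noise(scale):
    """Draw one sample from Laplace(0, scale) via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))


def dp_count(true_count, epsilon, correlated_records=1):
    """Noisy count under the modified-sensitivity idea of [8].

    Global sensitivity of a count query is 1; multiplying it by the
    number of correlated records inflates the Laplace noise scale,
    so more correlation means proportionally more noise (less utility).
    """
    sensitivity = 1 * correlated_records
    return true_count + laplace_noise(sensitivity / epsilon)
```

With `correlated_records=5` and `epsilon=1.0`, the noise scale is 5 instead of 1, making the answer roughly five times noisier than in the uncorrelated case; the Correlated Sensitivity work [9] in the table aims to avoid exactly this over-scaling.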