From: Big data privacy: a technological perspective and review
S.No | Research paper | Publication and year | Focus | Limitations |
---|---|---|---|---|
1 | “Toward Efficient and Privacy Preserving Computing in Big Data Era” [38] | IEEE Network July/Aug 2014 | Introduced an efficient and privacy-preserving cosine similarity computing protocol | Significant research effort is still needed to address the unique privacy issues of some specific big data analytics |
2 | “Hiding a needle in a Haystack: privacy preserving Apriori algorithm in map reduce framework” [46] | ACM Nov 7, 2014 | Proposed a privacy-preserving data mining technique in Hadoop that addresses privacy violations without utility degradation | Execution time of the proposed technique is affected by noise size |
3 | “Making big data, privacy, and anonymization work together in the enterprise: experiences and issues” [41] | IEEE International Congress 2014 | Discusses experiences and issues encountered when combining anonymization, privacy protection, and big data techniques to analyse usage data while protecting the identities of users | Uses the k-anonymity technique, which is vulnerable to correlation attacks |
4 | “Microsoft Differential Privacy for Everyone” [40] | Microsoft Research 2015 | Discussed and suggested how an existing approach, “differential privacy”, is suitable for big data | This method depends entirely on the curator's calculation of the amount of noise, so if the curator is compromised the whole system fails |
5 | “A scalable two-phase top-down specialization approach for data anonymization using MapReduce on cloud” [69] | IEEE Transactions on Parallel and Distributed Systems 2014 | Proposed a scalable two-phase top-down specialization (TDS) approach to anonymize large-scale data sets using the MapReduce framework on cloud | Uses an anonymization technique that is vulnerable to correlation attacks |
6 | “HireSome-II: towards privacy-aware cross-cloud service composition for big data applications” [74] | IEEE Transactions on Parallel and Distributed Systems 2014 | Proposed a privacy-aware cross-cloud service composition method named HireSome-II (History record-based Service optimization method), based on its previous basic version HireSome-I | |
7 | “Protection of big data privacy” [7] | IEEE Transactions 2016 | Discussed various privacy issues in big data applications | Customer segmentation and profiling can easily lead to discrimination based on age, gender, ethnic background, health condition, social background, and so on |
8 | “Fast anonymization of big data streams” [55] | ACM August, 2014 | Proposed an anonymization algorithm (FAST) to speed up anonymization of big data streams | Further research is required to design and implement FAST in a distributed cloud-based framework in order to gain cloud computation power and achieve high scalability |
9 | “Privacy preserving Ciphertext multi-sharing control for big data storage” [75] | IEEE Transactions on Information Forensics and Security 2015 | Proposed a privacy-preserving ciphertext multi-sharing mechanism | The proxy can create delegation rights between two parties that have never agreed upon the delegation process |
10 | “Privacy-preserving machine learning algorithms for big data systems” [76] | IEEE International Conference on Distributed Computing Systems 2015 | Proposed a novel framework to achieve privacy-preserving machine learning where the training data are distributed and each shared data portion is of large volume | Not able to achieve distributed feature selection |
11 | “Privacy-preserving big data publishing” [50] | ACM June–July 2015 | Proposed an approach towards privacy-preserving data mining of very massive data sets using MapReduce | Generalization cannot handle high-dimensional data and reduces data utility. Perturbation also reduces data utility |
12 | “Proximity-aware local-recoding anonymization with map reduce for scalable big data privacy preservation in cloud” [70] | IEEE Transactions on Computers August 2015 | Models the problem of big data local recoding against proximity privacy breaches as a proximity-aware clustering problem, and proposes a scalable two-phase clustering approach accordingly | Further research is needed to integrate the approach with Apache Mahout to achieve highly scalable privacy-preserving big data mining or analytics |
13 | “Deduplication on encrypted big data in cloud” [77] | IEEE Transactions on Big Data 2016 | Proposed a practical scheme to manage encrypted big data in cloud with deduplication based on ownership challenge and Proxy Re-Encryption (PRE) | Convergent encryption (CE) is subject to an inherent security limitation, namely susceptibility to offline brute-force dictionary attacks |
14 | “Security and privacy for storage and computation in cloud computing” [22] | International Journal of Science and Research (IJSR) ISSN (Online): 2319–7064 | The proposed methodology provides data confidentiality, secure data sharing without re-encryption, access control for malicious insiders, and forward and backward access control | Limiting the trust level in the cryptographic server (CS) |
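
The limitation noted in row 4, that differential privacy depends on the curator's calculation of the amount of noise, can be illustrated with a minimal sketch of the standard Laplace mechanism for a counting query. This is not taken from any of the surveyed papers: the function names (`laplace_scale`, `sample_laplace`, `noisy_count`) and the sample data are hypothetical, and the sketch assumes a count query with sensitivity 1 and a trusted curator who holds the raw records.

```python
import random


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale chosen by the curator: b = sensitivity / epsilon."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return sensitivity / epsilon


def sample_laplace(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two
    independent exponential samples with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Answer a counting query with Laplace noise added by the curator.

    Assumption: adding or removing one record changes the count by at
    most 1, so the sensitivity is 1.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + sample_laplace(laplace_scale(1.0, epsilon), rng)


if __name__ == "__main__":
    rng = random.Random(0)              # fixed seed only for repeatability
    ages = [23, 35, 41, 58, 62, 19]     # hypothetical raw data
    answer = noisy_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
    print(f"noisy count of ages >= 40: {answer:.2f}")
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy. Because the curator both sees the raw records and chooses the scale, a compromised curator defeats the guarantee, which is the weakness the table records.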