From: Scalable two-phase co-occurring sensitive pattern hiding using MapReduce
S. No. | Approach | Technique used | Achievements | Issues |
---|---|---|---|---|
1 | [11] | Constructs a lattice-like graph of the dataset; greedily and iteratively traverses to the immediate subset; selects the victim item with maximum support | Good privacy level; simple and fast | Does not consider the extent of support loss for large itemsets, so data quality is affected; does not scale to large-scale data |
2 | [14] | Increases the support of the antecedent in \((A) \to B\); decreases the support of the consequent in \(A \to (B)\); hybrid decrement until confidence or support falls below the threshold | Decreases either the support or the confidence, but not both | Relies on the strong assumption that an item contained in one sensitive itemset does not appear in any other sensitive itemset; scalability |
3 | [15] | Deletes the maximum-support item \(i \in s\) from the minimum-length transaction; the second algorithm sorts sensitive itemsets by size and support and masks them in round-robin fashion | Sanitizes minimum-length transactions first to reduce side effects on non-sensitive data; the second algorithm is fairer, as it masks itemsets in round-robin fashion | Scalability remains an issue; high execution time on large datasets |
4 | [9] | MaxFIA: deletes the maximum-support item \(i \in s\), where \(s \subseteq T\); MinFIA: deletes the minimum-support item \(i \in s\), where \(s \subseteq T\); IGA: clusters sensitive patterns sharing the same itemsets and deletes the max- or min-support item | Cluster formation hides a set of sensitive itemsets at once; no traversal required, as the max- and min-support items are easily counted and selected; the sensitive part of the dataset is separated out to reduce data size and sanitization time | Scalability and high execution time on large-scale datasets |
5 | [10] | SWA masks sensitive rules by hiding the maximum-frequency item \(i \in s\), where \(s \subseteq T\); requires a single database scan | Conceals all sensitive rules; requires only a single database scan; the sliding-window concept makes the approach scalable to some extent | High execution time and scalability problems on big data |
6 | [12] | Aggregate: deletes the transaction \(T \cap S\) supporting the maximum number of sensitive itemsets; Disaggregate: deletes the maximum-support item \(i \in s\), where \(s \subseteq T\), from the remaining transactions | The hybrid approach is fast, as it selectively identifies transactions and deletes the maximum-support item | Direct deletion of transactions degrades data quality, since those transactions may also contain non-sensitive information; the scalability issue persists |
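Several of the approaches above (rows 1, 3, and 4) share the same core primitive: pick a victim item of the sensitive itemset by its support and delete it from supporting transactions until the itemset's support falls below the mining threshold. The following is a minimal sketch of that MaxFIA-style primitive; the function names, the shortest-transaction tie-breaking, and the threshold handling are illustrative assumptions, not the published algorithms.

```python
from collections import Counter

def support(db, itemset):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)

def hide_itemset(db, sensitive, min_sup):
    """Greedy MaxFIA-style sanitization (hypothetical sketch).

    Repeatedly deletes the maximum-support item of `sensitive` from one
    supporting transaction until support drops below `min_sup`.
    """
    db = [set(t) for t in db]
    while support(db, sensitive) >= min_sup:
        # Victim item: the item of the sensitive itemset with maximum
        # support across the whole database.
        counts = Counter(i for t in db for i in t if i in sensitive)
        victim = max(sensitive, key=lambda i: counts[i])
        # Sanitize the shortest supporting transaction first, to limit
        # side effects on non-sensitive patterns (cf. row 3).
        supporting = [t for t in db if sensitive <= t]
        target = min(supporting, key=len)
        target.discard(victim)
    return db
```

Note that no transaction is deleted outright, which is what distinguishes this item-level primitive from the Aggregate step in row 6.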
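The sliding-window idea of SWA (row 5) can be sketched in the same spirit: the database is consumed in a single pass, one window of transactions at a time, and within each window the most frequent item of the sensitive itemset is removed from enough supporting transactions that at most a disclosure fraction `psi` still support it. This is a hedged illustration under assumed names and parameters, not the published SWA implementation.

```python
from collections import Counter

def swa_sanitize(transactions, sensitive, window_size, psi):
    """Single-scan, window-at-a-time sanitization (hypothetical sketch).

    `psi` is an assumed disclosure threshold: the fraction of a window's
    supporting transactions allowed to keep the sensitive itemset.
    """
    out = []
    for start in range(0, len(transactions), window_size):
        window = [set(t) for t in transactions[start:start + window_size]]
        supporting = [t for t in window if sensitive <= t]
        allowed = int(psi * len(supporting))
        if supporting and len(supporting) > allowed:
            # Victim: the item of the sensitive itemset that is most
            # frequent among this window's supporting transactions.
            freq = Counter(i for t in supporting for i in t if i in sensitive)
            victim = freq.most_common(1)[0][0]
            # Sanitize shortest transactions first to limit side effects.
            for t in sorted(supporting, key=len)[:len(supporting) - allowed]:
                t.discard(victim)
        out.extend(window)
    return out
```

Because each transaction is touched exactly once, the pass is single-scan; the trade-off, as the table notes, is that scalability is only partial, since the window must still fit in memory.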