M-generalization for multipurpose transactional data publication

Xianxian LI, Peipei SUI, Yan BAI, Li-E WANG

Front. Comput. Sci., 2018, Vol. 12, Issue 6: 1241-1254. DOI: 10.1007/s11704-016-6061-x
RESEARCH ARTICLE

Abstract

Collecting and sharing transactional data raises the challenge of preventing information leakage and privacy breaches while preserving high data utility. Data anonymization methods such as perturbation, generalization, and suppression have been proposed for privacy protection. However, many of these methods incur excessive information loss and cannot satisfy multipurpose utility requirements. In this paper, we propose a multidimensional generalization method that provides multipurpose optimization when anonymizing transactional data, so that the published data offers better utility for different applications. Our method models the data as a bipartite graph and combines attribute generalization, item grouping, and outlier perturbation. Experiments on real-life datasets show that our solution considerably improves data utility compared with existing algorithms.
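
As a rough illustration of two ideas named in the abstract, viewing transactional data as a bipartite graph and generalizing items, the following minimal Python sketch may help. It is not the paper's algorithm: the toy transactions, the one-level `hierarchy`, and the helper names `to_bipartite_edges`, `generalize`, and `item_degrees` are all hypothetical, and item grouping and outlier perturbation are not shown.

```python
from collections import defaultdict

# Hypothetical toy data: each transaction ID maps to the set of items it contains.
transactions = {
    "t1": {"beer", "wine", "bread"},
    "t2": {"wine", "milk"},
    "t3": {"bread", "milk", "beer"},
}

# Hypothetical one-level generalization hierarchy (item -> category).
hierarchy = {"beer": "alcohol", "wine": "alcohol", "bread": "food", "milk": "food"}


def to_bipartite_edges(transactions):
    """Return the bipartite graph as a sorted list of (transaction, item) edges."""
    return sorted((tid, item) for tid, items in transactions.items() for item in items)


def generalize(transactions, hierarchy):
    """Replace every item with its category; items missing from the hierarchy stay unchanged."""
    return {
        tid: {hierarchy.get(item, item) for item in items}
        for tid, items in transactions.items()
    }


def item_degrees(transactions):
    """Count how many transactions each item node is connected to."""
    degrees = defaultdict(int)
    for items in transactions.values():
        for item in items:
            degrees[item] += 1
    return dict(degrees)


edges = to_bipartite_edges(transactions)
generalized = generalize(transactions, hierarchy)
```

In this toy setting every generalized item node ends up connected to all three transactions, which hints at the trade-off the abstract describes: coarser items are harder to link to an individual but carry less information.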

Keywords

anonymization / generalization / privacy protection / bipartite graph

Cite this article

Xianxian LI, Peipei SUI, Yan BAI, Li-E WANG. M-generalization for multipurpose transactional data publication. Front. Comput. Sci., 2018, 12(6): 1241‒1254 https://doi.org/10.1007/s11704-016-6061-x

RIGHTS & PERMISSIONS

2018 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature