Differential privacy histogram publishing method based on dynamic sliding window
Qian CHEN, Zhiwei NI, Xuhui ZHU, Pingfan XIA
Differential privacy has recently become a widely recognized, strict privacy protection model for data release. Differential privacy histogram publishing can directly show the statistical distribution of data for query, sharing, and analysis while preserving user privacy. Dynamic data release is a research topic with broad industry demand. However, the amount of data varies considerably over different periods, and unreasonable data processing risks both leaking users' information and making the published data unusable. Therefore, we designed a differential privacy histogram publishing method based on the dynamic sliding window of LSTM (DPHP-DL), which improves data availability while guaranteeing data privacy. DPHP-DL integrates two components, DSW-LSTM and DPHK+. DSW-LSTM updates the size of the sliding windows based on data value prediction via long short-term memory (LSTM) networks, so that the data stream is divided evenly into several windows. DPHK+ heuristically publishes non-isometric histograms based on k-means++ clustering, with the optimal k obtained automatically, so as to achieve differential privacy histogram publishing of dynamic data. Extensive experiments on real-world dynamic datasets demonstrate the superior performance of DPHP-DL.
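The general idea of publishing a noisy histogram per window can be illustrated with a minimal sketch. It is not the DPHP-DL implementation: it assumes a fixed window size in place of the LSTM-driven window update, fixed bin edges in place of the k-means++ grouping, and the standard Laplace mechanism with sensitivity 1 for counts. The function names (`sliding_windows`, `histogram_counts`, `dp_histogram`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random
from typing import List, Sequence


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sliding_windows(stream: Sequence[float], size: int) -> List[List[float]]:
    """Split the stream into consecutive windows of a fixed size.

    Assumption: the size is constant here; DSW-LSTM would adapt it.
    """
    return [list(stream[i:i + size]) for i in range(0, len(stream), size)]


def histogram_counts(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Count values into bins [edges[i], edges[i+1]); the last bin includes its right edge.

    Values outside [edges[0], edges[-1]] are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return counts


def dp_histogram(values: Sequence[float], edges: Sequence[float],
                 epsilon: float, rng: random.Random) -> List[float]:
    """Add Laplace(1/epsilon) noise to each bin count.

    Assumption: adding or removing one record changes one count by 1,
    so the sensitivity of the count vector is 1.
    """
    scale = 1.0 / epsilon
    return [c + laplace_noise(scale, rng) for c in histogram_counts(values, edges)]


if __name__ == "__main__":
    rng = random.Random(0)
    stream = [rng.uniform(0, 100) for _ in range(30)]  # synthetic data
    edges = [0, 25, 50, 75, 100]                       # hypothetical fixed bins
    for window in sliding_windows(stream, size=10):
        print([round(x, 2) for x in dp_histogram(window, edges, epsilon=1.0, rng=rng)])
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy. The sketch does not address how a privacy budget would be allocated across windows.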
differential privacy / dynamic data / histogram publishing / sliding window
Qian Chen received the MS degree in computer science and technology from Anhui University, China in 2019. He is currently a PhD candidate in management science and engineering at Hefei University of Technology, China. His research interests include information security, artificial intelligence, and cloud computing.
Zhiwei Ni received the BE and MS degrees in computer software and theory from Anhui University, China. He completed his PhD degree at the University of Science and Technology of China, China in June 2002. Since 2002, he has been a Professor and PhD supervisor at Hefei University of Technology, China, where he has presided over or participated in more than 20 national and provincial projects. From 2010 to 2021, he served as the director of the Intelligent Management Institute at Hefei University of Technology, China, where he has authored three books and more than 100 articles. His research interests include artificial intelligence, machine learning, and cloud computing.
Xuhui Zhu received the BE degree from the School of Mathematics of Hefei University of Technology, China and the PhD degree in management science and engineering from Hefei University of Technology, China. He is currently a Lecturer at Hefei University of Technology, China. His research interests include evolutionary computation and machine learning.
Pingfan Xia received the MS degree in financial engineering from Anhui University of Finance and Economics, China. She is currently a PhD candidate in management science and engineering at Hefei University of Technology, China. Her research interests include machine learning and internet finance.