Improving the utility of locally differentially private protocols for longitudinal and multidimensional frequency estimates

Héber H. Arcolezi , Jean-François Couchot , Bechara Al Bouna , Xiaokui Xiao

›› 2024, Vol. 10 ›› Issue (2) : 369 -379.

PDF
›› 2024, Vol. 10 ›› Issue (2) :369 -379. DOI: 10.1016/j.dcan.2022.07.003
Research article
research-article

Improving the utility of locally differentially private protocols for longitudinal and multidimensional frequency estimates

Author information +
History +
PDF

Abstract

This paper investigates the problem of collecting multidimensional data throughout time (i.e., longitudinal studies) for the fundamental task of frequency estimation under Local Differential Privacy (LDP) guarantees. Contrary to frequency estimation of a single attribute, the multidimensional aspect demands particular attention to the privacy budget. Besides, when collecting user statistics longitudinally, privacy progressively degrades. Indeed, the “multiple” settings in combination (i.e., many attributes and several collections throughout time) impose several challenges, for which this paper proposes the first solution for frequency estimates under LDP. To tackle these issues, we extend the analysis of three state-of-the-art LDP protocols (Generalized Randomized Response - GRR, Optimized Unary Encoding - OUE, and Symmetric Unary Encoding - SUE) for both longitudinal and multidimensional data collections. While the known literature uses OUE and SUE for two rounds of sanitization (a.k.a. memoization), i.e., L-OUE and L-SUE, respectively, we analytically and experimentally show that starting with OUE and then with SUE provides higher data utility (i.e., L-OSUE). Also, for attributes with small domain sizes, we propose Longitudinal GRR (L-GRR), which provides higher utility than the other protocols based on unary encoding. Last, we also propose a new solution named Adaptive LDP for LOngitudinal and Multidimensional FREquency Estimates (ALLOMFREE), which randomly samples a single attribute to be sent with the whole privacy budget and adaptively selects the optimal protocol, i.e., either L-GRR or L-OSUE. As shown in the results, ALLOMFREE consistently and considerably outperforms the state-of-the-art L-SUE and L-OUE protocols in the quality of the frequency estimates.

Keywords

Local differential privacy / Discrete distribution estimation / Frequency estimation / Multidimensional data / Longitudinal studies

Cite this article

Download citation ▾
Héber H. Arcolezi, Jean-François Couchot, Bechara Al Bouna, Xiaokui Xiao. Improving the utility of locally differentially private protocols for longitudinal and multidimensional frequency estimates. , 2024, 10(2): 369-379 DOI:10.1016/j.dcan.2022.07.003

登录浏览全文

4963

注册一个新账户 忘记密码

References

[1]

C. Dwork, F. McSherry, K. Nissim, A. Smith, Calibrating noise to sensitivity in private data analysis, in: Theory of Cryptography, Springer Berlin Heidelberg, 2006, pp. 265-284, https://doi.org/10.1007/11681878_14.

[2]

C. Dwork, Differential privacy, in: Automata, Languages and Programming, Springer Berlin Heidelberg, 2006, pp. 1-12, https://doi.org/10.1007/11787006_1.

[3]

C. Dwork, A. Roth, et al. The algorithmic foundations of differential privacy, Found. Trends® Theor. Comput. Sci., 9 (3-4) (2014) 211-407.

[4]

A. Aktay, S. Bavadekar, G. Cossoul, J. Davis, D. Desfontaines, A. Fabrikant, E. Gabrilovich, K. Gadepalli, B. Gipson, M. Guevara, et al., Google COVID-19 Community Mobility Reports: Anonymization Process Description, 2020 arXiv preprint arXiv: 2004.04145, version 1.1.

[5]

R. Rogers, S. Subramaniam, S. Peng, D. Durfee, S. Lee, S.K. Kancha, S. Sahay, P. Ahammad, Linkedin’s audience engagements API: a privacy preserving data analytics system at scale, J. Priv. Confid. 11 (3) (2021) 1-27.

[6]

M. Abadi, A. Chu, I. Goodfellow, H.B. McMahan, I. Mironov, K. Talwar, L. Zhang, Deep Learning with Differential Privacy, CCS ’16, Association for Computing Machinery, New York, NY, USA, 2016, pp. 308-318, https://doi.org/10.1145/2976749.2978318.

[7]

J.M. Abowd, The U.S. census bureau adopts differential privacy, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, ACM, 2018, https://doi.org/10.1145/3219819.3226070.

[8]

S. Garfinkel, Implementing Differential Privacy for the 2020 Census, USENIX Association, 2021.

[9]

D. McCandless, T. Evans, M. Quick, E. Hollowood, C. Miles, D. Hampson, D. Geere, World’s Biggest Data Breaches & Hacks, 2021. https://www.informationisbeautiful.net/visualizations/worlds-biggest-data-breaches-hacks/ (accessed 11 march 2021).

[10]

S.P. Kasiviswanathan, H.K. Lee, K. Nissim, S. Raskhodnikova, A. Smith, What can we learn privately?, in: 2008 49th Annual IEEE Symposium on Foundations of Computer Science IEEE, 2008, pp. 531-540.

[11]

U. Erlingsson, V. Pihur, A. Korolova, RAPPOR: randomized aggregatable privacy-preserving ordinal response,in:Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, ACM, 2014, pp. 1054-1067.

[12]

B. Ding, J. Kulkarni, S. Yekhanin, Collecting telemetry data privately, in: I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (Advances in Neural Information Processing Systems 30, Curran Associates, Inc.,Eds.), 2017, pp. 3571-3580.

[13]

Apple Differential Privacy Team, Learning with Privacy at Scale, 2017. https://docs-assets.developer.apple.com/ml-research/papers/learning-with-privacy-at-scale.pdf (accessed 11 march 2021).

[14]

T. Wang, J. Blocki, N. Li, S. Jha, Locally differentially private protocols for frequency estimation, in: 26th USENIX Security Symposium (USENIX Security 17), USENIX Association, Vancouver, BC, 2017, pp. 729-745.

[15]

G. Cormode, S. Maddock, C. Maple,Frequency estimation under local differential privacy, Proc. Endow. 14 (11) (2021) 2046-2058.

[16]

T. Murakami, Y. Kawamoto, Utility-Optimized local differential privacy mechanisms for distribution estimation, in: 28th USENIX Security Symposium (USENIX Security 19), USENIX Association, Santa Clara, CA, 2019, pp. 1877-1894.

[17]

S. Wang, Y. Nie, P. Wang, H. Xu, W. Yang, L. Huang,Local private ordinal data distribution estimation, in: IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, IEEE, 2017, pp. 1-9.

[18]

P. Kairouz, K. Bonawitz, D. Ramage, Discrete distribution estimation under local privacy, in: International Conference on Machine Learning, PMLR, 2016, pp. 2436-2444.

[19]

J. Acharya, Z. Sun, H. Zhang, Hadamard response: estimating distributions privately, efficiently, and with little communication, in: K. Chaudhuri, M. Sugiyama (Eds.), Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, 89, of Proceedings of Machine Learning Research, PMLR, 2019, pp. 1120-1129.

[20]

M. Alvim, K. Chatzikokolakis, C. Palamidessi, A. Pazii, Invited paper: local differential privacy on metric spaces: optimizing the trade-off with utility, in: 2018 IEEE 31st Computer Security Foundations Symposium (CSF), IEEE, 2018, pp. 262-267.

[21]

D. Zhao, H. Chen, S. Zhao, X. Zhang, C. Li, R. Liu, Local differential privacy with k-anonymous for frequency estimation, in: 2019 IEEE International Conference on Big Data (Big Data), IEEE, 2019, pp. 5819-5828.

[22]

Z. Li, T. Wang, M. Lopuhaä-Zwakenberg, N. Li, B. Škoric, Estimating numerical distributions under local differential privacy, in: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, ACM, 2020, pp. 621-635.

[23]

M. Xu, B. Ding, T. Wang, J. Zhou,Collecting and analyzing data jointly from multiple services under local differential privacy, Proc. Endow. 13 (12) (2020) 2760-2772.

[24]

J. Yang, T. Wang, N. Li, X. Cheng, S. Su,Answering multi-dimensional range queries under local differential privacy, Proc. VLDB Endow. 14 (3) (2020) 378-390.

[25]

X. Gu, M. Li, Y. Cao, L. Xiong, Supporting both range queries and frequency estimation with local differential privacy, in: 2019 IEEE Conference on Communications and Network Security (CNS), IEEE, 2019, pp. 124-132.

[26]

G. Cormode, T. Kulkarni, D. Srivastava,Answering range queries under local differential privacy, Proc. Endow. 12 (10) (2019) 1126-1138.

[27]

Z. Shen, Z. Xia, P. Yu, PLDP: personalized local differential privacy for multidimensional data aggregation, Secur. Commun. Network. 2021 (2021) 1-13.

[28]

F. Peng, S. Tang, B. Zhao, Y. Liu, A privacy-preserving data aggregation of mobile crowdsensing based on local differential privacy, in: Proceedings of the ACM Turing Celebration Conference, ACM, 2019, pp. 1-5.

[29]

Z. Zhang, T. Wang, N. Li, S. He, J. Chen, CALM, Consistent adaptive local marginal for marginal release under local differential privacy, Proc. ACM Conf. Comput. Commun. Secur. (2018) 212-229.

[30]

X. Ren, C.-m. Yu, W. Yu, S. Yang, S. Member, X. Yang, J.A. Mccann, P.S. Yu, L. Fellow, LoPub : High Dimens. Crowdsourced Data 13 (9) (2018) 2151-2166.

[31]

G. Fanti, V. Pihur, Ú. Erlingsson, Building a RAPPOR with the unknown: privacy-preserving learning of associations and data dictionaries, Proc. Priv. Enhanc Technol. 2016 (3) (2016) 41-61.

[32]

T.T. Nguy^en, X. Xiao, Y. Yang, S.C. Hui, H. Shin, J. Shin, Collecting and Analyzing Data from Smart Device Users with Local Differential Privacy, 2016 arXiv preprint arXiv:1606.05053.

[33]

N. Wang, X. Xiao, Y. Yang, J. Zhao, S.C. Hui, H. Shin, J. Shin, G. Yu, Collecting and analyzing multidimensional data with local differential privacy, in: 2019 IEEE 35th International Conference on Data Engineering (ICDE), IEEE, 2019, pp. 638-649.

[34]

J.C. Duchi, M.I. Jordan, M.J. Wainwright, Minimax optimal procedures for locally private estimation, J. Am. Stat. Assoc. 113 (521) (2018) 182-201.

[35]

T. Wang, J. Zhao, Z. Hu, X. Yang, X. Ren, K.-Y. Lam, Local differential privacy for data collection and analysis, Neurocomputing 426 (2021) 114-133.

[36]

Ú. Erlingsson, V. Feldman, I. Mironov, A. Raghunathan, S. Song, K. Talwar, A. Thakurta, Encode, Shuffle, Analyze Privacy Revisited: Formalizations and Empirical Evaluation, 2020 arXiv preprint arXiv: 2001.03618.

[37]

T. Wang, N. Li, S. Jha, Locally differentially private heavy hitter identification, IEEE Trans. Dependable Secure Comput. 18 (2) (2021) 982-993.

[38]

H.H. Arcolezi, J.-F. Couchot, B.A. Bouna, X. Xiao, Longitudinal collection and analysis of mobile phone data with local differential privacy, in: M. Friedewald, S. Schiffner, S. Krenn (Privacy and Identity Management,Eds.), Springer International Publishing, Cham, 2021, pp. 40-57, https://doi.org/10.1007/978-3-030-72465-8_3.

[39]

J.W. Kim, D.-H. Kim, B. Jang, Application of local differential privacy to collection of indoor positioning data, IEEE Access 6 (2018) 4276-4286.

[40]

I.D.C. Vidal, A.L. da Costa Mendonça, F. Rousseau, J.D.C. Machado, ProTECting: an application of local differential privacy for IoT at the edge in smart home scenarios, in: Anais XXXVIII Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos (SBRC 2020), Sociedade Brasileira de Computaç-ao, 2020, https://doi.org/10.5753/sbrc.2020.12308.

[41]

S.L. Warner, Randomized response: a survey technique for eliminating evasive answer bias, J. Am. Stat. Assoc. 60 (309) (1965) 63-69.

[42]

T. Wang, N. Li, S. Jha, Locally differentially private frequent itemset mining, in: 2018 IEEE Symposium on Security and Privacy (SP), IEEE, 2018, pp. 127-143.

[43]

T. Wang, M. Lopuhaa-Zwakenberg, Z. Li, B. Skoric, N. Li, Locally differentially private frequency estimation with consistency, in: Proceedings 2020 Network and Distributed System Security Symposium, Internet Society, 2020, https://doi.org/10.14722/ndss.2020.24157.

[44]

R. Bassily, K. Nissim, U. Stemmer, A. Thakurta, Practical locally private heavy hitters, in: Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, Curran Associates Inc., Red Hook, NY, USA, 2017, pp. 2285-2293.

[45]

M. Naor, N. Vexler, Roth (Ed.),Can two walk together: privacy enhancing methods and preventing tracking of users,in:A. 1st Symposium on Foundations of Responsible Computing (FORC 2020), Vol. 156 of Leibniz International Proceedings in Informatics (LIPIcs), Schloss Dagstuhl-Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 2020, https://doi.org/10.4230/LIPIcs.FORC.2020.4, 4:1-4:20.

[46]

D. Dua, C. Graff, UCI machine learning repository Pubmed Partial Author stitle stitle Volume Page. http://archive.ics.uci.edu/ml, 2017 (accessed 11 march 2021).

[47]

H.H. Arcolezi, J.-F. Couchot, O. Baala, J.-M. Contet, B.A. Bouna, X. Xiao, Mobility modeling through mobile data: generating an optimized and open dataset respecting privacy, in: 2020 International Wireless Communications and Mobile Computing (IWCMC), IEEE, 2020, pp. 1689-1694.

[48]

R. Bassily, A. Smith, Local, private, efficient protocols for succinct histograms, in: Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, ACM, 2015, pp. 127-135.

[49]

X. Xiong, S. Liu, D. Li, Z. Cai, X. Niu, A comprehensive survey on local differential privacy, Secur. Commun. Network. 2020 (2020) 1-29.

[50]

H.H. Arcolezi, J.-F. Couchot, B. Al Bouna, X. Xiao, Random sampling plus fake data: multidimensional frequency estimates with local differential privacy,in:Proceedings of the 30th ACM International Conference on Information & Knowledge Management, ACM, 2021, pp. 47-57.

[51]

S. Kessler, J. Hoff, J.-C. Freytag,SAP HANA goes private, Proc. Endow. 12 (12)(2019) 1998-2009.

[52]

P.C. Mahawaga Arachchige, P. Bertok, I. Khalil, D. Liu, S. Camtepe, M. Atiquzzaman, Local differential privacy for deep learning, IEEE Internet Things J. 7 (7) (2020) 5827-5842.

[53]

X. Zhou, J. Tan,Local differential privacy for bayesian optimization, in:Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, 2021, pp. 11152-11159.

[54]

Z. Qin, Y. Yang, T. Yu, I. Khalil, X. Xiao, K. Ren, Heavy hitter estimation over set-valued data with local differential privacy, in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, ACM, 2016, pp. 192-203. ACM.

[55]

H.H. Arcolezi, J.-F. Couchot, S. Cerna, C. Guyeux, G. Royer, B.A. Bouna, X. Xiao, Forecasting the number of firefighter interventions per region with local-differential-privacy-based data, Comput. Secur. 96 (2020) 101888.

[56]

E. ElSalamouny, C. Palamidessi, Generalized iterative bayesian update and applications to mechanisms for privacy protection, in: 2020 IEEE European Symposium on Security and Privacy (EuroS&P), IEEE, 2020, pp. 490-507.

AI Summary AI Mindmap
PDF

98

Accesses

0

Citation

Detail

Sections
Recommended

AI思维导图

/