Privacy dilemmas and opportunities in large language models: a brief review
Hongyi LI, Jiawei YE, Jie WU
Front. Comput. Sci., 2025, Vol. 19, Issue 10: 1910356
A growing number of cases indicates that large language models (LLMs) bring transformative advancements while raising privacy concerns. Despite promising recent surveys in the literature, a comprehensive analysis dedicated to text privacy in LLMs is still lacking. By comprehensively collecting LLM privacy research, we summarize five privacy issues and their corresponding solutions during both model training and invocation, and extend our analysis to three research focuses in LLM applications. Moreover, we propose five further research directions and offer prospects for LLM-native security mechanisms. Notably, we find that most LLM privacy research is still in the technical exploration phase; we hope this work can assist the development of LLM privacy.
Keywords: large language model / data privacy / data protection