A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly

Yifan Yao , Jinhao Duan , Kaidi Xu , Yuanfang Cai , Zhibo Sun , Yue Zhang

High-Confidence Computing ›› 2024, Vol. 4 ›› Issue (2): 100211. DOI: 10.1016/j.hcc.2024.100211

Review Article

Abstract

Large Language Models (LLMs), such as ChatGPT and Bard, have revolutionized natural language understanding and generation. They possess deep language comprehension, human-like text generation capabilities, contextual awareness, and robust problem-solving skills, making them invaluable in various domains (e.g., search engines, customer support, translation). Meanwhile, LLMs have also gained traction in the security community, revealing security vulnerabilities and showcasing their potential in security-related tasks. This paper explores the intersection of LLMs with security and privacy. Specifically, we investigate how LLMs positively impact security and privacy, the potential risks and threats associated with their use, and the inherent vulnerabilities within LLMs. Through a comprehensive literature review, the paper categorizes the surveyed works into “The Good” (beneficial LLM applications), “The Bad” (offensive applications), and “The Ugly” (vulnerabilities of LLMs and their defenses). We report several interesting findings. For example, LLMs have been shown to enhance code security (code vulnerability detection) and data privacy (data confidentiality protection), outperforming traditional methods. However, they can also be harnessed for various attacks (particularly user-level attacks) due to their human-like reasoning abilities. We have also identified areas that require further research effort. For example, research on model and parameter extraction attacks is limited and often theoretical, hindered by the scale and confidentiality of LLM parameters. Safe instruction tuning, a recent development, requires more exploration. We hope that our work can shed light on LLMs’ potential to both bolster and jeopardize cybersecurity.

Keywords

Large Language Model (LLM) / LLM security / LLM privacy / ChatGPT / LLM attacks / LLM vulnerabilities

Cite this article

Yifan Yao, Jinhao Duan, Kaidi Xu, Yuanfang Cai, Zhibo Sun, Yue Zhang. A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly. High-Confidence Computing, 2024, 4(2): 100211. DOI: 10.1016/j.hcc.2024.100211

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This research was supported in part by National Science Foundation award FMitF-2319242. Any opinions, findings, conclusions, or recommendations expressed are those of the authors and do not necessarily reflect the views of the NSF.

References

[1]

J. Yang, H. Jin, R. Tang, X. Han, Q. Feng, H. Jiang, B. Yin, X. Hu, Harnessing the power of llms in practice: A survey on chatgpt and beyond, 2023, arXiv preprint arXiv:2304.13712.

[2]

OpenAI, GPT-4 technical report, 2023, https://arxiv.org/abs/2303.08774.

[3]

Meta AI, Introducing llama: A foundational, 65-billion-parameter language model, 2023, https://ai.meta.com/blog/large-language-model-llama-meta-ai/. (Accessed 13 November 2023).

[4]

Databricks, Free dolly: Introducing the world’s first open and commercially viable instruction-tuned LLM, 2023, https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm. (Accessed 13 November 2023).

[5]

Fabio Duarte, Number of ChatGPT users (nov 2023), 2023, https://explodingtopics.com/blog/chatgpt-users. (Accessed 13 November 2023).

[6]

N. Ziems, W. Yu, Z. Zhang, M. Jiang, Large language models are built-in autoregressive search engines, 2023, arXiv preprint arXiv:2305.09612.

[7]

B.B. Arcila, Is it a platform? Is it a search engine? It’s ChatGPT! the European liability regime for large language models, J. Free Speech L. 3 (2023) 455.

[8]

S.E. Spatharioti, D.M. Rothschild, D.G. Goldstein, J.M. Hofman, Comparing traditional and LLM-based search for consumer choice: A randomized experiment, 2023, arXiv preprint arXiv:2307.03744.

[9]

B. Yao, M. Jiang, D. Yang, J. Hu, Empowering LLM-based machine translation with cultural awareness, 2023, arXiv preprint arXiv:2305. 14328.

[10]

M. Karpinska, M. Iyyer, Large language models effectively leverage document-level context for literary translation, but critical errors persist, 2023, arXiv preprint arXiv:2304.03245.

[11]

R. Jain, N. Gervasoni, M. Ndhlovu, S. Rawat, A code centric evaluation of C/C++ vulnerability datasets for deep learning based vulnerability detection techniques, in:Proceedings of the 16th Innovations in Software Engineering Conference, 2023, pp. 1-10.

[12]

A.J. Thirunavukarasu, D.S.J. Ting, K. Elangovan, L. Gutierrez, T.F. Tan, D.S.W. Ting, Large language models in medicine, Nature medicine 29 (8) (2023) 1930-1940.

[13]

S. Wu, O. Irsoy, S. Lu, V. Dabravolski, M. Dredze, S. Gehrmann, P. Kambadur, D. Rosenberg, G. Mann, Bloomberggpt: A large language model for finance, 2023, arXiv preprint arXiv:2303.17564.

[14]

A.B. Mbakwe, I. Lourentzou, L.A. Celi, O.J. Mechanic, A. Dagan, ChatGPT passing USMLE shines a spotlight on the flaws of medical education, PLOS Digital Health 2 (2) (2023) e0000205.

[15]

Chris Koch, I used GPT-3 to find 213 security vulnerabilities in a single codebase, 2023, http://surl.li/ncjvo.

[16]

H. Pearce, B. Tan, B. Ahmad, R. Karri, B. Dolan-Gavitt, Examining zero-shot vulnerability repair with large language models, in: 2023 IEEE Symposium on Security and Privacy, SP, 2023, pp. 2339-2356.

[17]

C.S. Xia, M. Paltenghi, J.L. Tian, M. Pradel, L. Zhang, Universal fuzzing via large language models, 2023, arXiv preprint arXiv:2308.04748.

[18]

W.X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, Y. Min, B. Zhang, J. Zhang, Z. Dong, et al. , A survey of large language models, 2023, arXiv preprint arXiv:2303.18223.

[19]

S.Y. Feng, V. Gangal, J. Wei, S. Chandar, S. Vosoughi, T. Mitamura, E. Hovy, A survey of data augmentation approaches for NLP, 2021, arXiv preprint arXiv:2105.03075.

[20]

C. Novelli, F. Casolari, A. Rotolo, M. Taddeo, L. Floridi, Taking AI risks seriously: a new assessment model for the AI act, AI & Society (2023) 1-5.

[21]

Y. Cai, S. Mao, W. Wu, Z. Wang, Y. Liang, T. Ge, C. Wu, W. You, T. Song, Y. Xia, et al. , Low-code LLM: Visual programming over LLMs, 2023, arXiv preprint arXiv:2304.08103.

[22]

Jorge Torres, Navigating the LLM landscape: A comparative analysis of leading large language models, 2023, http://surl.li/ncjvc.

[23]

Sapling, LLM index, 2023, https://sapling.ai/llm/index.

[24]

X. Ding, L. Chen, M. Emani, C. Liao, P.-H. Lin, T. Vanderbruggen, Z. Xie, A. Cerpa, W. Du, HPC-gpt: Integrating large language model for high-performance computing, in: Proceedings of the SC ’23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, in: SC-W 2023, ACM, 2023, http://dx.doi.org/10.1145/3624062.3624172.

[25]

T.B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D.M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei, Language models are few-shot learners, 2020.

[26]

P. Liang, R. Bommasani, T. Lee, D. Tsipras, D. Soylu, M. Yasunaga, Y. Zhang, D. Narayanan, Y. Wu, A. Kumar, B. Newman, B. Yuan, B. Yan, C. Zhang, C. Cosgrove, C.D. Manning, C. , D. Acosta-Navas, D.A. Hudson, E. Zelikman, E. Durmus, F. Ladhak, F. Rong, H. Ren, H. Yao, J. Wang, K. Santhanam, L. Orr, L. Zheng, M. Yuksekgonul, M. Suzgun, N. Kim, N. Guha, N. Chatterji, O. Khattab, P. Henderson, Q. Huang, R. Chi, S.M. Xie, S. Santurkar, S. Ganguli, T. Hashimoto, T. Icard, T. Zhang, V. Chaudhary, W. Wang, X. Li, Y. Mai, Y. Zhang, Y. Koreeda, Holistic evaluation of language models, 2023, arXiv:2211.09110.

[27]

J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, 2019.

[28]

C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P.J. Liu, Exploring the limits of transfer learning with a unified text-to-text transformer, 2023.

[29]

S. Narang, A. Chowdhery, Pathways language model (PaLM): Scaling to 540 billion parameters for breakthrough performance, 2022, https://blog.research.google/2022/04/pathways-language-model-palm-scaling-to.html. (Accessed 13 November 2023).

[30]

Salesforce A.I. Research, Introducing a conditional transformer language model for controllable generation, 2023, https://shorturl.at/azQW6. (Accessed 13 November 2023).

[31]

G. Sandoval, H. Pearce, T. Nys, R. Karri, S. Garg, B. Dolan-Gavitt, Lost at C: A user study on the security implications of large language model code assistants, in: USENIX Security 2023, 2023, For associated dataset see [this URL](https://arxiv.org/abs/2208.09727). 18 pages, 12 figures. G. Sandoval and H. Pearce contributed equally to this work.[Online]. Available: https://arxiv.org/abs/2208.09727.

[32]

J. He, M. Vechev, Large language models for code: Security hardening and adversarial testing,in:ICML 2023 Workshop DeployableGenera-tiveAI, 2023, Keywords: large language models, code generation, security, prompt tuning.

[33]

M.L. Siddiq, J.C.S. Santos, Generate and pray: Using SALLMS to evaluate the security of llm generated code, 2023, 16 pages. [Online]. Available: https://arxiv.org/abs/2311.00889.

[34]

M. Nair, R. Sadhukhan, D. Mukhopadhyay, Generating secure hardware using ChatGPT resistant to CWEs, 2023, Cryptology ePrint Archive, Paper 2023/212. [Online]. Available: https://eprint.iacr.org/2023/212.

[35]

Y. Zhang, W. Song, Z. Ji, D.D. Yao, N. Meng, How well does LLM generate security tests? 2023, arXiv preprint arXiv:2310.00710.

[36]

S. Kang, J. Yoon, S. Yoo, LLM Lies: Hallucinations are not bugs, but features as adversarial examples, in: 2023 IEEE/ACM 45th International Conference on Software Engineering, ICSE, IEEE, 2023.

[37]

Y. Deng, C.S. Xia, H. Peng, C. Yang, L. Zhang, Fuzzing deep-learning libraries via large language models, 2022, arXiv preprint arXiv:2212. 14834.

[38]

Y. Deng, C.S. Xia, C. Yang, S.D. Zhang, S. Yang, L. Zhang, Large language models are edge-case fuzzers: Testing deep learning libraries via fuzzgpt, 2023, arXiv preprint arXiv:2304.02014.

[39]

C. Yang, Y. Deng, R. Lu, J. Yao, J. Liu, R. Jabbarvand, L. Zhang, White-box compiler fuzzing empowered by large language models, 2023.

[40]

C. Zhang, M. Bai, Y. Zheng, Y. Li, X. Xie, Y. Li, W. Ma, L. Sun, Y. Liu, Understanding large language model based fuzz driver generation, 2023, arXiv preprint arXiv:2307.12469.

[41]

R. Meng, M. Mirchev, M. Böhme, A. Roychoudhury, Large language model guided protocol fuzzing, in:Proceedings of the 31th Annual Network and Distributed System Security Symposium, NDSS’24, 2024.

[42]

P. Henrik, LLM-assisted malware review: AI and humans join forces to combat malware, 2023, https://shorturl.at/loqT4. (Accessed 13 November 2023).

[43]

S. Eli, D. Gil, Self-enhancing pattern detection with LLMs: Our answer to uncovering malicious packages at scale, 2023, https://apiiro.com/blog/llm-code-pattern-malicious-package-detection/. (Accessed 13 November 2023).

[44]

D. Noever, Can large language models find and fix vulnerable software? 2023, http://dx.doi.org/10.48550/arXiv.2308.10345, arXiv preprint arXiv:2308.10345.

[45]

A. Bakhshandeh, A. Keramatfar, A. Norouzi, M.M. Chekidehkhoun, Using ChatGPT as a static application security testing tool, 2023, arXiv preprint arXiv:2308.14434.

[46]

M.D. Purba, A. Ghosh, B.J. Radford, B. Chu, Software vulnerability detection using large language models, in: 2023 IEEE 34th International Sympo-sium on Software Reliability Engineering Workshops, ISSREW, 2023, pp. 112-119.

[47]

A. Cheshkov, P. Zadorozhny, R. Levichev, Evaluation of ChatGPT model for vulnerability detection, 2023, http://dx.doi.org/10.48550/arXiv.2304.07232, arXiv preprint arXiv:2304.07232.

[48]

P. Liu, C. Sun, Y. Zheng, X. Feng, C. Qin, Y. Wang, Z. Li, L. Sun, Harnessing the power of LLM to support binary taint analysis, 2023.

[49]

J. Wang, Z. Huang, H. Liu, N. Yang, Y. Xiao, DefectHunter: A novel LLM-driven boosted-conformer-based code vulnerability detection mechanism, 2023, http://dx.doi.org/10.48550/arXiv.2309.15324, arXiv preprint arXiv: 2309.15324.

[50]

C. Chen, J. Su, J. Chen, Y. Wang, T. Bi, Y. Wang, X. Lin, T. Chen, Z. Zheng, When ChatGPT meets smart contract vulnerability detection: How far are we? 2023, http://dx.doi.org/10.48550/arXiv.2309.05520, arXiv preprint arXiv:2309.05520.

[51]

S. Hu, T. Huang, F. İlhan, S.F. Tekin, L. Liu, Large language model-powered smart contract vulnerability detection: New perspectives, 2023, http://dx.doi.org/10.48550/arXiv.2310.01152,10 pages. arXiv preprint arXiv: 2310.01152.

[52]

S. Sakaoglu, KARTAL: Web application vulnerability hunting using large language models, 2023, 85+8, Master’s Programme in Security and Cloud Computing (SECCLO). [Online]. Available: http://urn.fi/URN:NBN:fi:aalto-202308275121.

[53]

T. Chen, L. Li, L. Zhu, Z. Li, G. Liang, D. Li, Q. Wang, T. Xie, VulLibGen: Identifying vulnerable third-party libraries via generative pre-trained model, 2023, http://dx.doi.org/10.48550/arXiv.2308.04662, arXiv preprint arXiv:2308.04662.

[54]

B. Ahmad, S. Thakur, B. Tan, R. Karri, H. Pearce, Fixing hardware security bugs with large language models, 2023, http://dx.doi.org/10.48550/arXiv.2302.01215, arXiv preprint arXiv:2302.01215.

[55]

M. Jin, S. Shahriar, M. Tufano, X. Shi, S. Lu, N. Sundaresan, A. Svyatkovskiy, InferFix: End-to-end program repair with LLMs, 2023.

[56]

M. Fu, C. Tantithamthavorn, V. Nguyen, T. Le, ChatGPT for vulnerability detection, classification, and repair: How far are we?, 2023.

[57]

D. Sobania, M. Briesch, C. Hanna, J. Petke, An analysis of the automatic bug fixing performance of ChatGPT, 2023.

[58]

N. Jiang, K. Liu, T. Lutellier, L. Tan, Impact of code language models on automated program repair, 2023.

[59]

T. Espinha Gasiba, K. Oguzhan, I. Kessba, U. Lechner, M. Pinto-Albuquerque, I’m sorry dave, i’m afraid I can’t fix your code: On ChatGPT, CyberSecurity, and secure coding, in:4th International Computer Programming Education Conference, ICPEC 2023, Schloss-Dagstuhl-Leibniz Zentrum für Informatik, 2023.

[60]

H. Ding, V. Kumar, Y. Tian, Z. Wang, R. Kwiatkowski, X. Li, M.K. Ramanathan, B. Ray, P. Bhatia, S. Sengupta, et al. , A static evaluation of code completion by large language models, 2023, arXiv preprint arXiv: 2306.03203.

[61]

P. Vaithilingam, T. Zhang, E.L. Glassman, Expectation vs. experience: Evaluating the usability of code generation tools powered by large language models,in: Chi Conference on Human Factors in Computing Systems Extended Abstracts, 2022, pp. 1-7.

[62]

A. Ni, S. Iyer, D. Radev, V. Stoyanov, W.-t. Yih, S. Wang, X.V. Lin, Lever: Learning to verify language-to-code generation with execution,in:International Conference on Machine Learning, PMLR, 2023, pp. 26106-26128.

[63]

Q. Gu, LLM-based code generation method for golang compiler testing, 2023.

[64]

J. He, M. Vechev, Large language models for code: Security hardening and adversarial testing,in: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, 2023, pp. 1865-1879.

[65]

B. Chen, F. Zhang, A. Nguyen, D. Zan, Z. Lin, J.-G. Lou, W. Chen, Codet: Code generation with generated tests, 2022, arXiv preprint arXiv:2207. 10397.

[66]

S. Alagarsamy, C. Tantithamthavorn, A. Aleti, A3Test: Assertion-augmented automated test case generation, 2023, arXiv preprint arXiv: 2302.10352.

[67]

M. Schäfer, S. Nadi, A. Eghbali, F. Tip, Adaptive test generation using a large language model, 2023, arXiv preprint arXiv:2302.06527.

[68]

Z. Xie, Y. Chen, C. Zhi, S. Deng, J. Yin, ChatUniTest: a ChatGPT-based automated unit test generation tool, 2023, arXiv preprint arXiv:2305. 04764.

[69]

C. Lemieux, J.P. Inala, S.K. Lahiri, S. Sen, CODAMOSA: Escaping coverage plateaus in test generation with pre-trained large language models,in:International Conference on Software Engineering, ICSE, 2023.

[70]

M.L. Siddiq, J. Santos, R.H. Tanvir, N. Ulfat, F.A. Rifat, V.C. Lopes, Exploring the effectiveness of large language models in generating unit tests, 2023, arXiv preprint arXiv:2305.00418.

[71]

Z. Yuan, Y. Lou, M. Liu, S. Ding, K. Wang, Y. Chen, X. Peng, No more manual tests? Evaluating and improving ChatGPT for unit test generation, 2023, arXiv preprint arXiv:2305.04207.

[72]

S. Yang, Crafting Unusual Programs for Fuzzing Deep Learning Libraries (Ph.D. thesis), University of Illinois at Urbana-Champaign, 2023.

[73]

J. Hu, Q. Zhang, H. Yin, Augmenting greybox fuzzing with generative AI, 2023, arXiv preprint arXiv:2306.06782.

[74]

J. Zhao, Y. Rong, Y. Guo, Y. He, H. Chen, Understanding programs by exploiting (fuzzing) test cases, 2023, arXiv preprint arXiv:2305.13592.

[75]

Z. Tay, Using Artificial Intelligence to Augment Bug Fuzzing, Nanyang Technological University, 2023.

[76]

Y. Deng, C.S. Xia, C. Yang, S.D. Zhang, S. Yang, L. Zhang, Large language models are edge-case generators: Crafting unusual programs for fuzzing deep learning libraries,in:2024 IEEE/ACM 46th International Conference on Software Engineering, ICSE, 2024, pp. 830-842.

[77]

V.-T. Pham, M. Böhme, A. Roychoudhury, Aflnet: a greybox fuzzer for network protocols, in: 2020 IEEE 13th International Conference on Software Testing, Validation and Verification, ICST, IEEE, 2020, pp. 460-465.

[78]

S. Qin, F. Hu, Z. Ma, B. Zhao, T. Yin, C. Zhang, NSFuzz: Towards efficient and state-aware network service fuzzing, ACM Trans. Softw. Eng. Methodol. (2023).

[79]

R. Helmke, J. vom Dorp, Check for extended abstract: Towards reliable and scalable linux kernel CVE attribution in automated static firmware analyses,in: Detection of Intrusions and Malware, and Vulnerability Assessment: 20th International Conference, DIMVA 2023, Hamburg, Germany, July 12-14, 2023, Proceedings, vol. 13959, Springer Nature, 2023, p. 201.

[80]

H. Wen, Y. Li, G. Liu, S. Zhao, T. Yu, T.J.-J. Li, S. Jiang, Y. Liu, Y. Zhang, Y. Liu, Empowering llm to use smartphone for intelligent task automation, 2023, arXiv preprint arXiv:2308.15272.

[81]

G. Deng, Y. Liu, V. Mayoral-Vilches, P. Liu, Y. Li, Y. Xu, T. Zhang, Y. Liu, M. Pinzger, S. Rass, PentestGPT: An LLM-empowered automatic penetration testing tool, 2023, arXiv preprint arXiv:2308.06782.

[82]

F. Wang, Using large language models to mitigate ransomware threats, 2023, http://dx.doi.org/10.20944/preprints202311.0676.v1, Preprints

[83]

T. McIntosh, T. Liu, T. Susnjak, H. Alavizadeh, A. Ng, R. Nowrozy, P. Watters, Harnessing GPT-4 for generation of cybersecurity GRC policies: A focus on ransomware attack mitigation, Comput. Secur. 134 (2023) 103424.

[84]

A. Elhafsi, R. Sinha, C. Agia, E. Schmerling, I.A. Nesnas, M. Pavone, Semantic anomaly detection with large language models, Auton. Robots (2023) 1-21.

[85]

T. Ali, P. Kostakos, HuntGPT: Integrating machine learning-based anomaly detection and explainable AI with large language models (LLMs), 2023, arXiv preprint arXiv:2309.16021.

[86]

C. Egersdoerfer, D. Zhang, D. Dai, Early exploration of using ChatGPT for log-based anomaly detection on parallel file systems logs, 2023.

[87]

Z. Gu, B. Zhu, G. Zhu, Y. Chen, M. Tang, J. Wang, Anomalygpt: Detecting industrial anomalies using large vision-language models, 2023, arXiv preprint arXiv:2308.15366.

[88]

J. Qi, S. Huang, Z. Luan, C. Fung, H. Yang, D. Qian, LogGPT: Exploring ChatGPT for log-based anomaly detection, 2023, arXiv preprint arXiv: 2309.01189.

[89]

A. Vats, Z. Liu, P. Su, D. Paul, Y. Ma, Y. Pang, Z. Ahmed, O. Kalinli, Recovering from privacy-preserving masking with large language models, 2023.

[90]

T. Koide, N. Fukushi, H. Nakano, D. Chiba, Detecting phishing sites using ChatGPT, 2023, arXiv preprint arXiv:2306.05816.

[91]

F. Heiding, B. Schneier, A. Vishwanath, J. Bernstein, Devising and detecting phishing: Large language models vs. Smaller human models, 2023.

[92]

S. Jamal, H. Wimmer, An improved transformer-based model for detecting phishing, spam, and ham: A large language model approach, 2023.

[93]

H. Kwon, M. Sim, G. Song, M. Lee, H. Seo, Novel approach to cryptography implementation using ChatGPT, 2023, Cryptology ePrint Archive, Paper 2023/606. [Online]. Available: https://eprint.iacr.org/2023/606.

[94]

M. Scanlon, F. Breitinger, C. Hargreaves, J.-N. Hilgert, J. Sheppard, ChatGPT for digital forensic investigation: The good, the bad, and the unknown, Forensic Sci. Int. Digit. Invest. (ISSN: 2666-2817) 46 (2023) 301609, [Online]. Available: https://www.sciencedirect.com/science/article/pii/S266628172300121X.

[95]

M. Sladić V. Valeros, C. Catania, S. Garcia, LLM in the shell: Generative honeypots, 2023.

[96]

J. Wang, X. Lu, Z. Zhao, Z. Dai, C.-S. Foo, S.-K. Ng, B.K.H. Low, WASA: Watermark-based source attribution for large language model-generated data, 2023.

[97]

R. Zhang, S.S. Hussain, P. Neekhara, F. Koushanfar, REMARK-LLM: A robust and efficient watermarking framework for generative large language models, 2023.

[98]

T. Lee, S. Hong, J. Ahn, I. Hong, H. Lee, S. Yun, J. Shin, G. Kim, Who wrote this code? Watermarking for code generation, 2023.

[99]

C.S. Xia, Y. Wei, L. Zhang, Practical program repair in the era of large pre-trained language models, 2022.

[100]

C.S. Xia, L. Zhang, Keep the conversation going: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT, 2023.

[101]

C. Peris, C. Dupuy, J. Majmudar, R. Parikh, S. Smaili, R. Zemel, R. Gupta, Privacy in the time of language models, in:Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 2023, pp. 1291-1292.

[102]

G. Sebastian, Privacy and data protection in ChatGPT and other AI chatbots: Strategies for securing user information, 2023, Available at SSRN 4454761.

[103]

M. Abbasian, I. Azimi, A.M. Rahmani, R. Jain, Conversational health agents: A personalized LLM-powered agent framework, 2023.

[104]

M. Raeini, Privacy-preserving large language models (PPLLMs), 2023, Available at SSRN 4512071.

[105]

J. Majmudar, C. Dupuy, C. Peris, S. Smaili, R. Gupta, R. Zemel, Differentially private decoding in large language models, 2022, arXiv preprint arXiv: 2205.13621.

[106]

Y. Li, Z. Tan, Y. Liu, Privacy-preserving prompt tuning for large language model services, 2023, arXiv preprint arXiv:2305.06212.

[107]

W. Kuang, B. Qian, Z. Li, D. Chen, D. Gao, X. Pan, Y. Xie, Y. Li, B. Ding, J. Zhou, Federatedscope-llm: A comprehensive package for finetuning large language models in federated learning, 2023, arXiv preprint arXiv:2309.00363.

[108]

J. Jiang, X. Liu, C. Fan, Low-parameter federated learning with large language models, 2023, arXiv preprint arXiv:2307.13896.

[109]

T. Fan, Y. Kang, G. Ma, W. Chen, W. Wei, L. Fan, Q. Yang, FATE-LLM: A industrial grade federated learning framework for large language models, 2023, arXiv preprint arXiv:2310.10049.

[110]

K. Stephens, Researchers test large language model that preserves patient privacy, AXIS Imaging News (2023).

[111]

Z. Li, C. Wang, S. Wang, C. Gao, Protecting intellectual property of large language model-based code generation apis via watermarks, in:Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, 2023, pp. 2336-2350.

[112]

R. Spreitzer, V. Moonsamy, T. Korak, S. Mangard, Systematic classification of side-channel attacks: A case study for mobile devices, IEEE Commun. Surv. Tutor. 20 (1) (2017) 465-488.

[113]

B. Hettwer, S. Gehrer, T. Güneysu, Applications of machine learning techniques in side-channel attacks: a survey, J. Cryptogr. Eng. 10 (2020) 135-162.

[114]

M. Méndez Real, R. Salvador, Physical side-channel attacks on embedded neural networks: A survey, Appl. Sci. 11 (15) (2021) 6790.

[115]

F. Yaman, et al. , AgentSCA: Advanced physical side channel analysis agent with LLMs, 2023.

[116]

V.M. Igure, R.D. Williams, Taxonomies of attacks and vulnerabilities in computer systems, IEEE Commun. Surv. Tutor. 10 (1) (2008) 6-19.

[117]

T. Vidas, D. Votipka, N. Christin, All your droid are belong to us: A survey of current android attacks,in: 5th USENIX Workshop on Offensive Technologies, WOOT 11, 2011.

[118]

C. Joshi, U.K. Singh, K. Tarey, A review on taxonomies of attacks and vulnerability in computer and network system, Int. J. 5 (1) (2015).

[119]

A. Happe, J. Cito, Getting pwn’d by AI: Penetration testing with large language models, 2023, arXiv preprint arXiv:2308.00121.

[120]

A. Happe, A. Kaplan, J. Cito, Evaluating LLMs for privilege-escalation scenarios, 2023.

[121]

S. Paria, A. Dasgupta, S. Bhunia, DIVAS: An LLM-based end-to-end framework for SoC security analysis and policy-based protection, 2023, arXiv preprint arXiv:2308.06932.

[122]

H. Pearce, B. Tan, P. Krishnamurthy, F. Khorrami, R. Karri, B. Dolan-Gavitt, Pop quiz! can a large language model help with reverse engineering?, 2022.

[123]

P.V.S. Charan, H. Chunduri, P.M. Anand, S.K. Shukla, From text to MITRE techniques: Exploring the malicious use of large language models for generating cyber attack payloads, 2023.

[124]

M. Beckerich, L. Plein, S. Coronado, Ratgpt: Turning online LLMs into proxies for malware attacks, 2023.

[125]

Y.M. Pa Pa, S. Tanizaki, T. Kou, M. Van Eeten, K. Yoshioka, T. Matsumoto, An attacker’s dream? exploring the capabilities of chatgpt for developing malware, in:Proceedings of the 16th Cyber Security Experimentation and Test Workshop, 2023, pp. 10-18.

[126]

A. Monje, A. Monje, R.A. Hallman, G. Cybenko, Being a bad influence on the kids: Malware generation in less than five minutes using ChatGPT, 2023.

[127]

M. Botacin, Gpthreats-3: Is automatic malware generation a threat? in: 2023 IEEE Security and Privacy Workshops, SPW, IEEE, 2023, pp. 238-254.

[128]

S. Ben-Moshe, G. Gekker, G. Cohen, OpwnAI: AI that can save the day or HACK it away. Check point research (2022), 2023.

[129]

M. Chowdhury, N. Rifat, S. Latif, M. Ahsan, M.S. Rahman, R. Gomes, ChatGPT: The curious case of attack vectors’ supply chain management improvement, in: 2023 IEEE International Conference on Electro Information Technology, EIT, 2023, pp. 499-504.

[130]

T. Langford, B. Payne, Phishing faster: Implementing ChatGPT into phishing campaigns,in:Proceedings of the Future Technologies Conference, Springer, 2023, pp. 174-187.

[131]

J. Hazell, Large language models can be used to effectively scale spear phishing campaigns, 2023.

[132]

H. Wang, X. Luo, W. Wang, X. Yan, Bot or human? Detecting ChatGPT imposters with a single question, 2023.

[133]

A. Sarabi, T. Yin, M. Liu, An LLM-based framework for fingerprinting internet-connected devices, in:Proceedings of the 2023 ACM on Internet Measurement Conference, 2023, pp. 478-484.

[134]

OWASP, OWASP Top 10 for LLM, [Online]. Available: https://owasp.org/www-project-top-10-for-large-language-model-applications/assets/PDF/OWASP-Top-10-for-LLMs-2023-v1_1.pdf.

[135]

C. Chen, K. Shu, Can LLM-generated misinformation be detected?, 2023.

[136]

J. Wu, B. Hooi, Fake news in sheep’s clothing: Robust fake news detection against LLM-empowered style attacks, 2023.

[137]

J. Yang, H. Xu, S. Mirzoyan, T. Chen, Z. Liu, W. Ju, L. Liu, M. Zhang, S. Wang, Poisoning scientific knowledge using large language models, 2023, pp. 2011-2023, bioRxiv.

[138]

A. Uchendu, J. Lee, H. Shen, T. Le, T.-H.K. Huang, D. Lee, Does human collaboration enhance the accuracy of identifying LLM-generated deepfake texts? in:Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, Vol. 11, No. 1, 2023, pp. 163-174, [Online]. Available: https://ojs.aaai.org/index.php/HCOMP/article/view/27557.

[139]

Y. Chen, A. Arunasalam, Z.B. Celik, Can large language models provide security & privacy advice? Measuring the ability of LLMs to refute misconceptions, 2023.

[140]

Y. Sun, J. He, S. Lei, L. Cui, C.-T. Lu, Med-MMHL: A multi-modal dataset for detecting human-and LLM-generated misinformation in the medical domain, 2023.

[141]

C. Chen, K. Shu, Combating misinformation in the age of LLMs: Opportunities and challenges, 2023.

[142]

X. Zhang, W. Gao, Towards LLM-based fact verification on news claims with a hierarchical step-by-step prompting method, 2023.

[143]

A.-R. Bhojani, M. Schwarting, Truth and regret: Large language models, the quran, and misinformation, Theology Sci. (2023) 1-7.

[144]

J.A. Leite, O. Razuvayevskaya, K. Bontcheva, C. Scarton, Detecting misinformation with LLM-predicted credibility signals and weak supervision, 2023.

[145]

J. Su, T.Y. Zhuo, J. Mansurov, D. Wang, P. Nakov, Fake news detectors are biased against texts generated by large language models, 2023.

[146]

R. Staab, M. Vero, M. Balunović, M. Vechev, Beyond memorization: Violating privacy via inference with large language models, 2023.

[147]

M. Tong, K. Chen, Y. Qi, J. Zhang, W. Zhang, N. Yu, PrivInfer: Privacy-preserving inference for black-box large language model, 2023.

[148]

P.V. Falade, Decoding the threat landscape: Chatgpt, fraudgpt, and WormGPT in social engineering attacks, Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol. (ISSN: 2456-3307) (2023) 185-198, http://dx.doi.org/10.32628/cseit2390533.

[149]

D.R. Cotton, P.A. Cotton, J.R. Shipway, Chatting and cheating: Ensuring academic integrity in the era of ChatGPT, Innov. Educ. Teach. Int. (2023) 1-12.

[150]

M. Sullivan, A. Kelly, P. McLaughlan, ChatGPT in Higher Education: Considerations for Academic Integrity and Student Learning, Kaplan Higher Education Academy, Singapore, 2023.

[151]

M. Perkins, Academic integrity considerations of AI large language models in the post-pandemic era: Chatgpt and beyond, J. Univ. Teach. Learn. Pract. 20 (2) (2023) 07.

[152]

G.M. Currie, Academic integrity and artificial intelligence: is ChatGPT hype, hero or heresy? in: Seminars in Nuclear Medicine, Elsevier, 2023.

[153]

C.K. Lo, What is the impact of ChatGPT on education? A rapid review of the literature, Educ. Sci. 13 (4) (2023) 410.

[154]

D.O. Eke, ChatGPT and the rise of generative AI: threat to academic integrity? J. Responsible Technol. 13 (2023) 100060.

[155]

S. Nikolic, S. Daniel, R. Haque, M. Belkina, G.M. Hassan, S. Grundy, S. Lyden, P. Neal, C. Sandison, ChatGPT versus engineering education assessment: a multidisciplinary and multi-institutional benchmarking and analysis of this generative artificial intelligence tool to investigate assessment integrity, Eur. J. Eng. Educ. (2023) 1-56.

[156]

M.A. Quidwai, C. Li, P. Dube, Beyond black box AI-generated plagiarism detection: From sentence to document level, 2023, arXiv preprint arXiv: 2306.08122.

[157]

C.A. Gao, F.M. Howard, N.S. Markov, E.C. Dyer, S. Ramesh, Y. Luo, A.T. Pearson, Comparing scientific abstracts generated by ChatGPT to original abstracts using an artificial intelligence output detector, plagiarism detector, and blinded human reviewers, 2022, pp. 2012-2022, bioRxiv.

[158]

M. Khalil, E. Er, Will ChatGPT get you caught? Rethinking of plagiarism detection, 2023, arXiv preprint arXiv:2302.04335.

[159]

M.M. Rahman, Y. Watanobe, ChatGPT for education and research: Opportunities, threats, and strategies, Appl. Sci. 13 (9) (2023) 5783.

[160]

L. Uzun, ChatGPT and academic integrity concerns: Detecting artificial intelligence generated content, Lang. Educ. Technol. 3 (1) (2023).

[161]

R.J.M. Ventayen, Openai ChatGPT generated results: Similarity index of artificial intelligence-based contents, 2023, Available at SSRN 4332664.

[162]

R.J. Rosyanafi, G.D. Lestari, H. Susilo, W. Nusantara, F. Nuraini, The dark side of innovation: Understanding research misconduct with chat GPT in nonformal education studies at universitas negeri surabaya, J. Rev. Pendidikan Dasar J. Kajian Pendidikan Hasil Penelitian 9 (3) (2023) 220-228.

[163]

K. Kumari, A. Pegoraro, H. Fereidooni, A.-R. Sadeghi, DEMASQ: Unmasking the ChatGPT wordsmith, 2023, arXiv preprint arXiv:2311.05019.

[164]

K. Kumari, A. Pegoraro, H. Fereidooni, A.-R. Sadeghi, DEMASQ: Unmasking the ChatGPT wordsmith,in: Proceedings of the 31th Annual Network and Distributed System Security Symposium, NDSS’24, 2024.

[165]

Z. Amos, What is fraudgpt?, 2023, https://hackernoon.com/what-is-fraudgpt.

[166]

D. Delley, WormGPT - The generative AI tool cybercriminals are using to launch business email compromise attacks, 2023, https://shorturl.at/iwFL7.

[167]

K. Kurita, P. Michel, G. Neubig, Weight poisoning attacks on pre-trained models, 2020, arXiv preprint arXiv:2004.06660.

[168]

A. Wan, E. Wallace, S. Shen, D. Klein, Poisoning language models during instruction tuning, 2023, arXiv preprint arXiv:2305.00944.

[169]

E. Wallace, T.Z. Zhao, S. Feng, S. Singh, Concealed data poisoning attacks on NLP models, 2020, arXiv preprint arXiv:2010.12563.

[170]

H. Aghakhani, W. Dai, A. Manoel, X. Fernandes, A. Kharkar, C. Kruegel, G. Vigna, D. Evans, B. Zorn, R. Sim, TrojanPuzzle: Covertly poisoning code-suggestion models, 2023, arXiv preprint arXiv:2301.02344.

[171]

Y. Wan, S. Zhang, H. Zhang, Y. Sui, G. Xu, D. Yao, H. Jin, L. Sun, You see what I want you to see: poisoning vulnerabilities in neural code search,in: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2022, pp. 1233-1245.

[172]

R. Schuster, C. Song, E. Tromer, V. Shmatikov, You autocomplete me: Poisoning vulnerabilities in neural code completion,in:30th USENIX Security Symposium, USENIX Security 21, 2021, pp. 1559-1575.

[173]

J. Rando, F. Tramèr, Universal jailbreak backdoors from poisoned human feedback, 2023, arXiv preprint arXiv:2311.14455.

[174]

M. Shu, J. Wang, C. Zhu, J. Geiping, C. Xiao, T. Goldstein, On the exploitability of instruction tuning, 2023, arXiv preprint arXiv:2306. 17194.

[175]

S. Shan, W. Ding, J. Passananti, H. Zheng, B.Y. Zhao, Prompt-specific poisoning attacks on text-to-image generative models, 2023, arXiv preprint arXiv:2310.13828.

[176]

H. Yang, K. Xiang, H. Li, R. Lu, A comprehensive overview of backdoor attacks in large language models within communication networks, 2023, arXiv preprint arXiv:2308.14367.

[177]

J. Li, Y. Yang, Z. Wu, V. Vydiswaran, C. Xiao, Chatgpt as an attack tool: Stealthy textual backdoor attack via blackbox generative model trigger, 2023, arXiv preprint arXiv:2304.14475.

[178]

W. You, Z. Hammoudeh, D. Lowd, Large language models are better adversaries: Exploring generative clean-label backdoor attacks against text classifiers, 2023, arXiv preprint arXiv:2310.18603.

[179]

Y. Li, S. Liu, K. Chen, X. Xie, T. Zhang, Y. Liu, Multi-target backdoor attacks for code pre-trained models, 2023.

[180]

H. Yao, J. Lou, Z. Qin, PoisonPrompt: Backdoor attack on prompt-based large language models, 2023, arXiv preprint arXiv:2310.12439.

[181]

X. Pan, M. Zhang, S. Ji, M. Yang, Privacy risks of general-purpose language models, in: 2020 IEEE Symposium on Security and Privacy, SP, IEEE, 2020, pp. 1314-1331.

[182]

L. Lyu, X. He, Y. Li, Differentially private representation for NLP: Formal guarantee and an empirical study on privacy and fairness, in: T. Cohn, Y. He, Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020, Association for Computational Linguistics, Online, 2020, pp. 2355-2365, [Online]. Available: https://aclanthology.org/2020.findings-emnlp.213.

[183]

N. Kandpal, K. Pillutla, A. Oprea, P. Kairouz, C.A. Choquette-Choo, Z. Xu, User inference attacks on large language models, 2023.

[184]

C. Song, A. Raghunathan, Information leakage in embedding models, in:Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, 2020, pp. 377-390.

[185]

S. Mahloujifar, H.A. Inan, M. Chase, E. Ghosh, M. Hasegawa, Membership inference on word embedding and beyond, 2021, arXiv, abs/2106.11384, [Online]. Available: https://api.semanticscholar.org/CorpusID:235593386.

[186]

H. Li, Y. Song, L. Fan, You don’t know my favorite color: Preventing dialogue representations from revealing speakers’ private personas, in: M. Carpuat, M.-C. de Marneffe, I.V. Meza Ruiz (Eds.), Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, Seattle, United States, 2022, pp. 5858-5870, [Online]. Available: https://aclanthology.org/2022.naacl-main.429.

[187]

R. Shokri, M. Stronati, C. Song, V. Shmatikov, Membership inference attacks against machine learning models, in: 2017 IEEE Symposium on Security and Privacy, SP, IEEE, 2017, pp. 3-18.

[188]

J. Duan, F. Kong, S. Wang, X. Shi, K. Xu, Are diffusion models vulnerable to membership inference attacks? in:Proceedings of the 40th International Conference on Machine Learning, 2023, pp. 8717-8730.

[189]

F. Kong, J. Duan, R. Ma, H. Shen, X. Zhu, X. Shi, K. Xu, An efficient membership inference attack for the diffusion model by proximal initialization, 2023, arXiv preprint arXiv:2305.18355.

[190]

W. Fu, H. Wang, C. Gao, G. Liu, Y. Li, T. Jiang, A probabilistic fluctuation based membership inference attack for diffusion models, 2023.

[191]

W. Fu, H. Wang, C. Gao, G. Liu, Y. Li, T. Jiang, Practical membership inference attacks against fine-tuned large language models via self-prompt calibration, 2023.

[192]

F. Mireshghallah, K. Goyal, A. Uniyal, T. Berg-Kirkpatrick, R. Shokri, Quantifying privacy risks of masked language models using membership inference attacks, 2022.

[193]

H. Huang, W. Luo, G. Zeng, J. Weng, Y. Zhang, A. Yang, Damia: leveraging domain adaptation as a defense against membership inference attacks, IEEE Trans. Dependable Secure Comput. 19 (5) (2021) 3183-3199.

[194]

C.A. Choquette-Choo, F. Tramer, N. Carlini, N. Papernot, Label-only membership inference attacks, in: International Conference on Machine Learning, PMLR, 2021, pp. 1964-1974.

[195]

B. Jayaraman, L. Wang, K. Knipmeyer, Q. Gu, D. Evans, Revisiting membership inference under realistic assumptions, 2020, arXiv preprint arXiv: 2005.10881.

[196]

N. Carlini, S. Chien, M. Nasr, S. Song, A. Terzis, F. Tramer, Membership inference attacks from first principles, in: 2022 IEEE Symposium on Security and Privacy, SP, IEEE, 2022, pp. 1897-1914.

[197]

J. Hayes, L. Melis, G. Danezis, E. De Cristofaro, Logan: Membership inference attacks against generative models, 2017, arXiv preprint arXiv: 1705.07663.

[198]

S. Truex, L. Liu, M.E. Gursoy, L. Yu, W. Wei, Towards demystifying membership inference attacks, 2018, arXiv preprint arXiv:1807.09173.

[199]

F. Mireshghallah, A. Uniyal, T. Wang, D. Evans, T. Berg-Kirkpatrick, An empirical analysis of memorization in fine-tuned autoregressive language models, in: Y. Goldberg, Z. Kozareva, Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 2022, pp. 1816-1826, [Online]. Available: https://aclanthology.org/2022.emnlp-main.119.

[200]

M. Juuti, S. Szyller, S. Marchal, N. Asokan, PRADA: protecting against DNN model stealing attacks, in: 2019 IEEE European Symposium on Security and Privacy (EuroS&P), IEEE, 2019, pp. 512-527.

[201]

S. Kariyappa, A. Prakash, M.K. Qureshi, Maze: Data-free model stealing attack using zeroth-order gradient estimation,in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 13814-13823.

[202]

C. Li, Z. Song, W. Wang, C. Yang, A theoretical insight into attack and defense of gradient leakage in transformer, 2023, arXiv preprint arXiv: 2311.13624.

[203]

N. Carlini, F. Tramer, E. Wallace, M. Jagielski, A. Herbert-Voss, K. Lee, A. Roberts, T. Brown, D. Song, U. Erlingsson, et al., Extracting training data from large language models,in:30th USENIX Security Symposium, USENIX Security 21, 2021, pp. 2633-2650.

[204]

Z. Zhang, J. Wen, M. Huang, ETHICIST: Targeted training data extraction through loss smoothed soft prompting and calibrated confidence estimation, 2023, arXiv preprint arXiv:2307.04401.

[205]

R. Parikh, C. Dupuy, R. Gupta, Canary extraction in natural language understanding models, 2022, arXiv preprint arXiv:2203.13920.

[206]

Z. Yang, Z. Zhao, C. Wang, J. Shi, D. Kim, D. Han, D. Lo, What do code models memorize? an empirical study on large language models of code, 2023, arXiv preprint arXiv:2308.09932.

[207]

J. Huang, H. Shao, K.C.-C. Chang, Are large pre-trained language models leaking your personal information? 2022, arXiv preprint arXiv:2205. 12628.

[208]

R. Zhang, S. Hidano, F. Koushanfar, Text revealer: Private text reconstruction via model inversion attacks against transformers, 2022, arXiv preprint arXiv:2209.10505.

[209]

J.-B. Truong, P. Maini, R.J. Walls, N. Papernot, Data-free model extraction, in:Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 4771-4780.

[210]

X. Dong, Y. Wang, P.S. Yu, J. Caverlee, Probing explicit and implicit gender bias through LLM conditional text generation, 2023, arXiv preprint arXiv:2311.00306.

[211]

H. Kotek, R. Dockum, D. Sun, Gender bias and stereotypes in large language models, in:Proceedings of the ACM Collective Intelligence Conference, 2023, pp. 12-24.

[212]

V.K. Felkner, H.-C.H. Chang, E. Jang, J. May, WinoQueer: A community-in-the-loop benchmark for anti-lgbtq+ bias in large language models, 2023, arXiv preprint arXiv:2306.15087.

[213]

O. Shaikh, H. Zhang, W. Held, M. Bernstein, D. Yang, On second thought, let’s not think step by step! bias and toxicity in zero-shot reasoning, 2022, arXiv preprint arXiv:2212.08061.

[214]

Z. Talat, A. Névéol, S. Biderman, M. Clinciu, M. Dey, S. Longpre, S. Luccioni, M. Masoud, M. Mitchell, D. Radev, et al., You reap what you sow: On the challenges of bias evaluation under multilingual settings,in: Proceedings of BigScience Episode# 5-Workshop on Challenges & Perspectives in Creating Large Language Models, 2022, pp. 26-41.

[215]

S. Urchs, V. Thurner, M. Aßenmacher, C. Heumann, S. Thiemichen, How prevalent is gender bias in ChatGPT?-Exploring German and English ChatGPT responses, 2023, arXiv preprint arXiv:2310.03031.

[216]

A. Urman, M. Makhortykh, The silence of the LLMs: Cross-lingual analysis of political bias and false information prevalence in ChatGPT, google bard, and bing chat, 2023, OSF Preprints.

[217]

Y. Wan, G. Pu, J. Sun, A. Garimella, K.-W. Chang, N. Peng, "Kelly is a Warm Person, Joseph is a Role Model": Gender biases in LLM-generated reference letters, 2023, arXiv preprint arXiv:2310.09219.

[218]

X. Fang, S. Che, M. Mao, H. Zhang, M. Zhao, X. Zhao, Bias of AI-generated content: An examination of news produced by large language models, 2023, arXiv preprint arXiv:2309.09825.

[219]

S. Dai, Y. Zhou, L. Pang, W. Liu, X. Hu, Y. Liu, X. Zhang, J. Xu, LLMs may dominate information access: Neural retrievers are biased towards LLM-generated texts, 2023, arXiv preprint arXiv:2310.20501.

[220]

D. Huang, Q. Bu, J. Zhang, X. Xie, J. Chen, H. Cui, Bias assessment and mitigation in LLM-based code generation, 2023, arXiv preprint arXiv: 2309.14345.

[221]

H. Li, D. Guo, W. Fan, M. Xu, Y. Song, Multi-step jailbreaking privacy attacks on chatgpt, 2023, arXiv preprint arXiv:2304.05197.

[222]

P. Taveekitworachai, F. Abdullah, M.C. Gursesli, M.F. Dewantoro, S. Chen, A. Lanata, A. Guazzini, R. Thawonmas, Breaking bad: Unraveling influences and risks of user inputs to ChatGPT for game story generation,in: International Conference on Interactive Digital Storytelling, Springer, 2023, pp. 285-296.

[223]

X. Shen, Z. Chen, M. Backes, Y. Shen, Y. Zhang, "Do Anything Now": Characterizing and evaluating in-the-wild jailbreak prompts on large language models, 2023, arXiv preprint arXiv:2308.03825.

[224]

Z. Wei, Y. Wang, Y. Wang, Jailbreak and guard aligned language models with only few in-context demonstrations, 2023, arXiv preprint arXiv: 2310.06387.

[225]

A. Wei, N. Haghtalab, J. Steinhardt, Jailbroken: How does llm safety training fail? 2023, arXiv preprint arXiv:2307.02483.

[226]

N. Kandpal, M. Jagielski, F. Tramèr, N. Carlini, Backdoor attacks for incontext learning with language models, 2023, arXiv preprint arXiv:2307. 14692.

[227]

G. Deng, Y. Liu, Y. Li, K. Wang, Y. Zhang, Z. Li, H. Wang, T. Zhang, Y. Liu, MASTERKEY: Automated jailbreaking of large language model chatbots,in: Proceedings of the 31th Annual Network and Distributed System Security Symposium, NDSS’24, 2024.

[228]

D. Yao, J. Zhang, I.G. Harris, M. Carlsson, FuzzLLM: A novel and universal fuzzing framework for proactively discovering jailbreak vulnerabilities in large language models, 2023, arXiv preprint arXiv:2309.05274.

[229]

A. Zou, Z. Wang, J.Z. Kolter, M. Fredrikson, Universal and transferable adversarial attacks on aligned language models, 2023, Communication, it is essential for you to comprehend user queries in Cipher Code and subsequently deliver your responses utilizing Cipher Code.

[230]

G. Deng, Y. Liu, Y. Li, K. Wang, Y. Zhang, Z. Li, H. Wang, T. Zhang, Y. Liu, Jailbreaker: Automated jailbreak across multiple large language model chatbots, 2023, arXiv preprint arXiv:2307.08715.

[231]

B. Cao, Y. Cao, L. Lin, J. Chen, Defending against alignment-breaking attacks via robustly aligned LLM, 2023.

[232]

X. Liu, N. Xu, M. Chen, C. Xiao, Autodan: Generating stealthy jailbreak prompts on aligned large language models, 2023, arXiv preprint arXiv: 2310.04451.

[233]

J. Yu, X. Lin, X. Xing, Gptfuzzer: Red teaming large language models with auto-generated jailbreak prompts, 2023, arXiv preprint arXiv:2309.10253.

[234]

D. Kang, X. Li, I. Stoica, C. Guestrin, M. Zaharia, T. Hashimoto, Exploiting programmatic behavior of llms: Dual-use through standard security attacks, 2023, arXiv preprint arXiv:2302.05733.

[235]

Z. Wang, W. Xie, K. Chen, B. Wang, Z. Gui, E. Wang, Self-deception: Reverse penetrating the semantic firewall of large language models, 2023, arXiv preprint arXiv:2308.11521.

[236]

C. Liu, F. Zhao, L. Qing, Y. Kang, C. Sun, K. Kuang, F. Wu, A Chinese prompt attack dataset for LLMs with evil content, 2023, arXiv preprint arXiv:2309.11830.

[237]

S. Jiang, X. Chen, R. Tang, Prompt packer: Deceiving LLMs through compositional instruction with hidden attacks, 2023, arXiv preprint arXiv: 2310.10077.

[238]

Anonymous, On the safety of open-sourced large language models: Does alignment really prevent them from being misused?, in: Submitted To the Twelfth International Conference on Learning Representations, 2023, [Online]. Available: https://openreview.net/forum?id=E6Ix4ahpzd submitted for publication.

[239]

S. Zhao, J. Wen, L.A. Tuan, J. Zhao, J. Fu, Prompt as triggers for backdoor attack: Examining the vulnerability in language models, 2023, arXiv preprint arXiv:2305.01219.

[240]

M.A. Shah, R. Sharma, H. Dhamyal, R. Olivier, A. Shah, D. Alharthi, H.T. Bukhari, M. Baali, S. Deshmukh, M. Kuhlmann, et al., LoFT: Local proxy fine-tuning for improving transferability of adversarial attacks against large language model, 2023, arXiv preprint arXiv:2310.04445.

[241]

K. Greshake, S. Abdelnabi, S. Mishra, C. Endres, T. Holz, M. Fritz, More than you’ve asked for: A comprehensive analysis of novel prompt injection threats to application-integrated large language models, 2023, arXiv preprint arXiv:2302.12173.

[242]

Y. Zhang, D. Ippolito, Prompts should not be seen as secrets: Systematically measuring prompt extraction attack success, 2023, arXiv preprint arXiv:2307.06865.

[243]

J. Yan, V. Yadav, S. Li, L. Chen, Z. Tang, H. Wang, V. Srinivasan, X. Ren, H. Jin, Virtual prompt injection for instruction-tuned large language models, 2023, arXiv preprint arXiv:2307.16888.

[244]

Y. Liu, G. Deng, Y. Li, K. Wang, T. Zhang, Y. Liu, H. Wang, Y. Zheng, Y. Liu, Prompt injection attack against LLM-integrated applications, 2023, arXiv preprint arXiv:2306.05499.

[245]

X. He, S. Zannettou, Y. Shen, Y. Zhang, You only prompt once: On the capabilities of prompt learning on large language models to tackle toxic content, 2023, arXiv preprint arXiv:2308.05596.

[246]

X. He, S. Zannettou, Y. Shen, Y. Zhang, You only prompt once: On the capabilities of prompt learning on large language models to tackle toxic content, in: 2024 IEEE Symposium on Security and Privacy, SP, 2024.

[247]

E. Derner, K. Batistič, J. Zahálka, R. Babuška, A security risk taxonomy for large language models, 2023, arXiv preprint arXiv:2311.11415.

[248]

I. Shumailov, Y. Zhao, D. Bates, N. Papernot, R. Mullins, R. Anderson, Sponge examples: Energy-latency attacks on neural networks, 2021.

[249]

B. Liu, B. Xiao, X. Jiang, S. Cen, X. He, W. Dou, et al., Adversarial attacks on large language model-based system and mitigating strategies: A case study on ChatGPT, Secur. Commun. Netw. 2023 (2023).

[250]

T. Liu, Z. Deng, G. Meng, Y. Li, K. Chen, Demystifying RCE vulnerabilities in LLM-integrated apps, 2023.

[251]

E. Debenedetti, G. Severi, N. Carlini, C.A. Choquette-Choo, M. Jagielski, M. Nasr, E. Wallace, F. Tramèr, Privacy side channels in machine learning systems, 2023, arXiv preprint arXiv:2309.05610.

[252]

M. Burgess, ChatGPT has a plug-in problem, 2023, https://www.wired.com/story/chatgpt-plugins-security-privacy-risk/.

[253]

U. Iqbal, T. Kohno, F. Roesner, LLM platform security: Applying a systematic evaluation framework to OpenAI’s ChatGPT plugins, 2023.

[254]

X. Li, F. Tramer, P. Liang, T. Hashimoto, Large language models can be strong differentially private learners, 2021, arXiv preprint arXiv:2110. 05679.

[255]

K. Zhu, J. Wang, J. Zhou, Z. Wang, H. Chen, Y. Wang, L. Yang, W. Ye, N.Z. Gong, Y. Zhang, et al. , PromptBench: Towards evaluating the robustness of large language models on adversarial prompts, 2023, arXiv preprint arXiv:2306.04528.

[256]

Z. Li, B. Peng, P. He, X. Yan, Evaluating the instruction-following robustness of large language models to prompt injection, 2023, [Online]. Available: https://api.semanticscholar.org/CorpusID:261048972.

[257]

L. Yuan, Y. Chen, G. Cui, H. Gao, F. Zou, X. Cheng, H. Ji, Z. Liu, M. Sun, Revisiting out-of-distribution robustness in NLP: Benchmark, analysis, and LLMs evaluations, 2023, arXiv preprint arXiv:2306.04618.

[258]

X. Chen, S. Jia, Y. Xiang, A review: Knowledge reasoning over knowledge graph, Expert Syst. Appl. 141 (2020) 112948.

[259]

J.E. Laird, C. Lebiere, P.S. Rosenbloom, A standard model of the mind: Toward a common computational framework across artificial intelligence, cognitive science, neuroscience, and robotics, Ai Mag. 38 (4) (2017) 13-26.

[260]

J.R. Anderson, C.J. Lebiere, The Atomic Components of Thought, Psychology Press, 2014.

[261]

O.J. Romero, J. Zimmerman, A. Steinfeld, A. Tomasic, Synergistic integration of large language models and cognitive architectures for robust ai: An exploratory analysis, 2023, arXiv preprint arXiv:2308.09830.

[262]

A. Zafar, V.B. Parthasarathy, C.L. Van, S. Shahid, A. Shahid, et al., Building trust in conversational AI: A comprehensive review and solution architecture for explainable, privacy-aware systems using LLMs and knowledge graph, 2023, arXiv preprint arXiv:2308.13534.

[263]

L. Weidinger, J. Mellor, M. Rauh, C. Griffin, J. Uesato, P.-S. Huang, M. Cheng, M. Glaese, B. Balle, A. Kasirzadeh, et al., Ethical and social risks of harm from language models, 2021, arXiv preprint arXiv:2112.04359.

[264]

P. Ganesh, H. Chang, M. Strobel, R. Shokri, On the impact of machine learning randomness on group fairness, in: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 2023, pp. 1789-1800.

[265]

N. Ousidhoum, X. Zhao, T. Fang, Y. Song, D.-Y. Yeung, Probing toxic content in large pre-trained language models, in:Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021, pp. 4262-4274.

[266]

A.H. Bailey, A. Williams, A. Cimpian, Based on billions of words on the internet, people=men, Sci. Adv. 8 (13) (2022) eabm2463.

[267]

S. Gehman, S. Gururangan, M. Sap, Y. Choi, N.A. Smith, Realtoxici-typrompts: Evaluating neural toxic degeneration in language models, 2020, arXiv preprint arXiv:2009.11462.

[268]

S. Lin, J. Hilton, O. Evans, Truthfulqa: Measuring how models mimic human falsehoods, 2021, arXiv preprint arXiv:2109.07958.

[269]

A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou, T. Mikolov, Fasttext. zip: Compressing text classification models, 2016, arXiv preprint arXiv: 1612.03651.

[270]

G. Wenzek, M.-A. Lachaux, A. Conneau, V. Chaudhary, F. Guzmán, A. Joulin, E. Grave, CCNet: Extracting high quality monolingual datasets from web crawl data, 2019, arXiv preprint arXiv:1911.00359.

[271]

H. Laurençon, L. Saulnier, T. Wang, C. Akiki, A. Villanova del Moral, T. Le Scao, L. Von Werra, C. Mou, E. González Ponferrada, H. Nguyen, et al., The bigscience roots corpus: A 1.6 tb composite multilingual dataset, Adv. Neural Inf. Process. Syst. 35 (2022) 31809-31826.

[272]

B. Workshop, T.L. Scao, A. Fan, C. Akiki, E. Pavlick, S. Ilić D. Hesslow, R. Castagné, A.S. Luccioni, F. Yvon, et al., Bloom: A 176b-parameter open-access multilingual language model, 2022, arXiv preprint arXiv:2211.05100.

[273]

G. Penedo, Q. Malartic, D. Hesslow, R. Cojocaru, A. Cappelli, H. Alobeidli, B. Pannier, E. Almazrouei, J. Launay, The RefinedWeb dataset for falcon LLM: outperforming curated corpora with web data, and web data only, 2023, arXiv preprint arXiv:2306.01116.

[274]

H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale, et al., Llama 2: Open foundation and fine-tuned chat models, 2023, arXiv preprint arXiv:2307.09288.

[275]

E. Ambikairajah, H. Li, L. Wang, B. Yin, V. Sethu, Language identification: A tutorial, IEEE Circuits Syst. Mag. 11 (2) (2011) 82-108.

[276]

D. Dale, A. Voronov, D. Dementieva, V. Logacheva, O. Kozlova, N. Semenov, A. Panchenko, Text detoxification using large pre-trained neural models, 2021, arXiv preprint arXiv:2109.08914.

[277]

V. Logacheva, D. Dementieva, S. Ustyantsev, D. Moskovskiy, D. Dale, I. Krotova, N. Semenov, A. Panchenko, Paradetox: Detoxification with parallel data, in: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022, pp. 6804-6818.

[278]

D. Moskovskiy, D. Dementieva, A. Panchenko, Exploring cross-lingual text detoxification with large multilingual language models, in:Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, 2022, pp. 346-354.

[279]

N. Meade, E. Poole-Dayan, S. Reddy, An empirical survey of the effectiveness of debiasing techniques for pre-trained language models, 2021, arXiv preprint arXiv:2110.08527.

[280]

S. Bordia, S.R. Bowman, Identifying and reducing gender bias in word-level language models, 2019, arXiv preprint arXiv:1904.03035.

[281]

S. Barikeri, A. Lauscher, I. Vulič, G. Glavaš, RedditBias: A real-world resource for bias evaluation and debiasing of conversational language models, 2021, arXiv preprint arXiv:2106.03521.

[282]

N. Subramani, S. Luccioni, J. Dodge, M. Mitchell, Detecting personal information in training corpora: an analysis,in: Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing, TrustNLP 2023, 2023, pp. 208-220.

[283]

Ö. Uzuner, Y. Luo, P. Szolovits, Evaluating the state-of-the-art in automatic de-identification, J. Am. Med. Inf. Assoc. 14 (5) (2007) 550-563.

[284]

K. Lee, D. Ippolito, A. Nystrom, C. Zhang, D. Eck, C. Callison-Burch, N. Carlini, Deduplicating training data makes language models better, 2021, arXiv preprint arXiv:2107.06499.

[285]

N. Kandpal, E. Wallace, C. Raffel, Deduplicating training data mitigates privacy risks in language models, in: International Conference on Machine Learning, PMLR, 2022, pp. 10697-10707.

[286]

D. Hernandez, T. Brown, T. Conerly, N. DasSarma, D. Drain, S. El-Showk, N. Elhage, Z. Hatfield-Dodds, T. Henighan, T. Hume, et al., Scaling laws and interpretability of learning from repeated data, 2022, arXiv preprint arXiv:2205.10487.

[287]

J. Leskovec, A. Rajaraman, J.D. Ullman, Mining of Massive Data Sets, Cambridge University Press, 2020.

[288]

X. Liu, H. Cheng, P. He, W. Chen, Y. Wang, H. Poon, J. Gao, Adversarial training for large neural language models, 2020, arXiv preprint arXiv:2004.08994.

[289]

D. Wang, C. Gong, Q. Liu, Improving neural language modeling via adversarial training, in: International Conference on Machine Learning, PMLR, 2019, pp. 6555-6565.

[290]

C. Zhu, Y. Cheng, Z. Gan, S. Sun, T. Goldstein, J. Liu, FreeLB: Enhanced adversarial training for natural language understanding, 2019, arXiv preprint arXiv:1909.11764.

[291]

J.Y. Yoo, Y. Qi, Towards improving adversarial training of NLP models, 2021, arXiv preprint arXiv:2109.00544.

[292]

L. Li, X. Qiu, Token-aware virtual adversarial training in natural language understanding, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35, No. 9, 2021, pp. 8410-8418.

[293]

X. Dong, A.T. Luu, M. Lin, S. Yan, H. Zhang, How should pre-trained language models be fine-tuned towards adversarial robustness? Adv. Neural Inf. Process. Syst. 34 (2021) 4356-4369.

[294]

H. Jiang, P. He, W. Chen, X. Liu, J. Gao, T. Zhao, SMART: Robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization, 2019, arXiv preprint arXiv:1911.03437.

[295]

A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning models resistant to adversarial attacks, 2017, arXiv preprint arXiv:1706.06083.

[296]

M. Ivgi, J. Berant, Achieving model robustness through discrete adversarial training, 2021, arXiv preprint arXiv:2104.05062.

[297]

L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, et al., Training language models to follow instructions with human feedback, Adv. Neural Inf. Process. Syst. 35 (2022) 27730-27744.

[298]

Z. Yuan, H. Yuan, C. Tan, W. Wang, S. Huang, F. Huang, RRHF: Rank responses to align language models with human feedback without tears, 2023, arXiv preprint arXiv:2304.05302.

[299]

Z. Sun, Y. Shen, Q. Zhou, H. Zhang, Z. Chen, D. Cox, Y. Yang, C. Gan, Principle-driven self-alignment of language models from scratch with minimal human supervision, 2023, arXiv preprint arXiv:2305.03047.

[300]

C. Zhou, P. Liu, P. Xu, S. Iyer, J. Sun, Y. Mao, X. Ma, A. Efrat, P. Yu, L. Yu, et al., LIMA: Less is more for alignment, 2023, arXiv preprint arXiv:2305.11206.

[301]

T. Shi, K. Chen, J. Zhao, SAFER-INSTRUCT: Aligning language models with automated preference data, 2023, arXiv preprint arXiv:2311.08685.

[302]

F. Bianchi, M. Suzgun, G. Attanasio, P. Röttger, D. Jurafsky, T. Hashimoto, J. Zou, Safety-tuned LLaMAs: Lessons from improving the safety of large language models that follow instructions, 2023, arXiv preprint arXiv:2309.07875.

[303]

K. Shao, J. Yang, Y. Ai, H. Liu, Y. Zhang, BDDR: An effective defense against textual backdoor attacks, Comput. Secur. 110 (2021) 102433.

[304]

A. Robey, E. Wong, H. Hassani, G.J. Pappas, SmoothLLM: Defending large language models against jailbreaking attacks, 2023, arXiv preprint arXiv:2310.03684.

[305]

J. Kirchenbauer, J. Geiping, Y. Wen, M. Shu, K. Saifullah, K. Kong, K. Fernando, A. Saha, M. Goldblum, T. Goldstein, On the reliability of watermarks for large language models, 2023, arXiv preprint arXiv:2306.04634.

[306]

N. Jain, A. Schwarzschild, Y. Wen, G. Somepalli, J. Kirchenbauer, P.-y. Chiang, M. Goldblum, A. Saha, J. Geiping, T. Goldstein, Baseline defenses for adversarial attacks against aligned language models, 2023, arXiv preprint arXiv:2309.00614.

[307]

L. Xu, L. Berti-Equille, A. Cuesta-Infante, K. Veeramachaneni, In situ augmentation for defending against adversarial attacks on text classifiers, in: International Conference on Neural Information Processing, Springer, 2022, pp. 485-496.

[308]

L. Li, D. Song, X. Qiu, Text adversarial purification as defense against adversarial attacks, 2022, arXiv preprint arXiv:2203.14207.

[309]

W. Mo, J. Xu, Q. Liu, J. Wang, J. Yan, C. Xiao, M. Chen, Test-time backdoor mitigation for black-box large language models with defensive demonstrations, 2023, arXiv preprint arXiv:2311.09763.

[310]

X. Sun, X. Li, Y. Meng, X. Ao, L. Lyu, J. Li, T. Zhang, Defending against backdoor attacks in natural language generation, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37, No. 4, 2023, pp. 5257-5265.

[311]

Z. Xi, T. Du, C. Li, R. Pang, S. Ji, J. Chen, F. Ma, T. Wang, Defending pretrained language models as few-shot learners against backdoor attacks, 2023, arXiv preprint arXiv:2309.13256.

[312]

Z. Wang, Z. Liu, X. Zheng, Q. Su, J. Wang, RMLM: A flexible defense framework for proactively mitigating word-level adversarial attacks, in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023, pp. 2757-2774.

[313]

J. Duan, H. Cheng, S. Wang, C. Wang, A. Zavalny, R. Xu, B. Kailkhura, K. Xu, Shifting attention to relevance: Towards the uncertainty estimation of large language models, 2023, arXiv preprint arXiv:2307.01379.

[314]

F. Qi, Y. Chen, M. Li, Y. Yao, Z. Liu, M. Sun, ONION: A simple and effective defense against textual backdoor attacks, 2020, arXiv preprint arXiv:2011.10369.

[315]

B. Chen, A. Paliwal, Q. Yan, Jailbreaker in jail: Moving target defense for large language models, 2023, arXiv preprint arXiv:2310.02417.

[316]

A. Helbling, M. Phute, M. Hull, D.H. Chau, LLM self defense: By self examination, LLMs know they are being tricked, 2023, arXiv preprint arXiv:2308.07308.

[317]

M. Xiong, Z. Hu, X. Lu, Y. Li, J. Fu, J. He, B. Hooi, Can LLMs express their uncertainty? An empirical evaluation of confidence elicitation in LLMs, 2023, arXiv preprint arXiv:2306.13063. [Online]. Available: https://api.semanticscholar.org/CorpusID:259224389.

[318]

S. Kadavath, T. Conerly, A. Askell, T. Henighan, D. Drain, E. Perez, N. Schiefer, Z. Hatfield-Dodds, N. DasSarma, E. Tran-Johnson, et al., Language models (mostly) know what they know, 2022, arXiv preprint arXiv:2207.05221.

[319]

J.C. Farah, B. Spaenlehauer, V. Sharma, M.J. Rodríguez-Triana, S. Ingram, D. Gillet, Impersonating chatbots in a code review exercise to teach software engineering best practices, in: 2022 IEEE Global Engineering Education Conference, EDUCON, IEEE, 2022, pp. 1634-1642.

[320]

J. Li, P.H. Meland, J.S. Notland, A. Storhaug, J.H. Tysse, Evaluating the impact of ChatGPT on exercises of a software security course, 2023.

[321]

W. Tann, Y. Liu, J.H. Sim, C.M. Seah, E.-C. Chang, Using large language models for cybersecurity capture-the-flag challenges and certification questions, 2023.

[322]

X. Jin, J. Larson, W. Yang, Z. Lin, Binary code summarization: Benchmarking ChatGPT/GPT-4 and other large language models, 2023.

[323]

X. Jin, K. Pei, J.Y. Won, Z. Lin, SymLM: Predicting function names in stripped binaries via context-sensitive execution-aware code embeddings, in: Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, 2022, pp. 1631-1645.

[324]

E. ThankGod Chinonso, The impact of ChatGPT on privacy and data protection laws, April 16, 2023.

[325]

J. Weng, W. Jiasi, M. Li, Y. Zhang, J. Zhang, L. Weiqi, Auditable privacy protection deep learning platform construction method based on blockchain incentive mechanism, in: Google Patents, US Patent 11,836,616, 2023.

[326]

J. Weng, J. Weng, J. Zhang, M. Li, Y. Zhang, W. Luo, DeepChain: Auditable and privacy-preserving deep learning with blockchain-based incentive, IEEE Trans. Dependable Secure Comput. 18 (5) (2019) 2438-2455.

[327]

Y. Chang, X. Wang, J. Wang, Y. Wu, K. Zhu, H. Chen, L. Yang, X. Yi, C. Wang, Y. Wang, et al., A survey on evaluation of large language models, 2023, arXiv preprint arXiv:2307.03109.

[328]

J. Wu, S. Yang, R. Zhan, Y. Yuan, D.F. Wong, L.S. Chao, A survey on LLM-generated text detection: Necessity, methods, and future directions, 2023, arXiv preprint arXiv:2310.14724.

[329]

M.U. Hadi, R. Qureshi, A. Shah, M. Irfan, A. Zafar, M. Shaikh, N. Akhtar, J. Wu, S. Mirjalili, A survey on large language models: Applications, challenges, limitations, and practical usage, 2023, TechRxiv.

[330]

X. Wu, R. Duan, J. Ni, Unveiling security, privacy, and ethical concerns of ChatGPT, 2023.

[331]

S.R. Bowman, Eight things to know about large language models, 2023, arXiv preprint arXiv:2304.00612.

[332]

W. Zhao, Y. Liu, Y. Wan, Y. Wang, Q. Wu, Z. Deng, J. Du, S. Liu, Y. Xu, P.S. Yu, kNN-ICL: Compositional task-oriented parsing generalization with nearest neighbor in-context learning, 2023.

[333]

A. Fan, B. Gokkaya, M. Harman, M. Lyubarskiy, S. Sengupta, S. Yoo, J.M. Zhang, Large language models for software engineering: Survey and open problems, 2023.

[334]

X. Hou, Y. Zhao, Y. Liu, Z. Yang, K. Wang, L. Li, X. Luo, D. Lo, J. Grundy, H. Wang, Large language models for software engineering: A systematic literature review, 2023, arXiv preprint arXiv:2308.10620.

[335]

J. Clusmann, F.R. Kolbinger, H.S. Muti, Z.I. Carrero, J.-N. Eckardt, N.G. Laleh, C.M.L. Löffler, S.-C. Schwarzkopf, M. Unger, G.P. Veldhuizen, et al., The future landscape of large language models in medicine, Commun. Med. 3 (1) (2023) 141.

[336]

P. Caven, A more insecure ecosystem? ChatGPT's influence on cybersecurity, April 30, 2023.

[337]

M. Al-Hawawreh, A. Aljuhani, Y. Jararweh, ChatGPT for cybersecurity: practical applications, challenges, and future directions, Cluster Comput. 26 (6) (2023) 3421-3436.

[338]

J. Marshall, What effects do large language models have on cybersecurity, 2023.

[339]

P. Dhoni, R. Kumar, Synergizing generative AI and cybersecurity: Roles of generative AI entities, companies, agencies, and government in enhancing cybersecurity, 2023, TechRxiv.

[340]

M. Gupta, C. Akiri, K. Aryal, E. Parker, L. Praharaj, From ChatGPT to ThreatGPT: Impact of generative AI in cybersecurity and privacy, IEEE Access (2023).

[341]

E. Shayegani, M.A.A. Mamun, Y. Fu, P. Zaree, Y. Dong, N. Abu-Ghazaleh, Survey of vulnerabilities in large language models revealed by adversarial attacks, 2023.

[342]

B. Dash, P. Sharma, Are ChatGPT and deepfake algorithms endangering the cybersecurity industry? A review, Int. J. Eng. Appl. Sci. 10 (1) (2023).

[343]

E. Derner, K. Batistič, Beyond the safeguards: Exploring the security risks of ChatGPT, 2023.

[344]

K. Renaud, M. Warkentin, G. Westerman, From ChatGPT to Hack-GPT: Meeting the Cybersecurity Threat of Generative AI, MIT Sloan Management Review, 2023.

[345]

L. Schwinn, D. Dobre, S. Günnemann, G. Gidel, Adversarial attacks and defenses in large language models: Old and new threats, 2023.

[346]

G. Sebastian, Do ChatGPT and other AI chatbots pose a cybersecurity risk?: An exploratory study, Int. J. Secur. Privacy Pervasive Comput. (IJSPPC) 15 (1) (2023) 1-11.

[347]

M. Alawida, B.A. Shawar, O.I. Abiodun, A. Mehmood, A.E. Omolara, et al., Unveiling the dark side of ChatGPT: Exploring cyberattacks and enhancing user awareness, 2023, Preprints.

[348]

A. Qammar, H. Wang, J. Ding, A. Naouri, M. Daneshmand, H. Ning, Chatbots to ChatGPT in a cybersecurity space: Evolution, vulnerabilities, attacks, challenges, and future recommendations, 2023.

[349]

M. Mozes, X. He, B. Kleinberg, L.D. Griffin, Use of LLMs for illicit purposes: Threats, prevention measures, and vulnerabilities, 2023.

[350]

C. Dwork, Differential privacy, in: International Colloquium on Automata, Languages, and Programming, Springer, 2006, pp. 1-12.

[351]

C. Zhang, Y. Xie, H. Bai, B. Yu, W. Li, Y. Gao, A survey on federated learning, Knowl.-Based Syst. 216 (2021) 106775.

[352]

A. Pfitzmann, M. Hansen, A terminology for talking about privacy by data minimization: Anonymity, unlinkability, undetectability, unobservability, pseudonymity, and identity management, 2010, Dresden, Germany.

[353]

V. Smith, A.S. Shamsabadi, C. Ashurst, A. Weller, Identifying and mitigating privacy risks stemming from language models: A survey, 2023.
