TY - GEN
T1 - You Can't Steal Nothing
T2 - 32nd ACM SIGSAC Conference on Computer and Communications Security, CCS 2025
AU - Cao, Bochuan
AU - Li, Changjiang
AU - Cao, Yuanpu
AU - Ge, Yameng
AU - Wang, Ting
AU - Chen, Jinghui
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s).
PY - 2025/11/22
Y1 - 2025/11/22
N2 - Large language models (LLMs) have been widely adopted across various applications, leveraging customized system prompts for diverse tasks. Facing potential system prompt leakage risks, model developers have implemented strategies to prevent leakage, primarily by disabling LLMs from repeating their context when encountering known attack patterns. However, it remains vulnerable to new and unforeseen prompt-leaking techniques. In this paper, we first introduce a simple yet effective prompt leaking attack to reveal such risks. Our attack is capable of extracting system prompts from various LLM-based application, even from SOTA LLM models such as GPT-4o or Claude 3.5 Sonnet. Our findings further inspire us to search for a fundamental solution to the problems by having no system prompt in the context. To this end, we propose SysVec, a novel method that encodes system prompts as internal representation vectors rather than raw text. By doing so, SysVec minimizes the risk of unauthorized disclosure while preserving the LLM's core language capabilities. Remarkably, this approach not only enhances security but also improves the model's general instruction-following abilities. Experimental results demonstrate that SysVec effectively mitigates prompt leakage attacks, preserves the LLM's functional integrity, and helps alleviate the forgetting issue in long-context scenarios.
AB - Large language models (LLMs) have been widely adopted across various applications, leveraging customized system prompts for diverse tasks. Facing potential system prompt leakage risks, model developers have implemented strategies to prevent leakage, primarily by disabling LLMs from repeating their context when encountering known attack patterns. However, it remains vulnerable to new and unforeseen prompt-leaking techniques. In this paper, we first introduce a simple yet effective prompt leaking attack to reveal such risks. Our attack is capable of extracting system prompts from various LLM-based application, even from SOTA LLM models such as GPT-4o or Claude 3.5 Sonnet. Our findings further inspire us to search for a fundamental solution to the problems by having no system prompt in the context. To this end, we propose SysVec, a novel method that encodes system prompts as internal representation vectors rather than raw text. By doing so, SysVec minimizes the risk of unauthorized disclosure while preserving the LLM's core language capabilities. Remarkably, this approach not only enhances security but also improves the model's general instruction-following abilities. Experimental results demonstrate that SysVec effectively mitigates prompt leakage attacks, preserves the LLM's functional integrity, and helps alleviate the forgetting issue in long-context scenarios.
KW - AI Security
KW - Large Language Models
KW - Prompt Leaking Attack
UR - https://www.scopus.com/pages/publications/105023831572
U2 - 10.1145/3719027.3765124
DO - 10.1145/3719027.3765124
M3 - Conference contribution
AN - SCOPUS:105023831572
T3 - CCS 2025 - Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security
SP - 4423
EP - 4437
BT - CCS 2025 - Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security
PB - Association for Computing Machinery, Inc
Y2 - 13 October 2025 through 17 October 2025
ER -