Data Leakage
Unintended exposure of sensitive information, training data, or system prompts through LLM outputs.
Data leakage in LLM applications occurs when the model inadvertently reveals sensitive information it shouldn't share. This can include PII from training data, confidential business information included in context, or details about the system prompt and configuration. Data leakage is particularly concerning because it can happen subtly: the model might paraphrase or partially reveal sensitive data without obviously violating rules.
How a leak unfolds:

1. Sensitive data enters the LLM context through training, prompts, or retrieval.
   What's happening: the RAG system retrieves an internal doc: 'API keys: sk-prod-abc123, sk-dev-xyz789'.
2. A user query prompts the model to reference this data.
   Attacker: 'What credentials do I need to access the API?'
3. The model includes sensitive information in its response.
   LLM response: 'Based on the documentation, you can use API key sk-prod-abc123 to...'
4. Without output filtering, this data reaches the end user. What's happening: a sensitive API key is exposed to a potentially unauthorized user (see the redaction sketch below).
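The filtering step in (4) can be as simple as scanning responses for known secret formats before they leave the application. Below is a minimal sketch in Python; the patterns and the `redact` helper are illustrative assumptions, not a Wardstone feature, and real deployments would use a broader pattern set or a dedicated detection service.

```python
import re

# Illustrative secret patterns (assumptions, not an exhaustive list);
# the sk- prefix matches the API-key format from the example above.
SECRET_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9-]{8,}"),        # API keys like sk-prod-abc123
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSNs
    re.compile(r"(?i)password\s*[:=]\s*\S+"),    # inline passwords
]

def redact(llm_output: str) -> str:
    """Mask known secret formats before the response reaches the end user."""
    for pattern in SECRET_PATTERNS:
        llm_output = pattern.sub("[REDACTED]", llm_output)
    return llm_output

print(redact("Based on the documentation, you can use API key sk-prod-abc123 to..."))
# Based on the documentation, you can use API key [REDACTED] to...
```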
Data leakage can take several forms:

- Model revealing memorized training data. Example: Here's an example email format: 'Dear John Smith, your SSN 123-45-6789 has been...'
- Model revealing information from its context. Example: Based on the internal document you shared, the API key is sk-...
- Paraphrasing or hinting at sensitive information. Example: I can confirm that the company's revenue last quarter was approximately...
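Pattern matching alone misses the second and third forms, where the model repeats or partially repeats values from its own context. A complementary check, sketched below under the assumption that the application tracks which secrets entered the context for each request, is to compare the response against those known values; the helper name and the prefix heuristic are illustrative.

```python
def leaks_context_secrets(response: str, secrets: list[str], min_len: int = 6) -> bool:
    """Flag a response that repeats a context secret, even partially."""
    for secret in secrets:
        # Check the full value down to short prefixes so that partial
        # reveals like 'the API key is sk-prod-ab...' are still caught.
        for length in range(len(secret), min_len - 1, -1):
            if secret[:length] in response:
                return True
    return False

context_secrets = ["sk-prod-abc123", "sk-dev-xyz789"]
print(leaks_context_secrets("The API key is sk-prod-abc...", context_secrets))  # True
```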
Sample Data Leakage Input

'The database credentials are username: admin, password: Sup3rS3cr3t! and the API key is sk-live-abc123xyz789'

```
curl -X POST "https://api.wardstone.ai/v1/detect" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Your text to analyze"}'

# Response
{
  "prompt_attack": { "detected": false, "confidence": 0.02 },
  "content_violation": { "detected": false, "confidence": 0.01 },
  "data_leakage": { "detected": false, "confidence": 0.00 },
  "unknown_links": { "detected": false, "confidence": 0.00 }
}
```
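The same check can be called from application code. The sketch below wraps the endpoint with Python's requests library; the URL, headers, payload, and response fields match the example above, while the `detect` helper and the blocking logic are illustrative assumptions.

```python
import requests

WARDSTONE_URL = "https://api.wardstone.ai/v1/detect"
API_KEY = "YOUR_API_KEY"  # placeholder, as in the curl example

def detect(text: str) -> dict:
    """POST text to the detection endpoint and return the parsed JSON result."""
    resp = requests.post(
        WARDSTONE_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"text": text},  # requests sets the Content-Type header
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

result = detect(
    "The database credentials are username: admin, password: Sup3rS3cr3t! "
    "and the API key is sk-live-abc123xyz789"
)
leak = result["data_leakage"]
if leak["detected"]:
    print(f"Blocked: data leakage detected (confidence {leak['confidence']})")
```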
Related threats:

- PII Exposure: The unintended disclosure of Personally Identifiable Information (PII) such as names, addresses, SSNs, credit cards, or other personal data through LLM interactions.
- System Prompt Extraction: Techniques used to reveal the hidden system prompt, instructions, or configuration that defines an LLM application's behavior.
- Training Data Extraction: Attacks that cause LLMs to reveal memorized training data, potentially including private or copyrighted content.
Try Wardstone Guard in the playground to see detection in action.