Data Leakage
Unintended exposure of sensitive information, training data, or system prompts through LLM outputs.
PII Exposure

The unintended disclosure of Personally Identifiable Information (PII) such as names, addresses, Social Security numbers, payment card numbers, or other personal data through LLM interactions.

PII exposure is a specific and critical form of data leakage. It is particularly serious because privacy regulations such as GDPR, HIPAA, and CCPA impose strict handling requirements and steep penalties. LLMs can expose PII from their training data, from documents in their context, or by generating realistic-looking PII that happens to match real individuals.
How the attack works

1. PII enters the system. Personal data exists in training data, context documents, or user inputs. For example, a customer database in context: "John Smith, SSN: 123-45-6789, Card: 4532-XXXX-XXXX-1234". The LLM processes a query that could involve this PII.

2. The attacker requests it. User: "Show me the customer details for order #12345"

3. The model leaks it. Without proper safeguards, the model includes the PII in its response. LLM: "Order #12345 belongs to John Smith (SSN: 123-45-6789)..." This is a GDPR/HIPAA violation: personal data is disclosed to unauthorized parties, exposing the affected individuals to harm.
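The sketch below shows where an output-side guard would sit in this flow: the model's raw answer is screened before it reaches the user. This is a minimal illustration, and `call_llm` and `contains_pii` are hypothetical stand-ins, not part of any real API.

```python
# A minimal sketch of an output-side guard, assuming hypothetical helpers:
# call_llm stands in for the real model client, and contains_pii for a real
# detector (regex rules, an NER model, or a detection API).

def call_llm(query: str, context: str) -> str:
    # Stand-in for the model call; returns the leaky answer from step 3.
    return "Order #12345 belongs to John Smith (SSN: 123-45-6789)..."

def contains_pii(text: str) -> bool:
    # Stand-in for a real detector; here, a trivially simple check.
    return "SSN" in text

def answer(query: str, context: str) -> str:
    raw = call_llm(query, context)
    if contains_pii(raw):
        # Fail closed: never forward a response the detector has flagged.
        return "That record contains personal data and cannot be shown."
    return raw

context = "John Smith, SSN: 123-45-6789, Card: 4532-XXXX-XXXX-1234"
print(answer("Show me the customer details for order #12345", context))
# -> "That record contains personal data and cannot be shown."
```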
Signs of PII exposure include:

- Social Security numbers in outputs: "The customer's SSN on file is 123-45-6789."
- Payment card information in responses: "Your saved card ending in 4532-1234-5678-9012 will be charged."
- Personal contact details disclosed: "John lives at 123 Main St, Anytown USA. His phone is 555-123-4567."
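Because these signals are pattern-shaped, a rough first-pass scanner can be written with regular expressions. The sketch below is illustrative only: production PII detection needs checksum validation, NER, and context awareness that simple patterns cannot provide.

```python
import re

# Illustrative patterns for the signals listed above; a sketch only. Real
# PII detection needs checksum validation (e.g. Luhn for cards), NER, and
# context that simple regexes cannot provide.
SIGNALS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "payment_card": re.compile(r"\b(?:\d{4}[- ]){3}\d{4}\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "street_address": re.compile(
        r"\b\d+\s+[A-Z][a-z]+\s+(?:Street|St|Avenue|Ave|Road|Rd)\b"
    ),
}

def scan_output(text: str) -> dict[str, list[str]]:
    """Return every PII-like match in a model response, keyed by type."""
    hits = {name: pattern.findall(text) for name, pattern in SIGNALS.items()}
    return {name: matches for name, matches in hits.items() if matches}

print(scan_output("John lives at 123 Main St, Anytown USA. "
                  "His phone is 555-123-4567."))
# -> {'phone': ['555-123-4567'], 'street_address': ['123 Main St']}
```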
Sample PII Exposure Input

Customer John Smith, SSN 123-45-6789, credit card 4532-1234-5678-9012, lives at 123 Main Street, Anytown CA 90210

```
curl -X POST "https://api.wardstone.ai/v1/detect" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Your text to analyze"}'

# Response
{
  "prompt_attack": { "detected": false, "confidence": 0.02 },
  "content_violation": { "detected": false, "confidence": 0.01 },
  "data_leakage": { "detected": false, "confidence": 0.00 },
  "unknown_links": { "detected": false, "confidence": 0.00 }
}
```
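Based only on the endpoint, headers, request body, and response fields shown in the curl example above, a minimal Python client that gates on the `data_leakage` field might look like this. The `check_for_leakage` helper and the `WARDSTONE_API_KEY` environment variable are illustrative names, not part of a published SDK.

```python
import os
import requests

# Minimal client sketch mirroring the curl example above: same endpoint,
# auth header, request body, and response fields. check_for_leakage and
# WARDSTONE_API_KEY are illustrative names. requests sets the JSON
# Content-Type header automatically via the json= argument.
API_URL = "https://api.wardstone.ai/v1/detect"

def check_for_leakage(text: str) -> bool:
    """Return True if the detector flags data leakage in `text`."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {os.environ['WARDSTONE_API_KEY']}"},
        json={"text": text},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["data_leakage"]["detected"]

sample = (
    "Customer John Smith, SSN 123-45-6789, credit card 4532-1234-5678-9012, "
    "lives at 123 Main Street, Anytown CA 90210"
)
if check_for_leakage(sample):
    print("Redact or block the response before returning it to the user.")
```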
A related risk is training data extraction: attacks that cause LLMs to reveal memorized training data, potentially including private or copyrighted content.
Try Wardstone Guard in the playground to see detection in action.