High severity · Data Leakage · OWASP LLM06 (2023) / LLM02 (2025)

Data Leakage

Unintended exposure of sensitive information, training data, or system prompts through LLM outputs. Ranked LLM02 (as "Sensitive Information Disclosure") in the OWASP Top 10 for LLM Applications 2025, up from LLM06 in the 2023 edition, reflecting its growing real-world impact.

Overview

Data leakage in LLM applications occurs when the model inadvertently reveals sensitive information it shouldn't share: PII memorized from training data, confidential business information included in context, or details about the system prompt and configuration. The OWASP Top 10 for LLM Applications 2025 elevated Sensitive Information Disclosure from #6 to #2, making it the second most critical LLM vulnerability, a rise driven by rapid adoption of LLMs in organizational workflows without adequate risk assessments. NIST AI 600-1 likewise identifies data privacy impacts from 'leakage and unauthorized use, disclosure, or de-anonymization of PII or sensitive data' as a core generative AI risk.

The threat is not theoretical: research by Carlini et al. (2021), published at USENIX Security, demonstrated that adversaries can extract hundreds of verbatim text sequences from GPT-2's training data, including names, phone numbers, and email addresses. Data leakage is particularly concerning because it can happen subtly: the model might paraphrase or partially reveal sensitive data without obviously violating any explicit rule.

How This Attack Works

  1. Sensitive data enters the LLM context through training, prompts, or retrieval

    What's happening

    RAG system retrieves internal doc: 'API keys: sk-prod-abc123, sk-dev-xyz789'

  2. User query prompts the model to reference this data

    Attacker

    User: 'What credentials do I need to access the API?'

  3. The model includes sensitive information in its response

    LLM Response

    LLM: 'Based on the documentation, you can use API key sk-prod-abc123 to...'

  4. Without output filtering, this data reaches the end user

    What's happening

    Sensitive API key exposed to potentially unauthorized user
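The missing control in step 4 can be sketched as an output filter that scans responses for secret-shaped strings before they reach the user. This is a minimal illustration, not a complete rule set: the patterns below (an `sk-` key prefix, a US SSN format, inline passwords) are assumptions mirroring the examples on this page.

```python
import re

# Hypothetical patterns mirroring the examples on this page; a real
# deployment would use a broader, provider-specific rule set.
SECRET_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9-]{8,}\b"),    # API keys like sk-prod-abc123
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # US SSN format
    re.compile(r"(?i)password\s*[:=]\s*\S+"),  # inline "password: ..." values
]

def filter_output(response: str) -> str:
    """Redact known secret patterns from an LLM response before returning it."""
    for pattern in SECRET_PATTERNS:
        response = pattern.sub("[REDACTED]", response)
    return response

leaked = "Based on the documentation, you can use API key sk-prod-abc123 to authenticate."
print(filter_output(leaked))
# Based on the documentation, you can use API key [REDACTED] to authenticate.
```

Regex-based filtering only catches secrets with a recognizable shape; it does not address the paraphrased leaks described under Indirect Disclosure below.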

Attack Examples

Training Data Exposure

Model revealing memorized training data

Example Attack Pattern (Sanitized)
Here's an example email format: 'Dear John Smith, your SSN 123-45-6789 has been...'

Context Window Leakage

Model revealing information from its context

Example Attack Pattern (Sanitized)
Based on the internal document you shared, the API key is sk-...

Indirect Disclosure

Paraphrasing or hinting at sensitive information

Example Attack Pattern (Sanitized)
I can confirm that the company's revenue last quarter was approximately...
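One cheap guard against the context-window case is to check whether a response reproduces any verbatim run of words from the private context. A minimal sketch, with the caveat that the 5-word window is an arbitrary choice and that paraphrased indirect disclosure will slip past any purely verbatim check:

```python
def context_overlap(context: str, response: str, n: int = 5) -> bool:
    """Flag responses that reproduce any n-word run from the private context.

    Crude by design: catches verbatim leakage only, not paraphrase.
    """
    words = context.lower().split()
    ngrams = {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
    return any(g in response.lower() for g in ngrams)

# Hypothetical private context and candidate responses:
private_context = "confidential: the admin password is Sup3rS3cr3t! rotate monthly"
print(context_overlap(private_context, "Sure, the admin password is Sup3rS3cr3t! as noted"))  # True
print(context_overlap(private_context, "I cannot share internal credentials."))               # False
```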

Protect Your Application

Try Detection in Playground

Sample Data Leakage Input

The database credentials are username: admin, password: Sup3rS3cr3t! and the API key is sk-live-abc123xyz789

Prevention Checklist

Build
  • Implement strict data classification and access controls
  • Use differential privacy techniques in training
Deploy
  • Minimize sensitive data in context windows
  • Filter and redact model outputs before they reach end users
Monitor
  • Audit LLM outputs regularly for inadvertent data exposure
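The "minimize sensitive data in context windows" item can be enforced at retrieval time by gating documents on a classification label before they are assembled into a RAG prompt. A sketch under assumed labels; the `Doc` type and the `public`/`internal`/`restricted` scheme are hypothetical stand-ins for whatever classification your organization uses:

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    classification: str  # hypothetical labels: "public", "internal", "restricted"

# Hypothetical policy: only these classifications may enter the LLM context.
ALLOWED_IN_CONTEXT = {"public", "internal"}

def build_context(retrieved: list) -> str:
    """Drop restricted documents before they ever reach the model's context window."""
    safe = [d.text for d in retrieved if d.classification in ALLOWED_IN_CONTEXT]
    return "\n\n".join(safe)

docs = [
    Doc("Product FAQ: reset your password from the login page.", "public"),
    Doc("API keys: sk-prod-abc123, sk-dev-xyz789", "restricted"),
]
print(build_context(docs))  # the restricted credential doc is excluded
```

Filtering at retrieval time stops the leak one step earlier than output filtering: data that never enters the context window cannot be echoed back.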

Detect with Wardstone API

curl -X POST "https://wardstone.ai/api/detect" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Your text to analyze"}'

# Response
{
  "flagged": false,
  "risk_bands": {
    "content_violation": { "level": "Low Risk" },
    "prompt_attack": { "level": "Low Risk" },
    "data_leakage": { "level": "Low Risk" },
    "unknown_links": { "level": "Low Risk" }
  },
  "primary_category": null
}
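The same call can be made programmatically. A minimal sketch: the endpoint, headers, and response fields are taken from the curl example above, while the `is_data_leak` policy (treat a flagged result, or anything above "Low Risk" on the `data_leakage` band, as a leak) is an assumption, not Wardstone's documented semantics:

```python
import json
import urllib.request

def detect(text: str, api_key: str) -> dict:
    """POST to the /api/detect endpoint shown above and return the parsed JSON."""
    req = urllib.request.Request(
        "https://wardstone.ai/api/detect",
        data=json.dumps({"text": text}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def is_data_leak(result: dict) -> bool:
    """Assumed policy: flagged overall, or data_leakage band above 'Low Risk'."""
    band = result.get("risk_bands", {}).get("data_leakage", {})
    return result.get("flagged", False) or band.get("level") != "Low Risk"
```

In a typical integration, `is_data_leak(detect(llm_response, key))` would gate the response before it is returned to the end user.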

Protect against Data Leakage

Try Wardstone Guard in the playground to see detection in action.