Security Threat Library

LLM Security Threats

A comprehensive encyclopedia of attack vectors targeting LLM applications. Understand how these threats work, see real examples, and learn prevention strategies.

15
Documented Threats
3
Threat Categories
15
OWASP LLM References

Prompt Attacks

Attacks that manipulate LLM behavior through crafted inputs

CriticalPrompt Attack

Prompt Injection

An attack where malicious instructions are embedded in user input to manipulate LLM behavior and bypass safety controls. Ranked as LLM01 in the OWASP Top 10 for LLM Applications 2025 and cataloged by MITRE ATLAS as technique AML.T0051.

OWASP Reference: LLM01
Learn more
CriticalPrompt Attack

Jailbreak Attacks

Sophisticated prompts designed to bypass LLM safety guidelines and content policies to elicit harmful or restricted outputs. Classified under OWASP LLM01:2025 (Prompt Injection) and MITRE ATLAS technique AML.T0054 (LLM Jailbreak).

OWASP Reference: LLM01
Learn more
CriticalPrompt Attack

Indirect Prompt Injection

Attacks where malicious instructions are hidden in external data sources that the LLM processes, rather than in direct user input. Cataloged by MITRE ATLAS as sub-technique AML.T0051.001 (LLM Prompt Injection: Indirect) and covered under OWASP LLM01:2025.

OWASP Reference: LLM01
Learn more
HighPrompt Attack

Adversarial Prompts

Carefully crafted inputs designed to exploit model weaknesses, cause unexpected behaviors, or probe for vulnerabilities. Related to OWASP LLM01:2025 (Prompt Injection) and documented across multiple MITRE ATLAS techniques.

OWASP Reference: LLM01
Learn more
HighPrompt Attack

System Prompt Extraction

Techniques used to reveal the hidden system prompt, instructions, or configuration that defines an LLM application's behavior. Introduced as a standalone category in OWASP LLM07:2025 (System Prompt Leakage), new to the 2025 edition.

OWASP Reference: LLM07
Learn more
MediumPrompt Attack

Model Extraction

Attacks designed to steal or replicate an LLM's capabilities, weights, or behavior through systematic querying. Documented across multiple MITRE ATLAS techniques and addressed by OWASP LLM10:2025 (Unbounded Consumption).

OWASP Reference: LLM10
Learn more
MediumPrompt Attack

Prompt Leaking

The unintended disclosure of conversation context, previous prompts, or multi-turn conversation history. Related to OWASP LLM02:2025 (Sensitive Information Disclosure) and LLM07:2025 (System Prompt Leakage).

OWASP Reference: LLM06
Learn more
MediumPrompt Attack

Context Manipulation

Attacks that exploit or corrupt the LLM's context window to alter behavior or access unauthorized information. Falls under OWASP LLM01:2025 (Prompt Injection) as a variant of direct prompt manipulation.

OWASP Reference: LLM01
Learn more
LowPrompt Attack

Denial of Service (LLM)

Attacks designed to exhaust LLM resources, cause excessive costs, or make the service unavailable. Classified as LLM10:2025 (Unbounded Consumption) in the OWASP Top 10 for LLM Applications, renamed from 'Model Denial of Service' in the 2023 edition.

OWASP Reference: LLM04
Learn more

Data Leakage

Exposure of sensitive information through LLM outputs

Content Violations

Attempts to generate harmful, toxic, or policy-violating content

Ready to protect your AI application?

Wardstone Guard detects all these threats in a single API call with Sub-30ms latency.