prompt-injection-types

Issue: #127 Date: 2026-01-30T19:09:52-08:00

Research Findings: Top 10 Prompt Injection Vulnerabilities

Summary

Prompt injection has emerged as the #1 critical vulnerability in the OWASP Top 10 for LLM Applications (2025), affecting over 73% of production AI systems. Unlike traditional software vulnerabilities, prompt injection exploits the fundamental design of language models where instructions and data are both processed as natural language text. This research identifies the most prevalent prompt injection vulnerability types and attack patterns observed in modern AI deployments.

Key Findings: Top 10 Prompt Injection Vulnerability Types

1. Direct Prompt Injection (Jailbreaking)

Attackers directly craft malicious prompts to override or reveal the underlying system prompt, bypassing security guidelines and gaining unauthorized access to backend systems. This is the most straightforward form of prompt injection and remains widely exploited.

2. Indirect Prompt Injection

Malicious prompts are embedded in external content sources (websites, emails, PDFs, documents) that the LLM processes later. When the model ingests this content, the hidden instructions hijack the conversation context. The EchoLeak attack on Microsoft 365 Copilot is a notable example of zero-click exploitation.

3. Data Exfiltration via Prompt Injection

Attackers craft prompts that manipulate the LLM into exfiltrating sensitive data by leveraging available functions (web browsing, email, file access). This includes character-by-character exfiltration through image request sequences and exploiting memory features for long-term data theft across multiple conversations.

4. Code Execution Attacks

Prompt injection enables arbitrary code execution, as demonstrated by the GitHub Copilot CVE-2025-53773 vulnerability (CVSS 9.6), where injected prompts instructed the IDE to modify configuration files and execute unintended commands.

5. System Prompt Extraction

Attackers use crafted prompts to extract or reveal the model’s underlying system instructions, enabling further attacks by understanding the model’s rules, constraints, and available capabilities.

6. Multi-Language and Encoding Evasion

Attackers use multiple languages, Unicode homoglyphs, Base64 encoding, emoji encoding, typos, and code-switching to bypass content filters and detection systems designed to prevent prompt injection.

7. Fake Completion Attacks

Attackers provide precompleted answers or context to the LLM (e.g., “continue this story…”) that explicitly ignore template instructions and force the model to follow the attacker’s narrative instead of intended guidelines.

8. Rapid Iteration and Variation Testing

Attackers systematically generate and test numerous prompt variations with minor modifications (random capitalization, character spacing, word shuffling) until one bypasses safety measures, exploiting the stochastic nature of LLM responses.

9. Supply Chain and Plugin-Based Injection

Prompt injection vulnerabilities in third-party AI chatbot plugins, integrations, and orchestration layers allow attackers to compromise LLM applications through compromised dependencies and integrated services.

Malicious prompts are embedded within images, audio, or video files that the LLM scans or processes. The model executes injected instructions hidden within multimedia content, creating attacks that are less visible to human users.

Vulnerability Statistics

Prevalence: Over 73% of production AI deployments assessed contain prompt injection vulnerabilities
Defensive Gap: 65.3% of organizations lack dedicated defenses against prompt injection
Security Testing: Less than 40% of organizations conduct regular security testing on AI models and agent workflows
Current Position: Ranks #1 on OWASP Top 10 for LLM Applications 2025 (and has held this position since the list’s inception)

Real-World Incidents

GitHub Copilot CVE-2025-53773 - Remote code execution via prompt injection in code comments
EchoLeak (Microsoft 365 Copilot) - Zero-click prompt injection allowing data exfiltration via crafted emails
Vanna AI - Remote code execution through SQL query generation manipulation
ChatGPT Memory Exploitation - Persistent prompt injection enabling long-term cross-conversation data theft
ChatGPT Windows License Key Exposure - Sensitive credentials exposed through prompt manipulation

Key Challenges & Recommendations

Why Prevention is Difficult:

Prompt injection exploits the fundamental design of LLMs (everything is processed as text)
The stochastic nature of LLM responses makes foolproof prevention unclear
OpenAI has stated prompt injection is “unlikely to ever be fully solved,” similar to social engineering

Mitigation Best Practices:

Privilege Restriction - Limit LLM access to the minimum necessary permissions and functions
Context Separation - Clearly denote and isolate untrusted content from user prompts
Human-in-the-Loop Controls - Implement approval workflows for sensitive operations
Input Validation - Validate and sanitize inputs from external sources before processing
Regular Security Testing - Conduct systematic testing of AI models against prompt injection attacks
Monitoring & Logging - Track unusual model behavior and maintain detailed audit logs
Defense-in-Depth - Combine multiple defense layers rather than relying on single mitigations