Garak Security Blog

Insights, research, and best practices for AI security, red teaming, and vulnerability assessment from the Garak team and community.

MCP
Model Context Protocol
AI Security
Prompt Injection
Code Injection
Enterprise Security
⭐ Featured
Bridging AI and Tools: How Attackers Exploit MCP (and How We Can Fight Back)
Garak Security Team
September 16, 2025
25 min read

Large Language Models that can browse the web, query databases, or send emails on our behalf – it sounds like science fiction, but with the Model Context Protocol (MCP), it's becoming reality. However, MCP isn't secure by default. Early implementations are riddled with vulnerabilities that could let bad actors turn helpful AI agents into dangerous conduits.

Google Gemini
Gmail Security
AI Vulnerability
Critical
Phishing
Enterprise Security
Google's Warning to 1.8 Billion Gmail Users: New AI Threat Wave Exposed
Garak AI Security Research
September 4, 2025
30 min read

Following Google's major warning to 1.8 billion Gmail users about a new wave of AI threats, Garak exposes critical indirect prompt injection vulnerabilities in Gmail's AI features. Testing found a 99.33% attack success rate in Google Gemini 2.5 Pro's handling of misleading information, with direct implications for enterprise AI security and email protection systems.

GPT-OSS-20B
RCE
Vulnerabilities
Security
Open Source
Critical
Remote Code Execution in GPT-OSS-20B: Critical Vulnerabilities Exposed
Garak Security Team
August 21, 2025
25 min read

Our comprehensive security research reveals critical template injection vulnerabilities in GPT-OSS-20B with a 100% RCE success rate. This detailed technical report covers 5 critical vulnerabilities, systematic red-team testing methodology, attack chain analysis, business impact assessment, and complete mitigation frameworks for security teams and developers.

GPT-5
Security
Vulnerabilities
Base64
Guardrails
GPT-5 Security Assessment: Stronger Than Expected, But Still Needs Guardrails
Garak Security Team
August 10, 2025
12 min read

We tested GPT-5 against 12 types of security attack. While it showed strong resistance to jailbreaks, a critical vulnerability in its handling of Base64-encoded input allows complete safety bypasses. Here's our full assessment and the guardrails you need.

AI Agents
Security
Cybersecurity
Enterprise
Risk Management
The Rise of AI Agent Attacks: Why Traditional Security Fails
Garak Security Team
July 25, 2025
12 min read

As AI agents become increasingly autonomous and integrated into critical business processes, a new class of security threats has emerged that traditional cybersecurity approaches are ill-equipped to handle. Learn why 73% of deployed AI agents contain exploitable vulnerabilities and how organizations can protect themselves.

Security
LLM
Red Teaming
Vulnerabilities
Meta
Llama Guard
Bypassing Llama Guard: How Garak Could Have Detected Meta's Firewall Vulnerabilities
Garak Security Team
July 15, 2025
15 min read

In May 2025, Trendyol's application security team made a concerning discovery: Meta's LlamaFirewall, a safeguard designed to protect large language models from prompt injection attacks, could be bypassed using several straightforward techniques. Learn how Garak's comprehensive testing framework could have caught these vulnerabilities proactively, before they became public issues.