Bleeding Llama in Ollama
Defending Local LLM Runtimes from Unauthenticated Process-Memory Disclosure
Cyera Research has reported an unauthenticated process-memory disclosure vulnerability in Ollama, one of the most widely deployed local LLM runtimes. An attacker with network access to an exposed Ollama instance can upload a crafted GGUF model file that triggers a heap out-of-bounds read, leaking process memory that was never meant to leave the host, including user prompts, system prompts, API keys, and environment variables. With roughly 170,000 GitHub stars, more than 100 million Docker Hub downloads, and an estimated 300,000 servers globally, the number of potentially exposed deployments is large.
"Local LLM runtimes can expose high-value secrets even when the model itself is not compromised."
— Cyera Research
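The summary above does not describe the exact parsing flaw, so the following is only a minimal sketch of the general bug class: a parser that trusts counts declared in a file header can be driven to read past the end of its buffer. It assumes the fixed GGUF v2/v3 header layout from the public GGUF specification (magic, version, tensor count, metadata key-value count). The function name `check_gguf_header` and the constant `MIN_ENTRY_BYTES` are hypothetical, and this is not Ollama's code or its fix.

```python
import struct

# Assumed GGUF v2/v3 fixed header, little-endian, 24 bytes:
# magic (4 bytes), version (uint32), tensor_count (uint64), metadata_kv_count (uint64).
HEADER = struct.Struct("<4sIQQ")
GGUF_MAGIC = b"GGUF"
SUPPORTED_VERSIONS = (2, 3)

# Hypothetical, deliberately loose lower bound: every declared entry
# must occupy at least one byte of the file body.
MIN_ENTRY_BYTES = 1


def check_gguf_header(data: bytes):
    """Return (ok, reason) after comparing declared counts with the real file size."""
    if len(data) < HEADER.size:
        return False, "truncated header"
    magic, version, tensor_count, kv_count = HEADER.unpack_from(data, 0)
    if magic != GGUF_MAGIC:
        return False, "bad magic"
    if version not in SUPPORTED_VERSIONS:
        return False, "unsupported version"
    # Reject files whose header promises more entries than the body could hold,
    # before any loop uses those counts to index into memory.
    body_len = len(data) - HEADER.size
    if (tensor_count + kv_count) * MIN_ENTRY_BYTES > body_len:
        return False, "declared counts exceed file size"
    return True, "ok"
```

A header that claims 2**40 tensors but carries no body is rejected before any entry is parsed. A real parser would apply the same kind of check to every length and offset field it reads, not just the two counts in the header.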