ChromaDB ChromaToast: Remote Code Execution Before Authentication

ChromaDB ChromaToast hands you RCE before it asks who you are

HiddenLayer published the ChromaToast research today, and the structural shape of the bug is the cleanest illustration this year of why "auth-after-init" is hostile for AI infrastructure.

CVE-2026-45829 sits in the Python FastAPI server packaged with ChromaDB, the widely deployed open-source vector database that backs retrieval-augmented generation pipelines, agent memory stores, and embedding search for organisations including Mintlify, Factory AI, and Weights and Biases. The package pulls roughly 14 million monthly downloads on PyPI. The bug affects every version from 1.0.0 through 1.5.8 and was unpatched at the time of disclosure.

The bug, one paragraph

The ChromaDB FastAPI server instantiates user-controlled embedding function configuration before the authentication middleware fires. An attacker hits the HTTP API unauthenticated and supplies a malicious HuggingFace model reference, and the model loader becomes the remote code execution primitive. The loader trusts whatever model identifier the client sends, fetches it, and instantiates it, so by the time control returns to the server process, attacker-controlled code is already running inside it.
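The ordering problem can be illustrated with a toy request handler. This is a minimal sketch of the pattern, not ChromaDB's actual code: `load_embedding_model`, `authenticate`, `VALID_TOKENS`, and both handlers are hypothetical names, and the loader is a stand-in that only records what it was asked to load.

```python
# Records every model reference the stand-in loader was asked to instantiate.
loaded_models = []

VALID_TOKENS = {"secret-token"}  # hypothetical credential store


class Unauthorized(Exception):
    """Raised when the caller presents no valid credential."""


def load_embedding_model(model_ref):
    # Stand-in for a real model loader. In a real system this step may
    # fetch and run code chosen by whoever supplied model_ref.
    loaded_models.append(model_ref)
    return "model:" + model_ref


def authenticate(token):
    if token not in VALID_TOKENS:
        raise Unauthorized("missing or invalid token")


def handle_request_vulnerable(token, config):
    # Auth-after-init: the loader runs on caller-supplied config first,
    # so the rejection below arrives too late to matter.
    model = load_embedding_model(config["model"])
    authenticate(token)
    return model


def handle_request_fixed(token, config):
    # Authenticate first; the loader never sees unauthenticated input.
    authenticate(token)
    return load_embedding_model(config["model"])
```

In the vulnerable handler, an unauthenticated request still ends with an `Unauthorized` error, but only after the loader has already run on the attacker's model reference. In the fixed handler the same request is rejected with nothing loaded.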

Why this hurts more than a normal CVE

Three things compound here.

First, the package surface is enormous. ChromaDB sees roughly 14 million monthly PyPI downloads, and roughly 73% of internet-accessible deployments are exposed, according to BleepingComputer's read of the data. The bug was reported to ChromaDB on February 17 and was still unpatched at the moment HiddenLayer published.

Second, ChromaDB sits in a part of the stack that routinely holds the keys to everything else. RAG pipelines need API keys for embedding models, vector database credentials, AWS or Azure secrets for object storage, and increasingly, MCP server tokens. The blast radius of a ChromaDB process compromise is whatever the AI runtime has access to, which is usually everything.

Third, the architectural pattern is replicable. Any AI vector store, embedding service, or model gateway that takes user-controlled model references as input and instantiates them before authenticating the caller has the same bug. ChromaToast is the disclosure that names the pattern, not the only place it lives.

The defender posture

Three actions, in order of speed.

Switch to the Rust frontend, which is not affected. If your deployment can move, this is the fastest closure.

If you cannot move, restrict the API port to the internal network only. The exposure is HTTP-based; if the API is not reachable from the public internet, an external attacker cannot reach the unauthenticated path.
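One quick sanity check when restricting the port is whether the address the server is configured to listen on is a loopback or private address at all. The following is a minimal sketch using Python's standard `ipaddress` module; the helper name `is_internal_bind` is hypothetical, and the check complements firewall or security group rules rather than replacing them.

```python
import ipaddress


def is_internal_bind(host):
    """Return True only if host is a loopback or private IP literal."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames are not resolved here, so they are conservatively
        # treated as "not proven internal".
        return False
    if addr.is_unspecified:
        # 0.0.0.0 and :: mean "listen on every interface".
        return False
    return addr.is_loopback or addr.is_private
```

A listen address such as `0.0.0.0` fails this check because it accepts connections on every interface, including any public one the host happens to have.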

If you cannot do either, front the ChromaDB server with an authenticating reverse proxy that requires a valid session before the request hits FastAPI. This is the messiest option but it works as a stopgap while a patch lands.
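The shape of such a proxy can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not a production proxy: the upstream address `UPSTREAM`, the shared secret `PROXY_TOKEN`, and the use of a bearer token in place of a real session check are all hypothetical, and the sketch omits TLS, streaming, and most header forwarding.

```python
import hmac
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM = "http://127.0.0.1:8000"  # assumed internal ChromaDB address
PROXY_TOKEN = "change-me"           # hypothetical secret; load from a secret store


def is_authorized(header_value):
    """Check an Authorization header against the expected bearer token."""
    if not header_value or not header_value.startswith("Bearer "):
        return False
    presented = header_value[len("Bearer "):]
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented.encode(), PROXY_TOKEN.encode())


class AuthProxy(BaseHTTPRequestHandler):
    def _handle(self):
        # Reject before reading the body or contacting the upstream server.
        if not is_authorized(self.headers.get("Authorization")):
            self._reply(401, b"")
            return
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length) if length else None
        req = urllib.request.Request(
            UPSTREAM + self.path, data=body, method=self.command
        )
        content_type = self.headers.get("Content-Type")
        if content_type:
            req.add_header("Content-Type", content_type)
        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                self._reply(resp.status, resp.read(), resp.headers.get("Content-Type"))
        except urllib.error.HTTPError as err:
            # Upstream answered with an error status; pass it through.
            self._reply(err.code, err.read(), err.headers.get("Content-Type"))
        except urllib.error.URLError:
            # Upstream unreachable.
            self._reply(502, b"")

    def _reply(self, status, payload, content_type=None):
        self.send_response(status)
        if content_type:
            self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    do_GET = do_POST = do_PUT = do_DELETE = _handle


def serve(port=8443):
    # Blocks forever; the upstream server should listen on loopback only.
    ThreadingHTTPServer(("0.0.0.0", port), AuthProxy).serve_forever()
```

The point of the sketch is the ordering inside `_handle`: the credential check happens before any byte of the request reaches the backend. In practice an established reverse proxy with an authentication module would usually fill this role.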

After the patch, audit. Rotate every API key, vector database credential, and downstream token that has touched a ChromaDB instance in the trailing 90 days. The bug has been known since at least February 17, when it was reported to ChromaDB, so assume the exposure window for any internet-accessible deployment runs at least that long.

The architectural lesson

Auth-after-init is the trap. Any service that does meaningful work (instantiating an object, loading a model, opening a connection) on data received from the network before authenticating the caller has built a trust boundary that an attacker can climb. The Microsoft Defender CVE-2026-41091 disclosure that landed the same week is the same lesson at a different layer: the antimalware engine has broad file system trust, and the engine itself becomes the elevation primitive when an attacker can reach the trusted code path before the access check.

For AI infrastructure specifically, the model loader is the most dangerous piece of the request pipeline because it is the most expressive. Anything that takes a model reference can in principle fetch, deserialize, and execute arbitrary code from the configured registry. Gate the loader behind authentication. Not the other way around.
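One way to make that gate structural rather than a matter of call ordering is to have the loader demand proof of authentication as an argument. The sketch below assumes a hypothetical `AuthenticatedCaller` type, token map, and `load_model` function; none of these names come from ChromaDB.

```python
from dataclasses import dataclass

VALID_TOKENS = {"secret-token": "alice"}  # hypothetical token -> principal map


class Unauthorized(Exception):
    """Raised when authentication is missing or invalid."""


@dataclass(frozen=True)
class AuthenticatedCaller:
    """Proof object that only authenticate() is meant to create."""
    principal: str


def authenticate(token):
    principal = VALID_TOKENS.get(token)
    if principal is None:
        raise Unauthorized("missing or invalid token")
    return AuthenticatedCaller(principal)


def load_model(caller, model_ref):
    # The loader refuses to run without evidence that authentication happened.
    if not isinstance(caller, AuthenticatedCaller):
        raise Unauthorized("loader called without an authenticated caller")
    # Stand-in for the real fetch-and-instantiate step.
    return "loaded " + model_ref + " for " + caller.principal
```

Because `load_model` cannot be called usefully without the object that `authenticate` returns, a handler that tries to load first fails immediately. Python does not stop other code from constructing `AuthenticatedCaller` directly, so this is a design convention that makes the mistake hard to make by accident, not an in-process security boundary.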

What to watch for next

Vendor analyses will compound. Hadrian has the second independent writeup already. Expect Snyk, Wiz, and Aikido to land their own technical reads within the week, and expect the first in-the-wild exploitation campaign to show up shortly after a patch publishes. The window between patch publication and a working exploit is now measured in hours for any disclosure carried by a PyPI package with 14 million monthly downloads.

If you operate a RAG stack, agentic workflow, or vector search backend in a regulated sector, this is a same-day defender drill, not a Patch Tuesday item.

Gigia Tsiklauri is a Security Architect and founder of Infosec.ge. Get in touch if your AI infrastructure procurement process needs a second opinion on architectural trust boundaries.