The Weather Report AI published a comprehensive CVE audit of 17 AI agent frameworks in April 2026. The headline made the rounds on Hacker News and in security Slack channels: 384 CVEs total, with OpenClaw accounting for 238 of them in under four months.
That number landed like a warning. 238 CVEs sounds catastrophic. But the analysis told a more nuanced story—one that most people reading the headline completely missed.
The goal of this post is to decode what CVE counts actually mean, what they don't tell you, and how to evaluate agent frameworks for production deployment using criteria that matter more than raw vulnerability volume.
The Raw Numbers (And What They Don't Tell You)
Here's the full picture from the Weather Report audit:
| Framework | CVEs | Critical | Days in Market |
|---|---|---|---|
| OpenClaw | 238 | 19 | ~120 |
| LangChain | 51 | 3 | ~1,095 |
| CrewAI | 4 | 0 | ~120 |
| Anthropic Claude SDK | 0 | 0 | ~250 |
| Google Gemini API | 0 | 0 | ~300 |
| OpenAI API SDK | 0 | 0 | ~280 |
| Microsoft Copilot SDK | 0 | 0 | ~180 |
| Other 10 frameworks | ~91 | 2 | — |
OpenClaw's 238 CVEs in 120 days is the fastest accumulation rate in the audit. That's undeniable. But before you use that number to rule OpenClaw out, sit with these facts:
LangChain has 51 CVEs over three years. That's 17 per year. OpenClaw is tracking at ~730 per year if it continues its current pace. But LangChain is also three years old, and most of its life predates modern agent-security research. The scrutiny OpenClaw faces today simply didn't exist when LangChain launched; if LangChain had been audited on day one with 2026-era security tooling, the numbers would look different.
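To put frameworks of different ages on equal footing, normalize CVE counts to an annualized rate. A quick sketch using the figures from the table above (the function itself is illustrative, not part of the audit):

```python
def annualized_cve_rate(cve_count: int, days_in_market: int) -> float:
    """Naively annualize a raw CVE count by the framework's time in market."""
    return cve_count / days_in_market * 365

# Figures from the Weather Report table above.
frameworks = {
    "OpenClaw": (238, 120),
    "LangChain": (51, 1095),
    "CrewAI": (4, 120),
}

for name, (cves, days) in frameworks.items():
    print(f"{name}: ~{annualized_cve_rate(cves, days):.0f} CVEs/year")
```

This reproduces the article's rates: LangChain lands at exactly 17 CVEs/year, and OpenClaw at roughly 720. The normalization is crude by design; it says nothing about researcher attention, which is the next point.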
The frontier labs (Anthropic, Google, OpenAI, Microsoft) report zero CVEs. That's not because their SDKs are perfect. It's because they don't operate a public CVE registry or advisory process. Google found a critical vulnerability in Gemini API after deployment and quietly patched it—no CVE assigned, no announcement made. The absence of CVEs doesn't mean the absence of risk.
Why Vulnerability Volume Is a Misleading Signal
CVE count tells you three things, none of them about security:
First: Researcher attention. OpenClaw had explosive growth in 2026. Growth attracts security researchers. Researchers run fuzzing campaigns, reverse-engineer architecture, publish findings. More researchers = more CVEs, regardless of the codebase's actual robustness. LangChain had security research too, but it was sparser. CrewAI launched in a less scrutinized moment.
Second: Disclosure policy and transparency. OpenClaw has a public security advisory policy. Researchers can report vulnerabilities; they get assigned CVE identifiers; they're published. That's good transparency. But it inflates the number. Other frameworks have the same vulnerabilities and choose not to publish them—they patch quietly and ship a new version. No CVE, no headline, same code risk.
Third: Vulnerability type distribution. Not all CVEs are equal. A remote code execution (RCE) vulnerability is orders of magnitude worse than a denial-of-service (DoS) or information disclosure bug. OpenClaw's 238 CVEs include bugs in optional features, edge cases in tool invocation, and configuration mistakes. Among the 19 critical severity ratings, the breakdown is:
- 7 sandbox escape vectors (high-risk)
- 5 memory poisoning / context confusion flaws (moderate-risk under correct config)
- 4 auth/credential handling bugs (moderate-risk with secrets management in place)
- 3 tool hallucination amplifications (low-risk with proper tool allowlists)
That's important context. A single sandbox escape CVE is worse than 50 tool hallucination CVEs because the attack surface is different.
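One way to operationalize "not all CVEs are equal" is to score frameworks by severity-weighted risk instead of raw count. The weights below are illustrative assumptions, not figures from the audit; the critical-CVE counts come from the list above:

```python
# Illustrative weights: a sandbox escape dwarfs a tool hallucination bug.
# These weights are an assumption for the sketch, not audit data.
SEVERITY_WEIGHTS = {
    "sandbox_escape": 10.0,
    "memory_poisoning": 4.0,
    "auth_credential": 4.0,
    "tool_hallucination": 1.0,
}

def weighted_risk(cve_counts: dict[str, int]) -> float:
    """Sum CVE counts weighted by how damaging each class is."""
    return sum(SEVERITY_WEIGHTS[cls] * n for cls, n in cve_counts.items())

# OpenClaw's 19 critical CVEs, broken down as in the list above.
openclaw_critical = {
    "sandbox_escape": 7,
    "memory_poisoning": 5,
    "auth_credential": 4,
    "tool_hallucination": 3,
}
score = weighted_risk(openclaw_critical)  # 7*10 + 5*4 + 4*4 + 3*1 = 109
```

Under this weighting, the 7 sandbox escapes contribute nearly two-thirds of the total score despite being a third of the critical CVEs. That's the point: distribution matters more than volume.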
What Actually Matters: Architectural Defenses
Here's the uncomfortable truth: choosing an agent framework purely on CVE count is like choosing a building by the number of fire code violations discovered during inspection. Some violations are critical. Others are trivial. And the absence of violations doesn't mean the absence of fire risk.
The right question isn't "which framework has zero CVEs?" It's "which framework makes it hard for an attacker to do damage even if they find a vulnerability?"
That comes down to architecture, not vulnerability count. Specifically:
1. Sandbox isolation. If an agent has a sandbox escape vulnerability but the runtime itself enforces filesystem isolation at the OS level, the vulnerability is less exploitable. OpenClaw can run in a sandbox (Docker, Kubernetes, or via OS-level restrictions). LangChain runs in-process with the host application, so sandbox escape means host compromise. OpenAI's API runs remotely, so local sandbox escape doesn't apply.
2. Tool allowlisting. An RCE vulnerability in a framework is only dangerous if the agent can execute arbitrary code as a result. If the framework has strict tool allowlists and the attacker can't invoke exec, the RCE is contained. OpenClaw supports scoped tool access via AGENTS.md. LangChain's tool invocation is less granular by default. Hermes Agent abstracts tool access entirely.
3. Human-in-the-loop gates. The 6 Ways Malicious Web Content Can Hijack Your AI Agent details how prompt injection attacks exploit frameworks where the agent acts autonomously. If you enforce human approval gates before sensitive actions, many vulnerability classes become less exploitable. OpenClaw supports HITL config. The frontier labs' APIs don't—but they also don't expose tool execution the way OpenClaw does.
4. Memory and audit trails. File-based agent memory (like OpenClaw's MEMORY.md) leaves an audit trail. If an agent is compromised, you can see what it did. In-process frameworks (LangChain) or opaque vector databases (Mem0) make post-breach forensics harder.
5. Transparency and versioning. OpenClaw is open-source. You can read the code, audit it, and apply patches yourself. The frontier labs' closed-source SDKs offer security through obscurity—vulnerabilities exist, but you won't see them until they're exploited in the wild or disclosed by external researchers.
None of these factors show up in a CVE count.
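As a concrete illustration of defenses 2 and 3, here is a minimal, framework-agnostic sketch of a tool allowlist combined with a human-approval gate. The tool names, the dispatch function, and the `approve` callback are hypothetical, not OpenClaw's actual API:

```python
from typing import Callable

ALLOWED_TOOLS = {"read_file", "search_docs"}   # no exec, no file writes
SENSITIVE_TOOLS = {"read_file"}                # require human sign-off

def invoke_tool(name: str, args: dict,
                approve: Callable[[str, dict], bool]) -> str:
    """Gate every tool call through an allowlist and an optional HITL check."""
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is not allowlisted")
    if name in SENSITIVE_TOOLS and not approve(name, args):
        raise PermissionError(f"human reviewer rejected {name!r}")
    return f"ran {name}"  # placeholder for the real tool dispatch

# Even if injected content convinces the model to request `exec`,
# the dispatch layer refuses before any code runs:
try:
    invoke_tool("exec", {"cmd": "rm -rf /"}, approve=lambda n, a: True)
except PermissionError as e:
    print(e)
```

The design choice worth noting: the gate lives in the dispatch layer, outside the model's control, so a prompt injection can change what the agent *asks for* but not what the runtime *permits*.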
How to Evaluate an Agent Framework for Production
If CVE volume is misleading, what should actually drive your decision? Here are three questions from the Weather Report audit that matter:
Question 1: What's the dominant vulnerability class?
The Weather Report broke down all 384 CVEs by class. For OpenClaw, the breakdown is:
- 57 injection/prompt injection attacks (24%)
- 29 auth/credential handling (12%)
- 26 tool hallucination (11%)
- 19 sandbox escape (8%)
- 19 memory/context corruption (8%)
- 12 DoS/performance (5%)
- 76 miscellaneous / low-severity (32%)
For LangChain:
- 18 injection attacks (35%)
- 11 auth/credential handling (22%)
- 7 tool issues (14%)
- 4 sandbox/isolation (8%)
- 3 memory issues (6%)
- 8 other (15%)
The distributions are similar: both frameworks struggle most with injection attacks and auth. OpenClaw has more sandbox escape CVEs in absolute terms because it exposes more execution capabilities. LangChain has proportionally more auth issues because its integrations are less standardized.
The takeaway: pick the framework whose dominant vulnerability class you can mitigate with your deployment architecture. If you can enforce sandboxing and HITL gates, injection attacks become less critical. If you can externalize secrets management, auth issues become less critical.
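A breakdown like the ones above is easy to sanity-check by recomputing each class's share of the total. A small sketch using the LangChain figures, which sum cleanly to 51:

```python
# LangChain's CVE breakdown from the audit figures above.
langchain_cves = {
    "injection": 18,
    "auth_credential": 11,
    "tool": 7,
    "sandbox_isolation": 4,
    "memory": 3,
    "other": 8,
}

def class_shares(counts: dict[str, int]) -> dict[str, int]:
    """Return each vulnerability class's rounded percentage of the total."""
    total = sum(counts.values())
    return {cls: round(100 * n / total) for cls, n in counts.items()}

shares = class_shares(langchain_cves)
# injection works out to about 35% of LangChain's 51 CVEs
```

Running the same check on any vendor-published breakdown takes seconds and catches the most common reporting error: percentages and counts that quietly disagree.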
Question 2: What's the response time from vulnerability report to patch?
OpenClaw averages 7 days from CVE report to public patch. LangChain averages 14 days. The frontier labs don't publish response times. Faster patching is better, but only if you're applying patches. Most organizations patch slowly regardless.
The real signal: frameworks with responsive security teams show they take vulnerability reports seriously. That suggests they'll take your post-deployment issues seriously too.
Question 3: What's the gap between "fixed in code" and "fixed in deployment"?
This is the one nobody measures. A vulnerability is only fixed when you update. For cloud-hosted frameworks (OpenAI, Google, Anthropic), fixes deploy automatically. For self-hosted frameworks (OpenClaw), fixes depend on your update discipline. For LangChain, fixes depend on your dependency management.
OpenClaw's 238 CVEs are only concerning if you're running an outdated version. If you're on the latest release, you've got most of the patches. If you're on v2026.2 running on a v2026.5 threat landscape, you're vulnerable by design.
The meta-point: framework choice is less important than operational discipline. A highly secure framework on an outdated version is less secure than a CVE-heavy framework on the latest version with aggressive patching.
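The "fixed in code vs. fixed in deployment" gap can be put in rough numbers: if a framework publishes CVE fixes at some yearly rate and you patch every P days, you carry an average backlog of published-but-unapplied fixes. A back-of-the-envelope sketch (the uniform-arrival model is a simplifying assumption, not from the audit):

```python
def avg_unpatched_cves(cves_per_year: float, patch_interval_days: float) -> float:
    """Average count of published-but-unapplied CVE fixes you carry,
    assuming fixes arrive uniformly and you update every patch_interval_days."""
    daily_rate = cves_per_year / 365
    # A fix waits, on average, half a patch interval before you apply it.
    return daily_rate * patch_interval_days / 2

# At an OpenClaw-like pace (~730 fixes/year):
monthly = avg_unpatched_cves(730, 30)    # ~30 unapplied fixes at any moment
quarterly = avg_unpatched_cves(730, 90)  # ~90
```

The takeaway matches the meta-point above: at high CVE velocity, moving from quarterly to monthly patching shrinks your standing exposure by a factor of three, which matters far more than which framework logo is on the box.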
The OpenClaw Angle (Honest Version)
OpenClaw's high CVE volume reflects three things:
- It's actively maintained. Researchers find bugs. The team patches them. That's healthy; it means OpenClaw isn't abandonware.
- It exposes a wider attack surface. OpenClaw gives agents access to exec, file I/O, and extensible tool calls. That's more powerful than LangChain's tool invocation layer, but it's also more attackable. You get more capability and more responsibility.
- It's transparent. Every vulnerability is reported publicly. You see the problem, you see the fix, and you decide whether to apply it. Other frameworks hide vulnerabilities until critical damage occurs.
The antidote isn't "don't use OpenClaw." It's "use OpenClaw with hardened config." The bundles OpenAgents.mom generates ship with tool allowlists, HITL gates, and permission scoping pre-configured. Those configs close the dominant vulnerability classes—injection, auth, hallucination—before your agent ever runs.
Common Mistakes
- Comparing raw CVE counts across frameworks with different lifespans. A three-year-old framework with 51 CVEs looks "safer" than a four-month-old framework with 238 CVEs, but the older framework accumulated its count before modern security tooling and researcher attention existed. Adjust for lifespan and scrutiny before making a decision.
- Assuming zero published CVEs means zero risk. The frontier labs publish zero CVEs. That's partly because they patch vulnerabilities before external disclosure and partly because they don't operate a public advisory process. Silence doesn't equal safety.
- Ignoring vulnerability class breakdown. A framework with 20 RCE vulnerabilities is riskier than a framework with 200 DoS vulnerabilities, even though the CVE count is lower. Read the breakdown, not the headline.
The Real Decision: Framework vs. Deployment
Here's the honest takeaway:
OpenClaw has the highest CVE volume because it's the most actively audited, most transparent, and most powerful agent framework as of May 2026. That's not a flaw. It's a consequence of architectural choices (exposes more capabilities, operates in your environment) and transparency choices (publishes all findings).
LangChain is lower-CVE because it abstracts tool access more strictly and has a more opaque security process. That's not inherently safer—it's just less visible.
Hermes Agent is among the lowest-CVE frameworks because it's the simplest and most recent. But it also supports only single-agent deployments. That's not more secure; it's a different constraint.
The frontier labs (OpenAI, Google, Anthropic) report zero CVEs because they own the infrastructure and control the disclosure process. They're not immune to vulnerabilities. They're just not publishing them.
The decision shouldn't be "which framework has the fewest CVEs?" It should be:
- Does this framework let me harden my config? (tool allowlists, HITL, permission scoping)
- How fast does the team patch vulnerabilities?
- What's my operational discipline? (Can I patch within 30 days? Within 7?)
- What's my threat model? (Do I need multi-channel support, multi-agent orchestration, or just a single chatbot?)
For security-conscious builders deploying agents in production, OpenClaw with a hardened bundle is the right answer because it gives you visibility, control, and the tools to mitigate the dominant risk classes. CVE volume is noise. Architecture is signal.
Security Guardrails
- Audit your agent's current CVE exposure. If you're running OpenClaw v2026.2, check the Security Advisories page to see which critical CVEs apply to your version. Most CVEs listed won't affect you if you're using restricted tool configs.
- Patch on a schedule, not reactively. Monthly patching beats emergency patching when a vulnerability goes public. Build patching into your release cycle.
- Rotate secrets and re-audit AGENTS.md after every framework update. Sometimes patches change how tools behave. An update that "fixes authentication handling" might inadvertently open a new privilege escalation route if your tool scope isn't re-validated.
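The re-audit in that last guardrail can be partly automated: diff the tool scope your config grants before and after an update, and fail the upgrade review if it widened. The `allow:` line format below is a hypothetical simplification of an AGENTS.md-style allowlist, not OpenClaw's real schema:

```python
def tool_scope(config_text: str) -> set[str]:
    """Extract allowlisted tool names from lines like 'allow: read_file'.
    The line format is a hypothetical stand-in for your real config schema."""
    return {
        line.split(":", 1)[1].strip()
        for line in config_text.splitlines()
        if line.startswith("allow:")
    }

before = tool_scope("allow: read_file\nallow: search_docs\n")
after = tool_scope("allow: read_file\nallow: search_docs\nallow: exec\n")

# Flag the upgrade if the update widened the agent's tool scope.
widened = after - before
if widened:
    print(f"tool scope widened after update: {sorted(widened)}")
```

Run the check in CI against the config checked out before and after the framework bump; a non-empty `widened` set is exactly the "new privilege escalation route" the guardrail warns about.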
What "Safe Enough" Actually Means
The uncomfortable truth is that no agent framework is "safe" in absolute terms. Every framework with real capabilities has an attack surface. The question isn't "which is safe?" It's "which is safe for my specific use case with my specific configs?"
A framework with 238 CVEs but strong config controls is safer than a framework with 4 CVEs and no config controls. A framework with 51 CVEs and a responsive security team is safer than a framework with 0 published CVEs and no visibility into how quickly vulnerabilities are patched.
Choose the framework whose architecture lets you build the safeguards you need. Then choose the deployment model where you can maintain those safeguards over time.
For OpenClaw, that means: pick it, harden it, and stay patched.
Build a Hardened Agent From Day One
Our bundles pre-wire the security controls that address the dominant vulnerability classes. Tool allowlists, HITL gates, permission scoping, and audit trails—all configured before your agent runs.