Skip to main content
Recruitment scam alert: Brew Digital is being impersonated | Read more
Read more
What OpenClaw's revealed about the security of AI agents
Share on socials
A photo of Neal Riley with the title OpenClaw's three-layer AI security failure.
Photo of Neal Riley
Neal Riley
Published on 6 February 2026

What OpenClaw's revealed about the security of AI agents

1.49 million records exposed. Rogue skills. One-click remote takeover. OpenClaw’s explosive rise revealed a three-layer security collapse that every AI agent user needs to understand.
Two SQL statements. That was all it would have taken to secure the Moltbook database. Instead, 1.49 million agent records, including API keys, claim tokens and verification codes, sat exposed to anyone with a browser and the curiosity to look. When security researcher Jamieson O'Reilly reported the vulnerability to Moltbook's creator, the response was disarmingly honest: "I'm just going to give everything to AI. So send me whatever you have."
That exchange captures something essential about the OpenClaw phenomenon. In the last week of January 2026, a hobbyist AI assistant accumulated 149,000 GitHub stars, spawned a social network of 770,000 autonomous agents and attracted the attention of at least fourteen independent security researchers. Their findings, taken together, amount to a comprehensive audit of what happens when capability outruns caution. The tool failed not in one way but in three: at the perimeter, at the identity layer and at the data layer, and the failures compounded.

Perimeter: a non-existent boundary

Start at the boundary. O'Reilly used Shodan, a search engine for internet-connected devices, to scan for OpenClaw control servers. He found over 1,000 exposed gateways; eight had no authentication whatsoever.
The architecture made this worse. OpenClaw auto-approved connections from localhost, so any instance deployed behind a standard reverse proxy appeared to authenticate all traffic by default. Months of private messages, API keys and credentials across Signal, Telegram, Slack and WhatsApp lay open to anyone who knew where to look. The perimeter, in short, did not exist.

Identity: trust without verification

Move inward to identity, the question of who can do what. OpenClaw's skill marketplace, ClawdHub, trusted code based solely on download counts. O'Reilly proved this by publishing a skill called "What Would Elon Do" that executed code on users' systems, then inflating its download count by 4,000 with a script. Developers from seven countries installed the poisoned package before anyone noticed.
"When you compromise a supply chain," O'Reilly observed, "you're not asking victims to trust you, you're hijacking trust they've already placed in someone else."
Elsewhere, the Twitter plugin extracted session tokens directly from Chrome's cache. The Signal integration stored pairing credentials in globally readable temporary files, enabling complete account takeover. Identity, like the perimeter, was an afterthought.

Data and application: an unbounded attack surface

The innermost layer, data and application security, introduced vulnerabilities both familiar and novel. Hudson Rock found credentials stored in plain text Markdown files at predictable file system paths; infostealer malware families had already begun targeting them.
The more significant finding was architectural. A researcher demonstrated that a carefully crafted email could trick an OpenClaw instance into forwarding its owner's inbox to an attacker-controlled address. No code, no exploit, just text. Steinberger, the creator of OpenClaw, acknowledged prompt injection as "an industry-wide unsolved problem".
Palo Alto Networks went further, identifying persistent memory as an amplifier. Malicious payloads could be fragmented across time, stored as benign inputs and assembled later into executable instructions. They called it "time-shifted prompt injection". Their conclusion was blunt: "Moltbot is an unbounded attack surface with access to your credentials."

When layers fail together

These layers do not fail independently. Security researcher Mav Levin demonstrated this by chaining an unvalidated URL parameter with a missing origin check on WebSocket connections into a one-click remote code execution exploit. A victim visits a malicious web page; milliseconds later, the attacker has full control of their machine. The exploit threaded through all three layers, perimeter, identity and data, in a single action.

Enterprise adoption outpaced security

The speed of enterprise adoption made these failures acute. Token Security reported that within one week, 22% of its customers had employees actively using OpenClaw variants. Noma Security found that more than half of its customers had users granting the tool privileged access without approval. Security teams did not deploy the tool; they inherited it. By early February, Gartner had declared OpenClaw an "unacceptable cybersecurity risk" for enterprise use.

The deeper problem with AI agents

The broader lesson is not that OpenClaw was uniquely reckless. It is that the security model for AI agents remains unsolved. These tools derive their value from connectivity to email, messaging, calendars, browsers and file systems, and each integration is an entry point.
As O'Reilly put it: "AI agents tear all of that down by design. They need to read your files, access your credentials, execute commands and interact with external services. The value proposition requires punching holes through every boundary we spent decades building."Heather Adkins, VP of Security Engineering at Google Cloud, offered the simplest summary: "My threat model is not your threat model, but it should be. Do not run Clawdbot."