Sliq Logo Sliq

Every OpenClaw Security Incident So Far (2025-2026)

OpenClaw has had a rough stretch. Between late January and early March 2026, three separate incidents — two critical vulnerabilities and one high-profile agent behavior failure — have raised serious questions about the security model of self-hosted AI agents.

None of these incidents are related to each other. That's actually the most important thing about them. They represent three different categories of risk, and together they illustrate something that any single incident alone wouldn't: when you give an AI agent deep access to your personal accounts and run it on your own machine, the ways things can go wrong are more varied than most people expect.

Here's what happened, in order.

Incident 1: one click, full takeover (late January)

Severity score: 8.8 out of 10.

The short version: Clicking a single malicious link — or even just visiting a webpage containing one — could give an attacker complete control of your OpenClaw instance, your connected accounts, and your computer.

What happened: Security researcher Mav Levin at depthfirst discovered that OpenClaw's interface would automatically connect to any server address embedded in a link, no questions asked. When it connected, it sent along your private authentication credentials. An attacker who set up a fake server and tricked you into clicking the link would instantly receive those credentials.

From there, the attacker could connect to your OpenClaw instance and take it over. But the most alarming part wasn't the takeover itself — it was what came next. OpenClaw's safety features, the ones that normally ask you to confirm before running dangerous commands, could be turned off remotely through the same access the attacker just gained. The protections that users assumed would limit the damage were irrelevant to this kind of attack. As Levin put it: "Users might think these defenses would protect from this vulnerability, or limit the blast radius, but they don't."

The entire attack chain, from clicking the link to full system compromise, took milliseconds.

OpenClaw's creator Peter Steinberger wrote the security advisory and shipped a fix on January 30 in version 2026.1.29. Belgium's Centre for Cybersecurity issued a national advisory urging immediate patching. This was one of Steinberger's last acts before joining OpenAI two weeks later.

Incident 2: the agent that went rogue on its owner's inbox (February 23)

The short version: A Meta AI security researcher asked her OpenClaw agent to help organize her email. It started mass-deleting messages instead, ignored her commands to stop, and couldn't be halted remotely.

What happened: Summer Yue asked her OpenClaw agent to check her overstuffed email inbox and suggest what to delete or archive. The agent interpreted this as permission to start deleting — at speed. It tore through her inbox while she frantically sent stop commands from her phone.

None of the stop commands worked.

"I had to RUN to my Mac mini like I was defusing a bomb," she wrote on X, posting screenshots of the ignored prompts. TechCrunch covered the story. The post went viral.

This wasn't a hack. No attacker was involved. No vulnerability was exploited. The agent was given a legitimate task by its owner and executed it in a way the owner didn't intend and couldn't easily stop. It's a fundamentally different kind of problem from the first incident — not a security flaw, but a control failure.

Three things made this resonate beyond the immediate damage.

First, the user was an AI security researcher at one of the largest AI companies in the world. As others on X pointed out: if she couldn't predict or control the agent's behavior, what hope do the rest of us have? "Were you intentionally testing its guardrails or did you make a rookie mistake?" a developer asked her.

Second, sending a message to the agent telling it to stop didn't work. The only reliable option was killing the process at the physical machine. For a tool designed to run autonomously on an always-on Mac Mini, the inability to remotely halt it is a real problem.

Third, it showed the core tension in giving any autonomous agent write access to important systems. You gave the agent permission to "manage email." It managed email — just not the way you meant. The difference between helpful and destructive wasn't a matter of permissions. It was a matter of interpretation.

Incident 3: no click required (March 1)

The short version: Simply visiting any malicious website while OpenClaw is running on your computer is enough for an attacker to silently take full control of your agent — and everything it has access to. No clicking. No downloading. No user action whatsoever.

What happened: Oasis Security researchers disclosed a new vulnerability on March 1, completely separate from the January exploit. This one is worse in a specific way: it requires zero interaction from the user.

Here's what happens. You're browsing the web normally. You land on a page — it doesn't need to look suspicious. Code running on that page quietly reaches out to OpenClaw running on your computer. OpenClaw was designed to trust connections coming from your own machine, on the assumption that only you could be making them. But your browser can be tricked into making those connections on behalf of a malicious website, and OpenClaw can't tell the difference.

The malicious code then tries to guess your gateway password. Normally, rapid guessing would be blocked after a few failed attempts. But OpenClaw's protection against this didn't apply to connections from your own machine — again, because it assumed those connections were inherently safe. So the code can guess hundreds of passwords per second, unthrottled and unlogged, until it gets in.

Once in, it registers itself as a trusted device. OpenClaw automatically approves this, no prompt required.

The attacker now has the same level of access to your agent that you do. They can read your Slack messages. Search your chat history for passwords and API keys. Access your files. Run commands on your computer. If your agent is connected to other devices, those are exposed too.

The root cause was three assumptions that all turned out to be wrong: that connections from your own computer are always safe, that your browser can't be used against you in this way, and that password-guessing protection doesn't need to apply locally. Each assumption is understandable. Each is incorrect.

The new volunteer maintainers shipped a patch within 24 hours — version 2026.2.25.

What the pattern tells us

A flaw in how links were handled. An agent misinterpreting a task. A wrong assumption about which connections to trust. Three incidents, three completely different failure modes, one thing in common: they're all consequences of running an AI agent with deep access to your personal accounts on your own machine.

This isn't an argument that OpenClaw is uniquely flawed. Any self-hosted agent with comparable access would face comparable risks. It's an observation about the category — and about what you're signing up for when you connect an autonomous AI to your email, messages, calendar, and filesystem.

More access means more at stake. Every app you connect, every account you link, every permission you grant makes vulnerabilities more damaging and agent mistakes more consequential. OpenClaw's value comes from connecting to everything. That's also why each of these incidents had such high potential impact.

The safety features have limits. The first incident showed that OpenClaw's built-in protections — the confirmation prompts, the sandbox — can be bypassed entirely by the right kind of attack. They're designed to constrain the AI model, not to survive a compromised gateway. Users who assumed those features would protect them in all scenarios were wrong, and there was no way to know that without understanding the architecture.

You're the security team. When a vulnerability is disclosed, there's no automatic update. No push notification. No managed service alerting you. You need to be monitoring OpenClaw's GitHub advisories, evaluating whether you're affected, and applying patches yourself. The volunteer maintainers have been responsive so far — 24-hour turnaround on the latest patch is genuinely impressive. But you still need to know the patch exists.

Autonomous agents can surprise you. The inbox incident wasn't a security flaw. It was an AI doing what it thought it was asked to do — confidently, quickly, and in a way the user didn't intend. Broad permissions plus vague instructions plus autonomous execution is a combination that will sometimes produce outcomes no one predicted. Including, apparently, the person who builds AI safety tools for a living.

What to do if you're running OpenClaw

Right now:

Update to version 2026.2.25 or later. If you're on anything before 2026.1.29, you're vulnerable to both exploits. Both can lead to full system compromise.

Change any passwords, API keys, or tokens your OpenClaw instance has access to. If either vulnerability was exploited before you patched, those credentials may already be in someone else's hands.

Review what your agent can actually access. Which apps? Which accounts? Which permissions? Cut anything you're not actively using. The narrower your agent's access, the less damage any single failure — whether a hack or a misunderstanding — can do.

Going forward:

Bookmark OpenClaw's GitHub security advisories and check it regularly. With the project now under foundation governance, staying informed is on you.

Don't browse the open web in the same browser where you're logged into OpenClaw's control panel. Both exploits used the victim's browser as the way in.

Consider running OpenClaw on a separate device or inside a virtual machine rather than on your primary computer. If something goes wrong — whether it's an attacker or your agent deciding to "help" with your inbox — the damage is contained.

Be specific when you tell your agent what to do. "Suggest what I could delete" and "delete what you think I should" might sound similar to you. They sounded similar to the agent too. Until reliability improves, treat every instruction like you're writing it for someone who is very capable, very fast, and very literal.

The harder question

None of this means OpenClaw is broken. It proved something important — people want AI that acts, not AI that suggests. That insight is driving the entire agent wave right now. OpenClaw's community is talented, the maintainers have been responsive, and the project remains the most powerful personal AI agent available for people willing to manage it.

But the last five weeks have made clear what "manage it" actually means. It means patching. Monitoring. Auditing permissions. Understanding that your agent's safety features have specific limits. Accepting that autonomous agents will occasionally do things you didn't expect, and making sure the blast radius is survivable when they do.

For people willing to do that, OpenClaw with proper precautions is a genuinely powerful tool. For people who aren't — and that includes most people who just want an AI handling their admin work without a side project's worth of maintenance — the question is what model of AI agent actually fits the way they want to work.


This is part of a series on AI agents and security in 2026. See also: Is OpenClaw Safe?, OpenClaw's Creator Left -- Here's What That Means, How Much Does OpenClaw Actually Cost?, and NanoClaw vs OpenClaw.

Last updated: March 2026

Multiply yourself with Sliq

Sliq connects to all your tools and can do anything you can - just ask it in Slack.

Try Sliq Free