
Is OpenClaw Safe? What the Malwarebytes Report Gets Right and Misses
By Chris Kvamme (@MidnightBuild12) · 11 min read
Yes, you can run OpenClaw safely. But you need to configure it correctly, and you need to understand the real threat landscape. On February 23, 2026, Malwarebytes published a security analysis that asked exactly the right question, then gave an incomplete answer. They covered the risks. They didn't mention the defenses. This article fills that gap.
Here's what this covers:
- Three points where the Malwarebytes analysis is missing context
- What they got right (and it's not nothing)
- The built-in tools OpenClaw ships that the report never acknowledged
- What you should actually do to run it safely
What Malwarebytes Got Right
Before getting into the gaps, I want to be clear: the concerns in that article are real. The Hudson Rock infostealer case they describe happened; token data and email content ended up in logs. One distinction worth making early: the attack vector was infostealers on users' machines, not a vulnerability in OpenClaw's server software. The tokens were stolen from local systems and then used to access OpenClaw instances. That's not "OpenClaw got hacked." That's "users got hacked, and their OpenClaw access was part of the fallout."
The ClawdBot/Moltbot rename sparked a real impersonation campaign, and Malwarebytes was the outlet that broke that story. Their earlier reporting held up.
Independent researchers put the exposure numbers beyond dispute. CybersecurityNews reported 312,000+ OpenClaw instances on Shodan as of February 18, 2026, many with no authentication. Cyera Research Labs found 3,746 of 24,478 distinct servers had exposed mDNS services. Those numbers are a real problem.
Microsoft's recommendations for reducing blast radius, which Malwarebytes cited, are sensible. Grant least privilege. Avoid giving agents access to things they don't need. Review what skills are installed.
The ClawdBot/Moltbot renames created a real supply chain attack surface. Snyk's ToxicSkills research found 36% of ClawHub skills contain security flaws, with 1,467 vulnerable payloads. JFrog found a VS Code trojan exploiting the naming confusion. ClawHub now requires publisher accounts to be at least one week old before posting skills, which narrows the window for drive-by impersonation. But the supply chain risk is real, and it's not unique to OpenClaw. Snyk's own study covers Claude Code, Cursor, and OpenClaw in the same breath.
All of that is accurate. The problem is they stopped there.
Where the Analysis Falls Short
The Security Model Isn't "Improvised"
The Malwarebytes piece implies OpenClaw's security posture is ad hoc. That's not accurate.
OpenClaw has published TLA+ formal verification models covering its gateway exposure, pairing store, ingress gating, routing isolation, and node command pipelines. The models are on GitHub. To be clear about what that means and doesn't: TLA+ verifies that the authorization logic behaves correctly under its assumptions. It does not prove the agent can't be tricked into doing something harmful. It proves the gatekeeping layer works as designed. Those are different things, and both matter. The CHANGELOG shows active security audit expansions, including Windows allowlist enforcement, open group policy detection, and pairing-required guidance. This isn't a project where someone added a password field and called it secure.
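To make the distinction concrete, here is a toy TLA+ sketch of a pairing gate, in the spirit of (but not copied from) OpenClaw's published models. The module name, variables, and property are mine, purely illustrative:

```tla
---- MODULE PairingGate ----
VARIABLES paired, inbox

Init == paired = {} /\ inbox = {}

\* An operator explicitly approves a new contact c
Approve(c) ==
  /\ paired' = paired \cup {c}
  /\ UNCHANGED inbox

\* A message m from contact c is delivered only if c is already paired
Deliver(c, m) ==
  /\ c \in paired
  /\ inbox' = inbox \cup {<<c, m>>}
  /\ UNCHANGED paired

\* Safety invariant: everything in the inbox came from a paired contact
NoUnpairedDelivery == \A <<c, m>> \in inbox : c \in paired
====
```

A model checker like TLC can confirm NoUnpairedDelivery holds in every reachable state. That is the kind of guarantee this verification gives: the gate logic is sound. It says nothing about what a paired contact can talk the agent into doing.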
Now, I want to be precise here, because there's a real number that gets thrown around: EarlyCore ran 629 security tests on an OpenClaw deployment and found 80% of hijacking attacks succeeded. That number is real. It's also a partial-hardening result.
EarlyCore's own report notes the configuration they tested: a Gemini 3 model (which OpenClaw's security audit flags as weak-tier), sandbox mode turned off, and exec approvals not configured. Their footnote says outright: "Results show what happens with partial hardening. Full defense requires all 9 layers." The 80% figure describes what happens when you skip three of them. Unhardened instances hit 100%. Partial hardening dropped that to 80%. Imperfect. Measurable. Not zero.
Worth noting: EarlyCore documented nine distinct defense layers to test against. A project with nine layers of security controls is not running an improvised security model.
What about the 312,000+ instances with open ports? That number is real, but it doesn't mean the platform ships without defaults. OpenClaw defaults to localhost binding and pairing. Users are opening those ports themselves. You can find millions of misconfigured Redis, MongoDB, and WordPress instances on Shodan. Nobody argues those products have no security model. OpenClaw needs better onboarding around security defaults. That's a legitimate criticism. It's not the same as saying the security model doesn't exist.
The distinction that matters is this: the authorization layer of OpenClaw (who can connect, how pairing works, what tools are available) is documented and formally verified. Prompt injection defense is incomplete. But prompt injection defense is incomplete everywhere. Those are different statements, and Malwarebytes conflates them.
Prompt Injection Isn't Specific to OpenClaw
Malwarebytes is right that prompt injection is a real danger for tool-using agents. Where the framing goes wrong is presenting it as though it's a flaw in how OpenClaw was built. It's not. It's OWASP's LLM01, the top risk for every LLM-based system, affecting ChatGPT, Bard, Copilot, and every other system that processes untrusted input.
Vectra AI documented attack success rates of 50 to 84% against frontier models, as of February 2026. They note no complete fix exists. ZDNET put it plainly: unlike SQL injection, which has parameterized queries as a standard defense, prompt injection has no equivalent solution. An MDPI peer-reviewed paper from January 2026 documented prompt injection leaking credentials from Microsoft Copilot's access to private repos.
Now, there's a fair counterpoint here, and I'm not going to dodge it: the consequences of prompt injection in OpenClaw are worse than in a pure chat system. When ChatGPT gets manipulated, it might say something wrong. When OpenClaw gets manipulated, it can execute shell commands, delete files, or exfiltrate API keys. Penligent.ai describes this well: "Prompt injection is no longer just a content moderation issue; it is an authorization problem disguised as a language problem."
The blast radius is real. But it's a property of any tool-using AI agent, not OpenClaw specifically. Microsoft Copilot has file access, email access, and Teams integration. Google's Gemini agents have access to Drive, Gmail, and Calendar. OpenAI's code interpreter can execute arbitrary code. The blast radius exists wherever an LLM has tool access.
Malwarebytes doesn't compare OpenClaw to these alternatives. OpenClaw's security levels (deny, allowlist, or full, set independently per tool and per channel) give more granular control than most hosted agent platforms, but the underlying vulnerability is the same across all of them. Saying "this category is dangerous and OpenClaw is in that category" is fair. Saying "OpenClaw is uniquely dangerous" without establishing a baseline is not.
They Never Mentioned the Safety Tools
Malwarebytes lists Microsoft's third-party recommendations for reducing agent risk: grant least privilege, restrict what the agent can access, review installed tools. It doesn't mention that OpenClaw ships tools implementing several of those recommendations directly.
What OpenClaw includes:
- openclaw security audit with --deep, --fix, and --json flags. It detects exposed gateway ports, weak permissions, missing auth, and open group policies at runtime.
- Pairing mode is on by default. The default dmPolicy is "pairing," meaning the agent requires explicit approval before responding to new contacts.
- Per-tool security levels: each tool can be set to deny, allowlist, or full access independently.
- Exec approvals: destructive commands require human confirmation before execution.
- Sandbox mode: restricts what the agent can run even when it tries.
- Tool denylists: specific tools can be removed from availability entirely.
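Pulled together, those controls might look something like this in a config file. This is a hypothetical sketch: only dmPolicy and the deny/allowlist/full levels are named in the docs cited here; the surrounding keys, tool names, and file layout are my assumptions, not OpenClaw's actual schema.

```json
{
  "dmPolicy": "pairing",
  "channels": {
    "whatsapp": {
      "tools": {
        "shell": "deny",
        "filesystem": "allowlist",
        "web_search": "full"
      }
    }
  },
  "execApprovals": true,
  "sandbox": true,
  "toolDenylist": ["browser_automation"]
}
```

The shape matters more than the key names: each channel carries its own tool permissions, so a compromised WhatsApp session can't reach capabilities that channel was never granted.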
JFrog published their own security analysis of OpenClaw and explicitly recommended running openclaw security audit as part of their findings. A practitioner writing on Dev.to described the allowlist system this way: "Only my WhatsApp number can talk to the agent. A random attacker can't just send it commands."
A reader of the Malwarebytes article would come away with no awareness that these controls exist. That's a gap in the analysis.
What This Article Is Not Claiming
This is not a defense brief. A few things I want to be explicit about:
- I'm not claiming OpenClaw is safe by default for non-technical users. It isn't. Safe deployment requires configuration work that most casual users won't do.
- I'm not claiming prompt injection is solved. It's not solved anywhere. Not in OpenClaw, not in Copilot, not in any LLM agent system shipping today.
- I'm not claiming that built-in controls guarantee safe deployment. They reduce attack surface. They don't eliminate it.
- I'm not claiming Malwarebytes was wrong to raise concerns. They were right to raise them. They were incomplete in how they presented the picture.
OpenClaw is not uniquely broken, but it is also not a casual-consumer product. Safe use depends on disciplined configuration. If that sentence makes you uncomfortable, the Malwarebytes article's caution is probably the right default for you. If you're willing to do the work, read on.
What to Actually Do
Reading Malwarebytes' article and doing nothing is the wrong response. So is reading it and uninstalling OpenClaw. Here's what actually matters:
Run the security audit first. The security audit guide on this site walks you through openclaw security audit --deep and what the output means. EarlyCore's 80% number was against a deployment that skipped this step.
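In practice that first step is a few commands. The flags here are the ones documented above (--deep, --fix, --json); I haven't reproduced the output format, which the audit guide covers:

```
# Deep audit of a running deployment
openclaw security audit --deep

# Apply the remediations the audit can fix automatically
openclaw security audit --deep --fix

# Machine-readable output, e.g. for a CI check
openclaw security audit --json
```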
Set allowlists for channels that don't need full tool access. If your WhatsApp channel doesn't need filesystem access, remove it. The hardening playbook has specific config examples.
Enable exec approvals for destructive commands. This is the human-in-the-loop control. It won't stop prompt injection, but it shrinks the blast radius when an injection lands.
Vet ClawHub skills before installing. The ClawHub safe install playbook covers what to check. Don't install skills from accounts with zero history.
Use a VPS with locked-down network access if you're running this outside your home network. Docker-based setups are easier to isolate.
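A minimal compose file shows what "easier to isolate" means here. This is a sketch, not the tested template from the Docker guide: the image name and internal port are assumptions, but the 127.0.0.1 prefix is the load-bearing part, keeping the gateway port reachable only from the host.

```yaml
services:
  openclaw:
    image: openclaw/openclaw:latest   # assumed image name
    ports:
      - "127.0.0.1:8080:8080"         # localhost-only binding; not exposed to the network
    volumes:
      - openclaw-data:/data           # persistent state survives container restarts
volumes:
  openclaw-data:
```

With this binding, the instance never shows up on Shodan in the first place; access from outside the host has to come through something you add deliberately, like an SSH tunnel or a reverse proxy with auth.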
None of this eliminates prompt injection risk. That risk is unsolved for every LLM agent product available today. But it reduces the attack surface to a size that most threat models can tolerate.
Key Terms
Prompt injection is an attack type where malicious text embedded in an LLM's input manipulates the model's output or actions, bypassing intended behavior.
TLA+ is a formal specification language used to mathematically verify logical correctness in distributed and concurrent systems.
openclaw security audit is OpenClaw's built-in command-line tool for detecting security misconfigurations in a running deployment.
ClawHub is the community skill marketplace for OpenClaw, where users share and install third-party agent capabilities.
FAQ
Is OpenClaw safe to install?
It's safe to install. Whether it's safe to run depends on how you configure it. Defaults include localhost-only binding and pairing, which are good starting points. Running openclaw security audit --deep on a new deployment will flag the things that need fixing before you expose it to the internet or give it broad tool access.
Did OpenClaw get hacked?
No. The incidents in the Malwarebytes report involved infostealers on users' machines, not a breach of OpenClaw's software. Stolen tokens from local systems were used to access those users' OpenClaw instances. If someone steals your SSH key, that's not an OpenSSH vulnerability. Same principle here.
Is prompt injection a real risk in OpenClaw?
Yes. It's also a real risk in Microsoft Copilot, ChatGPT with tool access, Google Gemini agents, and every other LLM-based system that processes external input. OWASP lists it as the top risk for all LLM systems. For agents with tool access, the consequences are more severe than for chat-only systems. Use exec approvals and tool restrictions to limit what a compromised session can actually do.
What does openclaw security audit actually check?
It checks for exposed gateway ports, weak permissions, missing auth configuration, open group policies, and (with --deep) runtime tool configurations that leave you exposed. The security audit guide has the full breakdown of what each flag does and how to fix what it finds.
Should I trust ClawHub skills?
With the same skepticism you'd apply to any community package. Check the publisher's account age. Read the source if it's available. Prefer skills with version history and user reviews. The ClawHub safe install playbook has a checklist.
Evidence and Methodology
This article draws on 15+ published sources: EarlyCore's independent security tests (629 test cases), OWASP's LLM top 10, Vectra AI's prompt injection analysis, Kaspersky, JFrog, Snyk, Cyera Research Labs, and OpenClaw's own docs. I don't have access to OpenClaw's internal security team. All claims link to their sources. Where data was partial or contested, I've said so.
Related Resources
- How to Use openclaw security audit (And Actually Fix What It Finds)
- The Proven OpenClaw Security Hardening Playbook
- Stop! 7 Proven OpenClaw Security Fixes That Save Your Agent
- ClawHub Skills: How to Install Without Getting Compromised
- Docker OpenClaw Done Right: Compose Template, Persistent Volumes, and Pairing
Changelog
- 2026-02-26: Initial publication
- 2026-02-26: Added "What This Article Is Not Claiming" section. Softened adversarial tone in safety tools comparison. Clarified what TLA+ formal verification proves vs. doesn't. Strengthened prompt injection rebuttal by acknowledging the category danger is fair criticism.