Security and Sandboxing
An autonomous AI agent that can execute code, call APIs, and manage files introduces real security risks. OpenClaw provides multiple layers of protection—Docker sandboxing, execution approvals, access controls, and security auditing. In this lesson you will learn what can go wrong and how to harden your setup.
Core Risks
Before configuring defenses, understand what you are defending against:
| Risk | Description |
|---|---|
| Prompt injection | A malicious message tricks the agent into executing unintended actions |
| Skill supply chain | A community skill contains hidden malware or data exfiltration code |
| Tool misuse | The model calls a dangerous tool (e.g., `rm -rf /`) by mistake or manipulation |
| Unauthorized access | Someone messages your agent and gets it to perform actions on your behalf |
| Data leakage | Sensitive information from memory or files is exposed through a channel |
Cisco's research found that 26% of 31,000 analyzed agent skills contained vulnerabilities including command injection, data exfiltration, and prompt injection attacks. Security is not optional.
Docker Sandbox
The most important security measure is running tools inside a Docker container. This creates an isolated environment where a misbehaving skill cannot damage your host system.
OpenClaw supports two sandboxing approaches:
Full Containerization
Run the entire Gateway inside Docker:
```bash
docker run -d \
  --name openclaw \
  -v ~/.openclaw:/workspace \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  openclaw/openclaw:latest
```
This gives you a complete container boundary with an isolated filesystem, restricted network access, and CPU/memory limits.
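If you want the CPU/memory limits to be explicit rather than implied, standard Docker flags can be added to the same command. The limit values below are illustrative examples, not OpenClaw defaults:

```bash
# Same container, with explicit resource caps (values are examples):
#   --cpus 2          limit the container to two CPU cores
#   --memory 2g       cap memory at 2 GB
#   --pids-limit 256  cap process count to blunt runaway spawning
docker run -d \
  --name openclaw \
  --cpus 2 \
  --memory 2g \
  --pids-limit 256 \
  -v ~/.openclaw:/workspace \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  openclaw/openclaw:latest
```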
Per-Tool Sandbox
Alternatively, run only tool executions in Docker while keeping the Gateway on your host:
```json
{
  "sandbox": {
    "mode": "all",
    "scope": "agent",
    "workspaceAccess": "ro"
  }
}
```
The `workspaceAccess` setting controls what the sandboxed tool can see:

| Value | Meaning |
|---|---|
| `"none"` | No access to workspace files |
| `"ro"` | Read-only access (recommended) |
| `"rw"` | Read-write access (use with caution) |

For most setups, `"ro"` is the right balance—skills can read your files to answer questions but cannot modify or delete them.
Execution Approvals
Execution approvals require your confirmation before the agent runs certain commands. This is your last line of defense before a tool actually executes.
Configure approvals in Settings or via `exec-approvals.json`:

| Mode | Behavior |
|---|---|
| `security` | Blocks high-risk commands (file deletion, network calls) unless approved |
| `ask` | Prompts you for approval on every tool execution |
| `allowlist` | Only pre-approved commands can execute without prompting |

For beginners, `ask` mode is safest—you see exactly what the agent wants to do before it happens. As you build trust, switch to `security` or `allowlist` for a smoother experience.
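As a sketch of what an allowlist setup might look like, here is a hypothetical `exec-approvals.json`. The exact schema is not shown in this lesson, so treat the field names (`mode`, `allowlist`) as assumptions to verify against the reference docs:

```json
{
  "mode": "allowlist",
  "allowlist": [
    "git status",
    "git log",
    "npm test"
  ]
}
```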
When an approval is needed, the agent sends you a message like:
```text
I'd like to run: git push origin main
Allow? (yes/no)
```
You reply "yes" or "no" directly in your messaging platform.
Access Control
OpenClaw provides four DM access modes that control who can message your agent:
| Mode | How It Works |
|---|---|
| Pairing (default) | Time-limited pairing codes; most secure for new setups |
| Allowlist | Only specified user IDs can interact |
| Open | Anyone can message (requires explicit opt-in) |
| Disabled | DMs are turned off completely |
For group channels (Discord servers, Slack workspaces), you can set per-group policies and multi-user DM isolation so conversations stay private.
Recommendation: Start with Pairing mode. Add trusted users to the Allowlist as needed. Never use Open mode unless you specifically want a public-facing agent.
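If you manage access control in configuration rather than Settings, a policy block might look like the sketch below. The key names (`dmPolicy`, `mode`, `allowlist`) and the user ID format are illustrative assumptions, not documented schema:

```json
{
  "dmPolicy": {
    "mode": "pairing",
    "allowlist": ["discord:123456789012345678"]
  }
}
```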
Security Audit CLI
OpenClaw includes a built-in security audit tool:
```bash
# Standard security check
openclaw security audit

# Deep audit with live Gateway probing
openclaw security audit --deep

# Automatically apply safe guardrails
openclaw security audit --fix
```
The audit checks for:
- Insecure sandbox configuration
- Missing execution approvals
- Overly permissive access controls
- Known vulnerable skills
- Exposed API keys in configuration files
Run `openclaw security audit` after every configuration change and periodically as part of your maintenance routine.
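One way to make the periodic part automatic is a cron entry. The schedule and log path below are just examples:

```bash
# Weekly deep audit, Mondays at 09:00; append results to a log file.
# Install with: crontab -e
0 9 * * 1 openclaw security audit --deep >> ~/.openclaw/audit.log 2>&1
```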
Cisco AI Skill Scanner
Cisco released a dedicated scanner for OpenClaw skills that detects:
- Command injection — skills that can execute arbitrary system commands
- Data exfiltration — skills that send your data to external servers
- Prompt injection — skills with instructions designed to hijack agent behavior
You can scan skills before installing them:
```bash
# Scan a skill directory
cisco-skill-scanner ./skills/suspicious-skill/

# Scan all installed skills
cisco-skill-scanner ~/.openclaw/skills/
```
ClawHub also integrates VirusTotal scanning automatically. Skills are continuously rechecked and tagged as approved, warned, or blocked.
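To gate installs on a clean scan, you can wrap the scanner in a small shell loop. This sketch assumes the scanner exits non-zero when it finds problems; verify that against its documentation before relying on it:

```bash
#!/usr/bin/env bash
# Scan each pending skill; stop before installing anything that fails.
# Assumes cisco-skill-scanner returns a non-zero exit code on findings.
set -euo pipefail
shopt -s nullglob  # an empty pending-skills/ dir just skips the loop

for skill in ./pending-skills/*/; do
  echo "Scanning $skill"
  if ! cisco-skill-scanner "$skill"; then
    echo "BLOCKED: $skill failed the scan" >&2
    exit 1
  fi
done
echo "All skills passed; safe to move them into ~/.openclaw/skills/"
```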
Model Choice Matters
Your choice of AI model affects security. The OpenClaw documentation recommends:
Prefer Anthropic Opus 4.6 (or the latest Opus) because it is strong at recognizing prompt injections. Smaller or cheaper models are more susceptible to tool misuse and instruction hijacking.
If you use a budget model for cost savings, pair it with stricter sandbox settings and execution approvals.
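As a sketch of that pairing, here is a hypothetical configuration that names a cheaper model but compensates with a no-access sandbox and prompt-on-everything approvals. The `model` and `execApprovals` keys and the model identifier are assumptions; the `sandbox` values come from the table earlier in this lesson:

```json
{
  "model": "anthropic/claude-haiku",
  "sandbox": {
    "mode": "all",
    "scope": "agent",
    "workspaceAccess": "none"
  },
  "execApprovals": {
    "mode": "ask"
  }
}
```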
Hardening Checklist
Here is a practical checklist to secure your OpenClaw installation:
- Enable Docker sandbox with `workspaceAccess: "ro"` at minimum
- Set execution approvals to `ask` or `security` mode
- Use Pairing or Allowlist access control
- Run `openclaw security audit --deep` after setup
- Only install skills from ClawHub with an "approved" verdict
- Scan third-party skills with Cisco's scanner before installing
- Use a strong model (Opus-class) for primary agent tasks
- Review JSONL transcripts periodically for unexpected tool calls
- Back up your workspace directory regularly
- Keep OpenClaw updated with `npm update -g openclaw@latest` (a script bundling the routine items follows this list)
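To keep the routine items from slipping, you can bundle the update, audit, scan, and backup steps into one maintenance script. The paths and the scanner's exit-code behavior are assumptions, as noted earlier:

```bash
#!/usr/bin/env bash
# Periodic maintenance: update OpenClaw, re-audit, re-scan skills, back up.
set -euo pipefail

npm update -g openclaw@latest
openclaw security audit --deep
cisco-skill-scanner ~/.openclaw/skills/
mkdir -p ~/backups
tar -czf ~/backups/openclaw-$(date +%F).tar.gz -C "$HOME" .openclaw
```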
Key Takeaway
Security is not optional when running an autonomous AI agent. OpenClaw provides Docker sandboxing, execution approvals, access controls, and audit tools. The biggest risks come from malicious skills and prompt injection attacks. Use the hardening checklist above, start with restrictive settings, and loosen them only as you build trust. In the next lesson, you will put everything together by building your first practical workflows.

