Sandboxed Code Execution for AI Agents

Why Code Execution Matters

An AI agent that can only chat is not really an agent. The power of OpenClaw — and what separates it from ChatGPT or Claude — is that it can actually do things. Analyze a CSV with Python. Scrape data from a website with Playwright. Run a Node.js script to process an API response. Generate charts from raw data.

But running arbitrary code is risky. Without proper isolation, a code execution bug or a malicious skill could access your filesystem, leak data, or compromise your system. This is not a theoretical risk — it is why OpenClaw's sandbox feature exists.

How the Sandbox Works

KiwiClaw uses Podman containers for sandboxed code execution. Podman is a daemonless container engine that can run containers without root privileges, which makes it a good fit for cloud VMs where running a Docker daemon is not an option.

When your agent needs to run code, it spins up an isolated Podman container. The container has:

No access to the host filesystem — the container sees only its own temporary filesystem.
No network access to other tenants — the container is isolated from the rest of the Fly.io machine pool.
Resource limits — CPU and memory caps prevent runaway processes from affecting your agent's performance.
Automatic cleanup — containers are destroyed after execution. No persistent state leaks between sessions.
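
The properties above map fairly directly onto standard podman run flags. The following is a minimal sketch of that mapping, not KiwiClaw's actual configuration: the image name, the limit values, and the SandboxLimits and build_sandbox_command names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxLimits:
    """Hypothetical resource caps; the real values are not stated in this article."""
    memory: str = "512m"
    cpus: float = 1.0
    pids: int = 128
    tmpfs_size: str = "64m"


def build_sandbox_command(image, command, limits=SandboxLimits(), network="none"):
    """Return an argv list for a disposable, locked-down Podman container.

    No host directory is mounted, so the container sees only its own
    temporary filesystem.
    """
    return [
        "podman", "run",
        "--rm",                                        # destroy the container after it exits
        "--read-only",                                 # root filesystem is immutable
        f"--tmpfs=/tmp:rw,size={limits.tmpfs_size}",   # the only writable scratch space
        "--cap-drop=ALL",                              # no Linux capabilities
        "--security-opt=no-new-privileges",            # block privilege escalation
        f"--memory={limits.memory}",                   # memory cap
        f"--cpus={limits.cpus}",                       # CPU cap
        f"--pids-limit={limits.pids}",                 # guard against fork bombs
        # "none" disables networking entirely. Tasks such as scraping would need
        # a less strict mode, which this sketch does not model.
        f"--network={network}",
        image,
        *command,
    ]


if __name__ == "__main__":
    argv = build_sandbox_command(
        "docker.io/library/python:3.12-slim",   # assumed image, for illustration only
        ["python", "-c", "print(2 + 2)"],
    )
    print(" ".join(argv))
```

Building the command as a list rather than a shell string avoids quoting problems when the code to run contains spaces or quotes.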

The agent writes code, runs it in the sandbox, reads the output, and continues the conversation. From the user's perspective, it looks like the agent just "ran some code." Behind the scenes, that code executed in a disposable, isolated environment.
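
The write, run, read cycle described above can be sketched as a small helper that executes a snippet and hands back what the agent would read. This is an illustration under assumptions, not OpenClaw's implementation: run_snippet and ExecutionResult are hypothetical names, and the usage example runs the local Python interpreter as a stand-in for a container command.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExecutionResult:
    stdout: str
    stderr: str
    exit_code: Optional[int]   # None when the run hit the timeout
    timed_out: bool


def run_snippet(runner_argv, code, timeout_s=30.0):
    """Run `code` by appending it to `runner_argv` and capture the output.

    In a real sandbox, `runner_argv` would be a container command ending in
    something like ["python", "-c"]; that part is an assumption here.
    """
    try:
        proc = subprocess.run(
            [*runner_argv, code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The timeout stops the child process started here. With a container
        # runner, the container itself may also need to be stopped explicitly.
        return ExecutionResult(stdout="", stderr="", exit_code=None, timed_out=True)
    return ExecutionResult(proc.stdout, proc.stderr, proc.returncode, False)


if __name__ == "__main__":
    result = run_snippet([sys.executable, "-c"], "print(2 + 2)")
    print(result.stdout.strip())   # prints 4
```

Returning stderr and the exit code alongside stdout matters for the agent loop: a non-zero exit code or a traceback is what lets the agent notice a failure and try a fix.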

What the Agent Can Do in the Sandbox

Data analysis — Upload a CSV, Excel file, or JSON. The agent writes Python scripts with pandas, numpy, or matplotlib to analyze, visualize, and summarize your data.
Web scraping — The agent uses Playwright to browse websites, extract data, take screenshots, and automate multi-step web interactions.
Script execution — Run Python, Node.js, or shell scripts. The agent can install packages (pip, npm), run multi-file projects, and process complex workflows.
File processing — Parse PDFs, convert file formats, extract text from images (OCR), and generate formatted documents.
Testing and debugging — Paste code and ask the agent to test it. It runs the code, identifies bugs, fixes them, and verifies the fix — all in the sandbox.

KiwiClaw vs Self-Hosted Sandbox

Self-hosting OpenClaw with sandbox enabled requires significant setup. You need to install Docker or Podman on your server, configure security policies, set up networking isolation, manage container images, and troubleshoot compatibility issues.

On KiwiClaw's infrastructure (Fly.io Firecracker VMs), there is no Docker daemon to talk to, so Docker-based sandboxing is not an option. We use Podman, which needs no daemon and works natively in this environment. All sandbox configuration (container setup, resource limits, cleanup) is handled automatically, so you get code execution out of the box.

Security Model

The sandbox operates on a principle of least privilege. Code runs in unprivileged containers with no capabilities beyond basic execution. Filesystem access is limited to a temporary volume that is destroyed after each session. Network access is restricted to prevent lateral movement. All code execution is logged in audit trails (Enterprise).
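
To make the audit trail idea concrete, here is one possible shape for a per-execution log entry. The article does not describe KiwiClaw's audit format, so every field name below, and the audit_record helper itself, is hypothetical. The sketch stores a SHA-256 hash of the code rather than the code itself, which is a design choice of this example and not a documented behavior.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(tenant_id, code, exit_code, now=None):
    """Return one JSON line describing a single sandbox execution.

    All field names are illustrative assumptions, not KiwiClaw's schema.
    """
    timestamp = now if now is not None else datetime.now(timezone.utc)
    record = {
        "timestamp": timestamp.isoformat(),
        "tenant_id": tenant_id,
        # Hashing lets an auditor match a log entry to a known script
        # without the log itself holding the script's contents.
        "code_sha256": hashlib.sha256(code.encode("utf-8")).hexdigest(),
        "exit_code": exit_code,
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(audit_record("tenant-a", "print(2 + 2)", 0))
```

One JSON object per line keeps the log append-only and easy to filter with standard tools.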

This means even if a skill or user prompt causes the agent to execute something unexpected, the blast radius is contained to a disposable container that has no access to your data, your agent's configuration, or any other tenant.

FAQ

What is sandboxed code execution?

Sandboxed code execution means your AI agent runs code inside an isolated Podman container that cannot access your host system or other tenants, and whose network access is restricted. If the code does something unexpected, the damage is contained to the disposable sandbox.

What languages can the agent run in the sandbox?

The sandbox supports Python, Node.js, shell scripts, and browser automation via Playwright. The agent can install packages, run multi-file projects, process data, generate charts, and automate web interactions — all within the sandboxed environment.

How is KiwiClaw's sandbox different from self-hosted OpenClaw?

Self-hosting OpenClaw requires setting up Docker or Podman yourself, configuring security policies, and maintaining the sandbox infrastructure. On Fly.io (KiwiClaw's infrastructure), Docker is not available — we use Podman, which runs without a daemon in Firecracker VMs. KiwiClaw handles all sandbox configuration automatically.