
Your OpenClaw Agent Can Burn $300/Day Without These Cost Guards

OpenAgents.mom · 2026-04-08 · 7 min read

Jason Calacanis published his bill last month: $300/day per Claude agent, running at 10–20% capacity. Annualized, that's about $109,500 per agent. And that's a careful operator who knows what he's doing.

Reddit tells a different story. Users reporting 5,000 API calls in a single session before they noticed. Entire monthly credit balances wiped in one afternoon. One developer whose agent entered an infinite tool loop over a weekend and returned to a $1,200 bill on Monday morning.

Unguarded agents don't just cost money. They cost your trust in the whole premise.

The good news: every one of those incidents was preventable with configs that take about 15 minutes to set up. Here's exactly what to add.

Why Agents Get Expensive

Before the configs, it helps to understand the failure modes.

Context window inflation. Every tool call appends its output to the context, and the whole context is re-sent as input on the next call. An agent doing web research can balloon from 8,000 to 800,000 tokens in a single session if nothing stops it. At Claude Sonnet pricing, that one session can cost $15–25.
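
To see why the growth bites, here is a rough sketch of the arithmetic. It assumes each call re-sends the full context as input and that no prompt caching applies. The 4,000-token tool output, the 50-step session, and the $3 per million input tokens price are illustrative assumptions, not measured values.

```python
def cumulative_input_tokens(start_tokens: int, tokens_per_tool_output: int, steps: int) -> int:
    """Total input tokens billed when every step re-sends the whole context."""
    total = 0
    context = start_tokens
    for _ in range(steps):
        total += context                   # the full context is sent on this call
        context += tokens_per_tool_output  # the tool result is appended for the next call
    return total


def input_cost_usd(tokens: int, usd_per_million: float) -> float:
    """Convert a token count to dollars at a given per-million-token price."""
    return tokens / 1_000_000 * usd_per_million


if __name__ == "__main__":
    tokens = cumulative_input_tokens(start_tokens=8_000, tokens_per_tool_output=4_000, steps=50)
    print(tokens, round(input_cost_usd(tokens, usd_per_million=3.0), 2))
```

With those assumed numbers the session bills about 5.3 million input tokens, roughly $15.90, even though the context itself never passes about 204,000 tokens. The bill grows with the square of the step count, which is why a step cap and output truncation pay off so quickly.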

Infinite loops. Agents that can't complete a task sometimes retry it indefinitely. Without a step cap, one stuck task becomes thousands of API calls. Most model APIs don't warn you until it's too late.

Unnecessary model use. Routing every request, including simple file reads and status checks, through an expensive frontier model is the equivalent of using a Formula 1 car to pick up groceries.

No human-in-the-loop (HITL) gates. An agent that can autonomously spawn sub-agents, execute code, or trigger external services can amplify its own cost with every autonomous decision.

The Core Config: max_steps

The single most important cost guard is a hard step limit. In your AGENTS.md, add:

## Execution Limits

- max_steps: 25
- On hitting the limit: stop, summarize progress, and notify the user before continuing.
- Never spawn sub-agents without explicit user approval.

max_steps: 25 gives the agent enough room to complete most real tasks while capping how far a runaway loop can go. Adjust based on your use case: a research agent might need 50, while a simple email responder needs 10.

The notification instruction matters. An agent that silently hits the limit and terminates leaves you guessing. An agent that summarizes and asks is useful.
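
If you drive the agent from your own code, the same cap can also be enforced outside the prompt. The following is a minimal sketch under that assumption. `run_with_step_cap`, `RunResult`, and the `step` callable are hypothetical names for illustration, not part of OpenClaw.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class RunResult:
    finished: bool                 # True if the task completed before the cap
    steps_used: int = 0
    log: List[str] = field(default_factory=list)

    def summary(self) -> str:
        """Short progress report to show the user when the run stops."""
        status = "completed" if self.finished else "stopped at step limit"
        last = self.log[-1] if self.log else "none"
        return f"{status} after {self.steps_used} steps; last action: {last}"


def run_with_step_cap(step: Callable[[int], Optional[str]], max_steps: int = 25) -> RunResult:
    """Call `step` until it returns None (task done) or max_steps is reached.

    `step` stands in for one model/tool round trip and returns a short
    description of what it did, or None when the task is finished.
    """
    result = RunResult(finished=False)
    for i in range(max_steps):
        action = step(i)
        if action is None:
            result.finished = True
            break
        result.steps_used += 1
        result.log.append(action)
    return result


if __name__ == "__main__":
    # A stuck task that retries forever is cut off at the cap.
    stuck = run_with_step_cap(lambda i: "retry fetch", max_steps=25)
    print(stuck.summary())
```

Returning a summary instead of raising keeps the "stop, summarize, notify" behavior: the caller always has something to show the user before deciding whether to continue.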

HITL Gates: Where to Put Them

Human-in-the-loop checkpoints are cost guards as much as safety guards. Every autonomous action the agent takes costs tokens. Every HITL gate costs you a moment of attention.

If you're new to HITL configuration, the OpenClaw security checklist covers the full approval gate setup. The short version: put gates before anything expensive or irreversible.

## Approval Required

The agent MUST ask before:
- Spawning a sub-agent or parallel task
- Making more than 3 consecutive tool calls without showing output
- Executing any shell command
- Sending any message to an external channel
- Performing a web search loop (more than 2 searches per task)

This doesn't make the agent slow. For routine tasks where these conditions don't apply, it runs without interruption. The gate only fires when the agent is about to do something expensive.
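
The gate logic itself is small. Here is a rough sketch of how those rules could be checked in a wrapper you control. The action labels (`spawn_subagent`, `shell_exec`, `external_message`, `web_search`) and both function names are assumptions made for the example, not OpenClaw identifiers.

```python
from typing import Callable

# Illustrative labels that mirror the "Approval Required" list above.
ALWAYS_GATED = {"spawn_subagent", "shell_exec", "external_message"}
MAX_SILENT_TOOL_CALLS = 3
MAX_SEARCHES_PER_TASK = 2


def needs_approval(action: str, consecutive_tool_calls: int, searches_so_far: int) -> bool:
    """Return True when the next action should pause for a human."""
    if action in ALWAYS_GATED:
        return True
    # A third search in one task exceeds the two-search allowance.
    if action == "web_search" and searches_so_far >= MAX_SEARCHES_PER_TASK:
        return True
    # A fourth tool call in a row without showing output exceeds the limit.
    return consecutive_tool_calls >= MAX_SILENT_TOOL_CALLS


def gated(action: str, consecutive_tool_calls: int, searches_so_far: int,
          ask: Callable[[str], bool]) -> bool:
    """Return True if the action may proceed; `ask` is consulted only when gated."""
    if not needs_approval(action, consecutive_tool_calls, searches_so_far):
        return True
    return ask(f"Agent wants to run '{action}'. Approve?")


if __name__ == "__main__":
    deny_all = lambda question: False
    print(gated("read_file", 0, 0, ask=deny_all))   # routine action, never asks
    print(gated("shell_exec", 0, 0, ask=deny_all))  # gated, and the human said no
```

Routine actions return immediately without calling `ask`, which is what keeps the gate from slowing down ordinary work.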

Model Routing: Stop Paying Frontier Prices for Cheap Tasks

Not everything your agent does needs Claude Sonnet or GPT-4. A well-configured AGENTS.md routes different task types to the right model:

## Model Routing

- Simple classification, yes/no decisions, short summaries: use claude-3-5-haiku or gpt-4o-mini
- Drafting, reasoning, research synthesis: use claude-sonnet-4-5
- Default to the cheaper model and escalate only when the task requires it

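The cost difference is significant, though it depends on which Haiku you pick. At Anthropic's list prices, the Haiku 3.5 and 4.5 generation is roughly 3–4x cheaper per token than Claude Sonnet; only the older Claude 3 Haiku is more than 10x cheaper. At a 4x gap, routing 60% of your agent's token volume to the cheaper model cuts the bill by about 45%.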
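
The savings from routing are easy to estimate yourself. This sketch treats the price gap as an input rather than a fixed fact, so you can plug in whatever ratio your provider's current price sheet gives. The function name and the two example ratios are illustrative assumptions.

```python
def blended_cost_fraction(cheap_share: float, price_ratio: float) -> float:
    """Fraction of the all-expensive-model bill that remains after routing.

    cheap_share: share of token volume sent to the cheaper model (0 to 1)
    price_ratio: expensive price per token divided by cheap price per token
    """
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    return (1.0 - cheap_share) + cheap_share / price_ratio


if __name__ == "__main__":
    for ratio in (4, 15):
        remaining = blended_cost_fraction(cheap_share=0.6, price_ratio=ratio)
        print(f"ratio {ratio}x: bill falls by {1 - remaining:.0%}")
```

Note the ceiling built into the formula: routing 60% of volume can never save more than 60%, however cheap the small model is. Past a certain price gap, moving more work to the cheap model matters more than finding an even cheaper one.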

If you're running locally with Ollama after Anthropic's April cutoff, Gemma 4 handles classification and simple tasks at zero marginal cost. We covered the full cost picture in OpenClaw agent token cost — the numbers there still hold for the routing logic. Reserve the cloud model for tasks that actually need it.

Context Pruning: Stop Feeding the Window

Long-running agents accumulate context. Without pruning, a week-old memory file gets read into every session, padding the context window with information the agent doesn't need right now.

In your AGENTS.md:

## Memory Loading

- Load today's memory file only: memory/YYYY-MM-DD.md
- Load MEMORY.md only for tasks that require long-term context
- Do NOT load the full memory/ directory unless explicitly asked
- Truncate tool outputs longer than 2,000 characters before appending to context

The truncation rule is particularly important for agents that do web research or read large files. A tool that returns 50,000 characters of HTML when you needed 200 characters of content is a cost multiplier hiding in plain sight.
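
If your tools pass through code you control, both rules can be applied before anything reaches the model. A minimal sketch, assuming the memory/YYYY-MM-DD.md layout from the config above; the function names are hypothetical.

```python
from datetime import date
from pathlib import Path

MAX_TOOL_OUTPUT_CHARS = 2_000


def todays_memory_path(memory_dir: Path, today: date) -> Path:
    """Path of the single daily file to load, e.g. memory/2026-04-08.md."""
    return memory_dir / f"{today.isoformat()}.md"


def truncate_tool_output(text: str, limit: int = MAX_TOOL_OUTPUT_CHARS) -> str:
    """Keep at most `limit` characters of output and note how much was dropped."""
    if len(text) <= limit:
        return text
    dropped = len(text) - limit
    # The marker adds a few characters past the limit so the agent knows
    # the output is incomplete and can ask for a narrower read.
    return text[:limit] + f"\n[truncated {dropped} characters]"


if __name__ == "__main__":
    print(todays_memory_path(Path("memory"), date(2026, 4, 8)))
    html = "x" * 50_000
    print(len(truncate_tool_output(html)))
```

Leaving a visible truncation marker is a design choice: silently cut output can make the agent retry the same tool call, which costs more than the marker does.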

Tool Allowlists: Limit What Can Happen

Every tool your agent has access to is a potential cost vector. An agent with unrestricted browser access, shell exec, and API calls can compound its own spend with every step.

OpenClaw's filesystem sandbox guide covers this approach in depth. The short version: tighten the allowlist to what the agent actually needs.

## Tool Permissions

Allowed tools:
- read (file read, no execution)
- web_search (max 2 calls per task)
- message (send only to configured channels)

Explicitly NOT allowed:
- exec (shell execution)
- browser (unrestricted web automation)
- image_generate (call only on explicit user request)
- sessions_spawn (sub-agent spawning requires approval)

An agent that can only read files and search the web has a bounded cost ceiling. Add tools deliberately, not by default.
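
An allowlist written in AGENTS.md is an instruction. If you also dispatch tool calls from your own code, the same policy can be checked there. This is a sketch under that assumption; `ToolGate` and `ToolNotAllowed` are hypothetical names, and the tool functions are stand-ins.

```python
from collections import Counter
from typing import Callable, Dict, Optional


class ToolNotAllowed(Exception):
    """Raised when a tool is off the allowlist or over its per-task cap."""


class ToolGate:
    """Dispatch a tool call only if the tool is allowlisted and under its cap."""

    def __init__(self, tools: Dict[str, Callable[..., str]],
                 max_calls: Dict[str, Optional[int]]):
        # max_calls maps each allowed tool to its per-task cap (None = no cap).
        # A tool missing from max_calls is denied even if it is registered.
        self._tools = tools
        self._max_calls = max_calls
        self._calls: Counter = Counter()

    def call(self, name: str, *args, **kwargs) -> str:
        if name not in self._max_calls or name not in self._tools:
            raise ToolNotAllowed(f"{name} is not on the allowlist")
        cap = self._max_calls[name]
        if cap is not None and self._calls[name] >= cap:
            raise ToolNotAllowed(f"{name} exceeded its cap of {cap} calls per task")
        self._calls[name] += 1
        return self._tools[name](*args, **kwargs)


if __name__ == "__main__":
    gate = ToolGate(
        tools={
            "read": lambda path: f"contents of {path}",
            "web_search": lambda query: f"results for {query}",
            "exec": lambda cmd: "ran",  # registered, but not allowlisted below
        },
        max_calls={"read": None, "web_search": 2},
    )
    print(gate.call("read", "notes.md"))
```

Denying by default means a newly registered tool costs nothing until someone deliberately adds it to `max_calls`, which matches the "add tools deliberately" rule.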

Common Mistakes

No step cap in AGENTS.md. The agent has no stopping condition for stuck tasks and loops indefinitely. Add max_steps: 25 as a baseline.
Reading full memory/ directory on every session. Loading 30 daily memory files into context costs as much as 3-4 extra API calls per session. Load only what's needed.
All tasks routed to the flagship model. Simple classification at Sonnet prices is expensive. Route cheap work to cheap models.
No HITL gate before sub-agent spawning. Each sub-agent multiplies costs independently. Gate spawning behind explicit approval.
Unrestricted image generation. One batch of image gen calls can cost more than an entire day of text work. Restrict to explicit user requests only.

Security Guardrails

Never store API keys in SOUL.md, AGENTS.md, or any workspace file. Use environment variables or your server's secret management.
HITL gates aren't just cost controls — they're your audit trail. Review what your agent asked permission for before approving.
If an agent hits its step limit, investigate why before raising it. A limit that keeps triggering is a symptom.
Scope the tool allowlist to the minimum needed. An agent that cannot run shell commands cannot accidentally delete files, regardless of what a prompt injection tells it.

Setting a Hard Spend Alert

Config-layer controls prevent most runaway spend. But they're not a substitute for billing alerts.

Set a daily spend alert at your provider for 20% of your expected monthly budget. If Anthropic or OpenRouter bills you more than that in a single day, something is wrong and you want to know before it compounds overnight.

For OpenClaw deployments, track actual session costs in your daily manifest file. Even rough logging (model used, approximate tokens per session) gives you enough signal to catch drift before it becomes a crisis. "Beyond the demo: OpenClaw agent reliability" covers the monitoring patterns that complement these cost controls.
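
Rough logging can be as simple as one line per session plus a daily total check. The sketch below assumes a JSON-lines log, which is an illustrative choice and not a documented OpenClaw manifest format. `SessionCost` and both function names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, List


@dataclass
class SessionCost:
    day: str            # ISO date, e.g. "2026-04-08"
    model: str
    approx_tokens: int
    usd: float


def to_manifest_line(entry: SessionCost) -> str:
    """One JSON line per session, suitable for appending to a daily log file."""
    return json.dumps(asdict(entry), sort_keys=True)


def days_over_threshold(entries: Iterable[SessionCost], monthly_budget_usd: float,
                        daily_fraction: float = 0.20) -> List[str]:
    """Return the days whose total spend exceeds daily_fraction of the monthly budget."""
    threshold = monthly_budget_usd * daily_fraction
    totals: Dict[str, float] = {}
    for entry in entries:
        totals[entry.day] = totals.get(entry.day, 0.0) + entry.usd
    return sorted(day for day, total in totals.items() if total > threshold)


if __name__ == "__main__":
    sessions = [
        SessionCost("2026-04-08", "sonnet", 400_000, 4.0),
        SessionCost("2026-04-08", "haiku", 900_000, 3.0),
        SessionCost("2026-04-09", "sonnet", 5_000_000, 45.0),
    ]
    print(to_manifest_line(sessions[0]))
    print(days_over_threshold(sessions, monthly_budget_usd=200.0))
```

A local check like this only sees what you log, so it complements the provider-side alert and does not replace it.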

What "Guarded" Looks Like in Practice

An agent with these controls in place has a predictable cost floor and ceiling.

A well-configured research agent running 5 sessions per day on Claude Sonnet with Haiku routing for cheap steps and a 25-step cap should cost $3–8/day for real work. The same agent without controls, hitting a bad task on a Friday afternoon, can run $80–300 before you notice.

The configs above take 15 minutes to add to an existing AGENTS.md. The ROI is immediate.

Get Cost-Guarded Configs Built In From Day One

Every workspace bundle from OpenAgents.mom includes max_steps limits, HITL gate definitions, model routing guidance, and tool allowlists pre-configured for your agent's specific use case. You answer the interview, we wire the guardrails.

Build Your Cost-Guarded Agent
