
Why Your OpenClaw Agent Costs More Than You Think (And How to Fix It)

You set up your OpenClaw agent, ran it for a week, and then checked your API bill. The number is higher than you expected. You are not alone.

This is one of the most common complaints about continuous AI agent sessions — and it is completely fixable. The problem is not OpenClaw itself. The problem is how most people configure sessions when they first get started.

Here is what is happening and how to fix it.

Why Continuous Sessions Get Expensive Fast

Every time your agent responds to a message or runs a scheduled task, it sends a request to the LLM provider. That request includes not just the current message but the full conversation history — every prior message, tool call result, and memory entry in the session context.

A short session might carry 2,000 tokens of context. A session that has been running for five days without pruning can easily carry 80,000 tokens per request. If your agent is checking things every 15 minutes via HEARTBEAT.md, that is 96 requests per day, each carrying an 80,000-token context. The math hurts.
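That arithmetic is worth making concrete. Here is a quick back-of-the-envelope sketch (the token figures are illustrative, matching the numbers above):

```python
# Back-of-the-envelope daily token usage for a polling agent.
MINUTES_PER_DAY = 24 * 60

def daily_tokens(interval_minutes: int, context_tokens: int) -> int:
    """Total tokens sent per day for one polled check."""
    requests_per_day = MINUTES_PER_DAY // interval_minutes
    return requests_per_day * context_tokens

# A fresh session polling every 15 minutes with a 2k-token context:
print(daily_tokens(15, 2_000))   # → 192000 tokens/day
# A five-day-old, unpruned session at the same interval:
print(daily_tokens(15, 80_000))  # → 7680000 tokens/day
```

Same interval, same checks, forty times the spend. Context size, not request count, is usually the lever that matters most.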

The three biggest cost drivers are:

  1. Long-running sessions without context pruning — old conversation history accumulates and gets re-sent every single turn
  2. HEARTBEAT.md polling intervals that are too aggressive — checking every 5 minutes for something that changes hourly wastes money
  3. Broad tool access that generates verbose results — a tool call that returns 10,000 tokens of raw JSON when you only need three fields bloats context fast

Fix 1: Tune Your HEARTBEAT.md Intervals

Open your agent's HEARTBEAT.md file. Most first-time configs look something like this:

## Check: Server Health
Interval: 5 minutes
Action: Ping /health endpoint and report status

Ask yourself: does this actually need to run every 5 minutes? If your deployment pipeline takes 20 minutes to complete and issues take at least 10 minutes to surface, a 15-minute interval works just as well and cuts that task's token usage by 66%.

A more sensible default for most monitoring tasks:

## Check: Server Health
Interval: 15 minutes
Action: Ping /health endpoint. Only report if status is not 200.

## Check: Disk Usage
Interval: 60 minutes
Action: Check disk usage on /var. Alert if above 80%.

## Check: SSL Expiry
Interval: 24 hours
Action: Check SSL certificate expiry on domain. Alert if under 14 days.

The "Only report if status is not 200" instruction is critical. It prevents the agent from writing a verbose success message into session history on every healthy check — which would inflate context for no reason.

Fix 2: Scope Sessions Tightly

A single agent doing everything is the most expensive pattern. One session handles customer support AND deployment monitoring AND daily digest AND email triage. That session accumulates context from all four jobs simultaneously.

The fix is to give each agent a single, scoped job. A monitoring agent watches your servers. A support agent handles user questions. A digest agent runs once per day and exits cleanly.

In practice, this means keeping your AGENTS.md tight:

## Scope
This agent monitors server health, disk usage, and SSL expiry.
It does NOT handle customer support, email, or code reviews.
When asked to do something outside this scope, decline politely and explain your role.

That boundary clause prevents context bleed from off-topic conversations that would otherwise drag unrelated tokens into every subsequent request.

Fix 3: Prune Context Deliberately

OpenClaw's memory system gives you tools to manage what stays in long-term context and what gets dropped. Most agents never use them intentionally.

In your AGENTS.md, add explicit pruning rules:

## Memory Rules
- Daily notes older than 7 days: summarize into one line, then delete the original
- Tool call results: never keep raw output longer than 24 hours
- Resolved alerts: mark as resolved and exclude from active context after 2 hours
- Session history: summarize completed tasks weekly, do not retain step-by-step logs

This keeps your MEMORY.md useful for recall while preventing it from becoming a token dump.

You can also add a daily cleanup task to HEARTBEAT.md:

## Task: Memory Cleanup
Interval: 24 hours (runs at 03:00)
Action: Review memory/YYYY-MM-DD.md files older than 7 days. Summarize each into a single line in MEMORY.md under "## Past Weeks". Delete the daily file after summarizing.

That one task can prevent your agent's effective context from growing indefinitely.
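The file mechanics of that cleanup step can be sketched in plain Python. This is a hedged illustration, not OpenClaw's implementation: the memory/ directory layout and the one-line summary format are assumptions drawn from the task description above, and in practice the agent would generate the summary with the LLM rather than grabbing the first line.

```python
import os
import re
from datetime import date, timedelta

MEMORY_DIR = "memory"          # assumed location of daily note files
SUMMARY_FILE = "MEMORY.md"     # long-term memory file
CUTOFF_DAYS = 7

def prune_daily_notes(today: date) -> None:
    """Summarize daily notes older than CUTOFF_DAYS into MEMORY.md, then delete them."""
    cutoff = today - timedelta(days=CUTOFF_DAYS)
    for name in sorted(os.listdir(MEMORY_DIR)):
        m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})\.md", name)
        if not m:
            continue
        note_date = date(*map(int, m.groups()))
        if note_date >= cutoff:
            continue  # still recent; keep the full daily file
        path = os.path.join(MEMORY_DIR, name)
        with open(path) as f:
            first_line = f.readline().strip()  # stand-in for a real LLM-written summary
        with open(SUMMARY_FILE, "a") as out:
            out.write(f"- {note_date}: {first_line}\n")
        os.remove(path)
```

The key property is irreversibility with a paper trail: each deleted file leaves exactly one line behind, so recall degrades gracefully instead of context growing forever.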

Fix 4: Filter Tool Output Before It Hits the Context

When your agent calls a tool that returns a large payload — a JSON API response, a full webpage scrape, a long log file — the entire result lands in context. On the next request, it all goes back to the LLM.

The most effective pattern is instruction-level filtering in AGENTS.md:

## Tool Use Rules
- When reading API responses, extract only the fields needed for the task. Do not quote full JSON in responses.
- When checking log files, summarize the last 10 lines only. Do not paste full log output.
- When browsing web pages, extract the key points only. Do not quote entire articles.

These instructions tell the agent to process and discard, not process and retain. Your memory stays lean.
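The same discard-early principle applies if you write your own tool wrappers. A minimal sketch (the field names are hypothetical; substitute whatever your task actually needs):

```python
import json

KEEP_FIELDS = ("status", "latency_ms", "version")  # hypothetical fields the task needs

def filter_api_response(raw_json: str, keep=KEEP_FIELDS) -> str:
    """Reduce a large JSON payload to only the fields the agent needs."""
    payload = json.loads(raw_json)
    slim = {k: payload[k] for k in keep if k in payload}
    return json.dumps(slim)

# A bloated payload: three useful fields plus a huge debug trace.
raw = json.dumps({"status": 200, "latency_ms": 41, "version": "2.3.1",
                  "debug": {"trace": ["..."] * 500}})
print(filter_api_response(raw))  # → {"status": 200, "latency_ms": 41, "version": "2.3.1"}
```

Filtering at the wrapper level is stronger than instruction-level filtering because the verbose payload never enters context at all, rather than relying on the agent to summarize it away.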

What a Lean Agent Config Looks Like

Here is a minimal monitoring agent configured for cost efficiency:

SOUL.md (excerpt)

You monitor server health, disk, and SSL expiry for one project.
You run quietly. You only speak when something needs attention.
You never store full tool outputs — you summarize and discard.

HEARTBEAT.md

## Check: API Health
Interval: 15 minutes
Action: GET /health. If status != 200, send Telegram alert. Otherwise, log nothing.

## Check: Disk
Interval: 60 minutes
Action: Check /var disk usage. Alert if > 80%. Otherwise, log nothing.

## Check: SSL
Interval: 24 hours
Action: Check SSL expiry. Alert if < 14 days. Otherwise, log nothing.

## Task: Memory Cleanup
Interval: 24 hours (03:00)
Action: Summarize and prune daily notes older than 7 days.

This agent runs 24/7 but only generates meaningful tokens when something is actually wrong.

How Much Can You Actually Save?

The numbers vary by model and usage pattern, but here is a realistic scenario:

| Config | Requests/day | Avg context tokens | Daily tokens (estimate) |
| --- | --- | --- | --- |
| Untuned (5-min intervals, no pruning) | 288 | 40,000 | ~11.5M |
| Tuned (15-min intervals, pruning, filtering) | 96 | 8,000 | ~768K |

That is roughly a 15x reduction for a single monitoring agent, with zero reduction in actual usefulness.
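The table's figures are easy to reproduce directly:

```python
# Daily totals for the two configurations above.
untuned = 288 * 40_000   # requests/day * avg context tokens
tuned = 96 * 8_000
print(untuned)                 # → 11520000  (~11.5M)
print(tuned)                   # → 768000    (~768K)
print(round(untuned / tuned))  # → 15
```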


Common Mistakes

  • Setting all HEARTBEAT intervals to 5 minutes — most checks don't need that frequency; match the interval to how fast the underlying condition actually changes
  • Letting raw tool output accumulate in memory — API responses and log dumps are expensive to carry; always filter and summarize
  • Running one agent for everything — scoped agents with narrow contexts are cheaper and easier to debug than single generalist agents
  • No pruning rules in AGENTS.md — without explicit instructions, the agent will retain everything indefinitely
  • Verbose success messages from healthy checks — tell your agent to stay silent on success; only write to context when something needs attention

Security Guardrails

  • Never put API credentials, tokens, or passwords directly into HEARTBEAT.md or AGENTS.md — use environment variable references and keep secrets out of your agent workspace files
  • When scoping tool access for monitoring agents, apply the least-privilege principle: a health-check agent does not need file write access or exec permissions beyond what its specific checks require
  • Review your agent's MEMORY.md before running memory cleanup tasks — automated pruning should not delete context that contains unresolved security incidents

Start With One Change Today

If your bill is higher than expected, start with HEARTBEAT.md. Open the file, identify every interval under 15 minutes, and ask whether the underlying signal actually changes that fast. For most checks, it does not. Push them to 15 or 60 minutes.

That one change will cut your token usage significantly without any impact on your agent's actual reliability.

If you have not built your agent yet, build it right from the start.

Build a Cost-Efficient Agent From Day One

OpenAgents.mom generates workspace bundles with sensible HEARTBEAT defaults, scoped AGENTS.md instructions, and memory rules that keep context lean — so you skip the expensive trial-and-error phase.

Generate Your Agent Workspace