Human-in-the-Loop AI Agents: When to Automate, When to Ask First

You set up your first AI agent. It looks great in testing. You give it access to your email, your calendar, a few shell commands. You hit deploy.

Three days later it's sent a reply you'd never have written, filed a ticket in the wrong project, and pinged your client at 2 AM.

This isn't a hypothetical. It's what happens when you skip the human-in-the-loop design step. The good news: there's a simple spectrum you can apply before you hand any task to an agent.

The Automation Spectrum

Not every task should be handled the same way. Think of it as five levels:

Level 1: Fully automated, no notification
The agent acts and says nothing. Good for: file cleanup, log archiving, routine health checks with no user impact.

Level 2: Automated with audit trail
The agent acts and logs what it did. You review the log periodically. Good for: scheduled reports, data syncs, routine backup tasks.

Level 3: Automated with exception alerts
The agent acts normally, but sends you an alert when something unusual happens. Good for: price monitors, uptime checks, email triage with low-risk actions.

Level 4: Human approval for significant actions
The agent drafts the action and waits for your go/no-go before executing. Good for: sending emails, posting to social media, modifying production configs.

Level 5: Human-driven, agent-assisted
You make every decision; the agent prepares options, drafts, or summaries for you to choose from. Good for: contract reviews, client-facing responses, financial decisions.

The mistake most people make is defaulting to Level 1 for everything because it's easier to set up. Map your tasks to the right level before writing a single line of config.
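One way to make that mapping explicit is to write it down as data before touching any agent config. The sketch below is a minimal illustration in Python, not part of OpenClaw; the task names are hypothetical and taken from the "Good for" examples above, and the fallback to Level 4 for unmapped tasks is an assumption chosen to avoid the default-to-Level-1 mistake.

```python
from enum import IntEnum


class Level(IntEnum):
    """The five automation levels described above."""
    SILENT = 1        # act, no notification
    AUDITED = 2       # act, write to an audit log
    ALERTING = 3      # act, alert on exceptions
    APPROVAL = 4      # draft, wait for go/no-go
    HUMAN_DRIVEN = 5  # human decides, agent prepares options


# Hypothetical task names, taken from the "Good for" examples above.
TASK_LEVELS = {
    "log_archiving": Level.SILENT,
    "data_sync": Level.AUDITED,
    "uptime_check": Level.ALERTING,
    "send_email": Level.APPROVAL,
    "contract_review": Level.HUMAN_DRIVEN,
}


def level_for(task: str) -> Level:
    """Return the mapped level; unmapped tasks fall back to approval."""
    return TASK_LEVELS.get(task, Level.APPROVAL)


def needs_human(task: str) -> bool:
    """True when a person must sign off before anything is executed."""
    return level_for(task) >= Level.APPROVAL
```

With this in place, a task nobody classified is treated as needing approval rather than running silently.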

How to Apply This in OpenClaw

OpenClaw gives you practical tools to implement every level of this spectrum.

HEARTBEAT.md for Levels 1–3

HEARTBEAT.md controls what your agent does on a schedule. Here's what a well-structured entry looks like:

## Task: Hourly disk usage check
Schedule: every 60 minutes
Action: Check disk usage on /var/data. If usage > 85%, send me a Telegram alert.
If usage > 95%, send CRITICAL alert and stop writing new files.
Silent if below 85%.

This is Level 3 automation: routine, silent, with exception alerts. No human needed unless something breaks.
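To see the decision the agent is being asked to make, here is a rough Python equivalent of the threshold logic in that entry. It is only a sketch for reasoning about the rule, not how OpenClaw evaluates HEARTBEAT.md; the function names are made up for illustration. The entry does not say what happens at exactly 85%, so the sketch assumes that value stays silent.

```python
import shutil


def classify_usage(percent: float) -> str:
    """Map a disk usage percentage to the three outcomes in the entry above."""
    if percent > 95:
        return "critical"  # send CRITICAL alert and stop writing new files
    if percent > 85:
        return "alert"     # send a normal alert
    return "silent"        # at or below the threshold: say nothing


def disk_percent_used(path: str) -> float:
    """Return used space on the filesystem holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100
```

A scheduled check would then call `classify_usage(disk_percent_used("/var/data"))` and stay quiet unless the result is something other than `"silent"`.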

Compare it to a Level 1 entry:

## Task: Archive processed log files
Schedule: daily at 03:00
Action: Move *.processed files from /var/logs/queue/ to /var/logs/archive/.
Log moved file count to memory. No notification needed.

The key difference is whether the action has user-visible consequences. Moving log files at 3 AM doesn't. Sending a client email does.

AGENTS.md for Levels 4–5: Ask Before You Act

For any action that's hard to reverse or user-visible, describe the ask-first pattern in your AGENTS.md:

## Outbound Email Rule
Before sending any email that is not a pre-approved template response:
1. Draft the email and show it to me
2. Wait for explicit approval ("send it" or "looks good")
3. Only send after confirmation
4. If no response within 30 minutes, do NOT send. Notify me that the draft is waiting.

This is simple to write and easy to follow. The agent knows: draft, show, wait, confirm. It's not "smarter" to skip this step.
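The draft, show, wait, confirm sequence can also be sketched as plain code, which makes the timeout branch easier to reason about. This is an illustrative Python sketch under stated assumptions, not an OpenClaw API: `poll_reply`, `send`, and `notify` are hypothetical callbacks you would supply, and treating any reply other than the two approval phrases as a rejection is a design choice of the sketch.

```python
import time
from typing import Callable, Optional

APPROVE_PHRASES = {"send it", "looks good"}  # phrases from the rule above


def await_approval(
    poll_reply: Callable[[], Optional[str]],
    timeout_s: float = 30 * 60,
    poll_interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return "approved", "rejected", or "timed_out".

    poll_reply is a hypothetical callback returning the human's latest
    reply, or None if there is no reply yet.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        reply = poll_reply()
        if reply is not None:
            normalized = reply.strip().lower()
            return "approved" if normalized in APPROVE_PHRASES else "rejected"
        sleep(poll_interval_s)
    return "timed_out"


def handle_draft(draft: str, poll_reply, send, notify, **kwargs) -> str:
    """Send only on explicit approval; on timeout, notify instead of sending."""
    outcome = await_approval(poll_reply, **kwargs)
    if outcome == "approved":
        send(draft)
    elif outcome == "timed_out":
        notify("Draft is waiting for approval; nothing was sent.")
    return outcome
```

The important property is that silence never turns into a send: the only path to `send(draft)` is an explicit approval phrase received before the deadline.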

For calendar invites, Slack messages, or anything touching a third party, add equivalent rules.

SOUL.md for Hard Limits

Some things your agent should never do autonomously, regardless of context. Put those in SOUL.md:

## Boundaries
- Never send a message to a client or customer without explicit human approval
- Never delete files in /home/prod/ without showing me the list first
- Never make purchases, create invoices, or modify billing data
- If asked to do any of the above autonomously, decline and ask for confirmation

Boundaries in SOUL.md are the clearest signal to the model that these are non-negotiable rules. They survive context reloads, session restarts, and prompt variations.
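If you also wrap the agent's tools in your own code, the same boundaries can be checked outside the model as a second layer. The following Python sketch is one possible reading of the list above, not something OpenClaw provides; the action names are hypothetical, and it assumes billing-related actions are always denied while client messages and deletes under /home/prod/ are allowed only after approval.

```python
from pathlib import PurePosixPath

# Hypothetical action names; the rules mirror the Boundaries list above.
FORBIDDEN = {"make_purchase", "create_invoice", "modify_billing"}
NEEDS_APPROVAL = {"message_client"}
PROTECTED_DIR = PurePosixPath("/home/prod")


def check_boundary(action: str, target: str = "", approved: bool = False) -> str:
    """Return "allow", "needs_approval", or "deny" for a proposed action."""
    if action in FORBIDDEN:
        return "deny"
    if action in NEEDS_APPROVAL and not approved:
        return "needs_approval"
    if action == "delete_file" and not approved:
        path = PurePosixPath(target)
        # Covers the protected directory itself and anything beneath it.
        if path == PROTECTED_DIR or PROTECTED_DIR in path.parents:
            return "needs_approval"
    return "allow"
```

A check like this does not replace the written boundaries; it only catches the cases where the model ignores them.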

Common Mistakes

Mistake 1: Automating everything at Level 1 to start. The "I'll add approval later" step rarely gets done. Design approval into your workflow before deployment.

Mistake 2: Writing vague approval rules. "Check with me before doing anything important" is not a rule. Specify what "important" means: external sends, deletes, config changes.

Mistake 3: Leaving the approval window open indefinitely. If your agent sends you a draft and you're on vacation, it needs an explicit timeout. "If no response within 2 hours, archive the draft and notify me" is a real instruction.

Mistake 4: Using approval for low-stakes tasks. Human approval adds latency. If you're approving every file move and log rotation, you've defeated the purpose. Reserve it for actions with consequences.

Mistake 5: Building approval in chat but not in file configs. Approval rules written in chat session memory can be forgotten when sessions restart. The only durable place is your agent's markdown files.

Security Guardrails

Least privilege applies to automation levels too. An agent running at Level 1 should still have only the permissions its tasks require. Don't give it write access to a directory it only needs to read.

Log everything your agent does autonomously. Even for Level 1 tasks, a weekly review of the action log tells you whether your rules are working or drifting.
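If your agent's tools run through your own scripts, an append-only log is enough to support that weekly review. The sketch below is a minimal Python example, assuming a JSON Lines file; the field names and helper functions are illustrative choices, not an OpenClaw format.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def log_action(log_path: str, task: str, level: int, detail: str) -> dict:
    """Append one JSON line describing an autonomous action and return it."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "task": task,
        "level": level,
        "detail": detail,
    }
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry


def summarize(log_path: str) -> Counter:
    """Count logged actions per task, for a periodic review."""
    counts = Counter()
    with open(log_path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                counts[json.loads(line)["task"]] += 1
    return counts
```

A task that suddenly runs far more or far less often than expected is usually the first visible sign that a rule has drifted.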

Prompt injection is real. If your agent processes external content (emails, web pages), a bad actor can embed instructions in that content. Level 4 approval for any action triggered by external input is the safe default.
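That default can be expressed as a one-line rule: whatever level a task normally runs at, raise it to at least Level 4 when external content triggered the action. The function below is a small Python sketch of that idea; the name and the boolean flag are assumptions, and deciding whether an action was triggered by external input is left to the caller.

```python
def effective_level(base_level: int, triggered_by_external: bool) -> int:
    """Raise the level to at least 4 when external content triggered the action."""
    if triggered_by_external:
        return max(base_level, 4)
    return base_level
```

A Level 1 archive job that runs on a schedule stays at Level 1, while the same action requested inside an incoming email is held for approval.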

Test approval rules before deploying. Write a HEARTBEAT entry that triggers an ask-first rule, then run it manually before putting it on a schedule. Verify that the agent waits for your reply, acts only after approval, and times out correctly.

A Practical Example: Email Triage Agent

Here's how a solopreneur might apply the spectrum to a real use case.

The task: An agent reads incoming Gmail, categorizes messages, and handles routine responses.

| Action | Level | Rationale |
| --- | --- | --- |
| Categorize as "newsletter" and archive | 1 | Reversible, zero user impact |
| File support ticket for FAQ question | 2 | Low-risk, logged |
| Flag VIP sender and summarize | 3 | Alert only, no action |
| Draft reply to new lead | 4 | External send, needs approval |
| Reply to ongoing client thread | 5 | Too nuanced, agent prepares options |

The same agent operates at five different levels depending on the specific action. That's the point: match the level to the consequence, not to the task category.

The Honest Limitation

Human-in-the-loop adds friction. If you need sub-second responses, approval workflows won't work. Choose Level 4–5 for tasks where the cost of a mistake exceeds the cost of waiting 10 minutes for your input. For real-time decisions, invest more in rule-writing and testing so you can safely run at Level 1–2.

The goal isn't maximum automation. It's automation you can trust, with human oversight where it matters.

Build Your First Approval-Ready Agent

OpenAgents.mom guides you through the human-in-the-loop design step in the wizard. You answer questions about which actions need approval, and the output includes pre-written approval rules in AGENTS.md and SOUL.md, ready to deploy.

Generate your agent workspace on OpenAgents.mom — EUR 4.99, plain markdown files you own and control.

