In early 2025, a financial services firm discovered that one of their LangChain-based research agents had been exfiltrating summaries of internal M&A documents to an external API endpoint — not because of a bad actor, but because the agent's tool permissions were never scoped. The agent was doing exactly what it was configured to do. Nobody had thought to ask what it shouldn't be able to do.
That story isn't unique. As AI agents move from sandbox experiments into production workflows, the attack surface expands in ways that traditional endpoint or network security wasn't built to handle. AI agent security isn't just about locking down the model — it's about controlling what the agent can reach, remember, decide, and act on.
This post covers the protocols and controls that matter now, across frameworks like LangGraph, AutoGen, CrewAI, and MCP-based stacks. No single vendor pitch. Just what's actually in use at the security-conscious end of the enterprise spectrum.
Why Traditional Security Controls Miss the Agent Layer
Most enterprise security stacks assume the software layer executes deterministic code. Firewalls, DLP, and IAM were designed for systems where a human wrote explicit logic deciding what happens next.
Agents break that assumption. They plan, branch, call external tools, and persist context across sessions — often in ways the original developer didn't anticipate. A single prompt injection buried in a third-party API response can redirect an agent's entire tool-use chain.
The threat model has to expand. You're no longer just auditing the code; you're auditing the behavior of a system that generates its own instructions at runtime.
The MCP Attack Surface Is Real and Underestimated
Model Context Protocol (MCP) has become the default connector layer for agent tooling in 2026. That's useful for interoperability. It's also a new lateral movement vector if you're not careful.
When an agent connects to an MCP server, it's being granted tool access at runtime. If that server is compromised — or if its manifests are manipulated — the agent can be redirected to call endpoints it was never supposed to reach. This is sometimes called tool manifest poisoning.
The control here is explicit MCP server allowlisting and signed manifests. Your agent runtime should refuse to load any MCP server that isn't pre-approved in your AGENTS.md or equivalent config file. Don't let agents discover and connect to servers dynamically in production.
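A minimal sketch of that load-time check, in Python. It assumes a hypothetical allowlist dictionary and uses an HMAC over the manifest as a stand-in for a real signature scheme; the server name, URL, and function names are illustrative and are not part of MCP or any SDK.

```python
import hashlib
import hmac
import json

# Hypothetical allowlist: server name -> the one URL approved for it.
APPROVED_SERVERS = {
    "docs-search": "https://mcp.internal.example/docs-search",
}


def manifest_digest(manifest, key):
    """HMAC-SHA256 over a canonical JSON encoding of the manifest."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def may_load_server(name, url, manifest, signature, key):
    """Allow a server only if it is allowlisted and its manifest verifies."""
    if APPROVED_SERVERS.get(name) != url:
        return False  # unknown server, or a known name pointing somewhere new
    expected = manifest_digest(manifest, key)
    # Constant-time comparison avoids leaking how much of the digest matched.
    return hmac.compare_digest(expected, signature)
```

A manifest that gains an extra tool after approval no longer matches the recorded signature, so the runtime would refuse to load it. A production setup would more likely use asymmetric signatures so the agent host never holds the signing key.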
Prompt Injection: Still the Top Vector in Agentic Pipelines
Prompt injection in single-turn chat is annoying. In an agentic pipeline with memory, tool access, and multi-step planning, it's a critical vulnerability.
The attack pattern: malicious content in a document, web page, email, or API response instructs the agent to take a different action than intended. Because the agent is operating autonomously, there's no human in the loop to catch the redirection before it executes.
Mitigation requires multiple layers. First, treat all external content as untrusted input — strip it, quote it, or pass it through a separate validation step before it enters the agent's planning context. Second, enforce tool-call auditing: log every tool invocation with its triggering context so you can reconstruct what happened. Third, implement hard stops for sensitive operations — file writes, API calls to external endpoints, and database mutations should require explicit confirmation steps, not just agent intent.
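The second and third layers can be sketched as a single gate that every tool call passes through. This is an illustration under assumptions, not a framework API: the tool names in `SENSITIVE_TOOLS`, the in-memory `AUDIT_LOG`, and the `confirm` callback are all hypothetical.

```python
import json
import time

SENSITIVE_TOOLS = {"write_file", "http_post", "db_update"}  # hypothetical names
AUDIT_LOG = []  # stand-in for an append-only log sink you control


def call_tool(name, tool, args, trigger, confirm):
    """Log the call with its triggering context; block sensitive tools unless confirmed.

    `trigger` is the text that led the agent to request the call.
    `confirm` is a callback (human or policy engine) returning True or False.
    """
    approved = name not in SENSITIVE_TOOLS or bool(confirm(name, args))
    AUDIT_LOG.append(json.dumps({
        "ts": time.time(),
        "tool": name,
        "args": args,
        "trigger": trigger,
        "approved": approved,
    }))
    if not approved:
        raise PermissionError(f"{name} requires explicit confirmation")
    return tool(**args)
```

The denied attempt is logged before the exception is raised, so the audit trail records what the agent tried to do, not only what it was allowed to do.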
See also: Navigating AI Security Risks for a broader breakdown of injection patterns across frameworks.
Least-Privilege Tool Scoping Is Non-Negotiable
Every agent framework gives you some mechanism to define which tools an agent can call. Most developers give agents access to everything available and tune it down later. That's the wrong order.
Start with zero tool access and add only what the agent needs for its specific workflow. In LangGraph, this means defining your tool list explicitly per agent node. In CrewAI, it means scoping tools per Agent instantiation. In AutoGen, it means restricting the function map exposed to each agent role.
For MCP-based stacks, your AGENTS.md file should enumerate allowed servers and the specific tools within each server. A document-reading agent has no business calling a write_file or send_email tool.
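Framework specifics differ, but the deny-by-default idea can be sketched independently of any of them. The `ToolRegistry` class and the agent and tool names below are hypothetical; the dictionary returned by `tools_for` is what you would hand to the framework's own per-agent tool parameter.

```python
class ToolRegistry:
    """Deny-by-default: an agent gets only the tools explicitly granted to it."""

    def __init__(self):
        self._tools = {}   # tool name -> callable
        self._grants = {}  # agent name -> set of granted tool names

    def register(self, name, fn):
        self._tools[name] = fn

    def grant(self, agent, tool_name):
        if tool_name not in self._tools:
            raise KeyError(f"unknown tool: {tool_name}")
        self._grants.setdefault(agent, set()).add(tool_name)

    def tools_for(self, agent):
        """Return the granted tools; an agent with no grants gets an empty dict."""
        granted = self._grants.get(agent, set())
        return {name: self._tools[name] for name in sorted(granted)}


registry = ToolRegistry()
registry.register("read_file", lambda path: f"contents of {path}")
registry.register("write_file", lambda path, body: None)
registry.grant("doc-reader", "read_file")  # the reader never sees write_file
```

Registering a tool does not expose it to anyone. Exposure happens only through `grant`, which gives you one place to review and diff.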
Security Guardrails
- Deny-by-default tool access. Start with no tools and add only what the agent's specific task requires.
- Separate read and write agents. If an agent only needs to read data, don't give it write tools — even if your framework makes it easy to bundle them.
- Audit tool calls at the infra layer. Don't rely solely on the agent runtime's own logging. Proxy tool calls through a layer you control.
- Rotate credentials on a schedule. Agents holding long-lived API keys are a liability. Prefer short-lived tokens scoped to the session.
Memory Isolation and Cross-Agent Data Leakage
Multi-agent systems introduce a problem that single-agent setups don't have: shared memory stores that can leak context across agent boundaries.
If your orchestrator passes a summarized context to a subagent without scrubbing it, sensitive data from one workflow can contaminate another. In frameworks like Letta (formerly MemGPT), where memory is persistent and managed explicitly, this is easier to control. In setups where agents share a vector store or Redis cache, it's not.
The fix requires explicit memory segmentation. Each agent or workflow session should write to a namespaced memory partition. Access between partitions should require an explicit join step — not just proximity in the embedding space. For regulated environments (finance, healthcare), treat cross-agent memory reads like cross-system data transfers and log them accordingly.
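As a rough illustration of namespaced partitions with an explicit join step, here is an in-memory sketch. The `SegmentedMemory` class and the namespace names are hypothetical; a real deployment would apply the same checks in front of a vector store or cache rather than a dictionary.

```python
class SegmentedMemory:
    """Each namespace is a separate partition; cross-namespace reads need a grant."""

    def __init__(self):
        self._data = {}       # namespace -> {key: value}
        self._grants = set()  # (reader namespace, owner namespace) pairs
        self.access_log = []  # cross-namespace read attempts, allowed or denied

    def write(self, namespace, key, value):
        self._data.setdefault(namespace, {})[key] = value

    def grant_read(self, reader, owner):
        """The explicit join step: reader may now read owner's partition."""
        self._grants.add((reader, owner))

    def read(self, reader, owner, key):
        if reader != owner:
            allowed = (reader, owner) in self._grants
            self.access_log.append(
                (reader, owner, key, "allowed" if allowed else "denied")
            )
            if not allowed:
                raise PermissionError(f"{reader} may not read from {owner}")
        return self._data.get(owner, {}).get(key)
```

Reads within a namespace are not logged here; only cross-partition reads are, which matches treating them like cross-system data transfers.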
For more on memory architecture in multi-agent systems, see Memory Safety in Multi-Agent Systems.
Identity and Attribution: Who Did the Agent Act As?
When an agent sends an email, writes a file, or calls a third-party API, whose identity did it use? In most current deployments, the answer is "the service account with the broadest permissions available." That's an audit trail problem and an accountability problem.
Agent identity needs to be treated as a first-class security concern. Each agent should have its own service identity with its own scoped permissions, and every external action should be attributable to that identity. This matters for SOC investigations, compliance audits, and incident response.
In practice, this means separate API keys per agent role (not per deployment), IAM roles scoped to agent function, and action logs that capture the agent identity alongside the tool call. Don't log "the system did X" — log "agent invoice-processor-v2 called POST /api/payments at 14:32:07 UTC."
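One possible shape for such a log entry, sketched in Python. The field names and the `action_record` helper are assumptions for illustration; the point is only that the agent identity travels with every action.

```python
import json
from datetime import datetime, timezone


def action_record(agent_id, method, target, when=None):
    """Build one attributable log line: which agent did what, and when (UTC)."""
    when = when or datetime.now(timezone.utc)
    return json.dumps({
        "agent": agent_id,
        "action": f"{method} {target}",
        "ts": when.astimezone(timezone.utc).isoformat(timespec="seconds"),
    }, sort_keys=True)


line = action_record("invoice-processor-v2", "POST", "/api/payments")
```

Emitting this from the proxy layer rather than from the agent itself means a compromised agent cannot misreport its own identity.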
Governance Frameworks for Agentic AI: What's Actually Enforced
Regulatory pressure on AI systems has accelerated. The NIST AI Risk Management Framework, the EU AI Act's provisions on high-risk AI systems, and sector-specific guidance from financial regulators all touch on agentic behavior. But most of the compliance conversation still focuses on models, not agents.
Enterprise IT leaders need to close that gap now. Practically, this means:
- Behavioral specs in version control. Your SOUL.md or AGENTS.md files should be committed, reviewed, and change-managed like any other system config. If the agent's behavioral constraints can be changed without a pull request, you don't have a governance process.
- Human-in-the-loop thresholds. Define, in writing, which categories of action require human approval before the agent executes. Financial transactions above a threshold, external communications, and data exports are the obvious starting points.
- Incident response runbooks for agent behavior. What's your process when an agent does something unexpected? Most organizations don't have one. Write it before you need it.
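Human-in-the-loop thresholds are easier to review when they are written as code or config rather than prose. A minimal sketch, assuming hypothetical category names and an arbitrary payment threshold:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    category: str
    amount: float = 0.0


# Hypothetical policy values; the real ones belong in reviewed, versioned config.
PAYMENT_APPROVAL_THRESHOLD = 1_000.00
ALWAYS_REQUIRE_APPROVAL = {"external_communication", "data_export"}


def requires_human_approval(action):
    """Return True when the written policy says a person must sign off first."""
    if action.category in ALWAYS_REQUIRE_APPROVAL:
        return True
    if action.category == "financial_transaction":
        return action.amount > PAYMENT_APPROVAL_THRESHOLD
    return False
```

Because the policy is a small pure function, a change to the threshold shows up as a one-line diff in a pull request.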
For a sector-specific view, AI Agent Governance in Financial Services covers the regulatory specifics that apply to financial workflows.
Common Mistakes
- Treating agent configs as ephemeral. If your agent's behavioral constraints live only in a hosted dashboard, you can't audit changes, diff versions, or enforce review processes.
- Assuming the framework handles security. LangChain, CrewAI, AutoGen — none of them ship with production-grade security defaults. You have to layer controls on top.
- Skipping the data classification step. If your agents can access data you haven't classified, you don't know what they're allowed to return, store, or send. Classify first.
- Conflating model safety with agent security. A well-aligned model can still be part of an insecure agent system. Alignment and security are different properties.
Network Egress Controls for Agent Infrastructure
Agents are prolific callers of external services. Left uncontrolled, they'll call whatever tool their config permits — including third-party APIs that haven't been security-reviewed.
Network egress allowlisting is the infrastructure-level complement to tool scoping. Your agent's network layer should only be able to reach pre-approved endpoints. This is standard in any hardened enterprise environment, but it's often skipped for agent workloads because developers treat them as "just another app server."
For self-hosted deployments, implement egress filtering at the container or VM level — not just in the agent config. Config files can be modified. Network policy enforced at the infrastructure layer is harder to bypass. If you're running on Kubernetes, network policies per namespace give you the right granularity.
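The allowlist decision itself is simple, wherever it is enforced. The sketch below shows the check an egress proxy you operate might apply; the host names are hypothetical, and an application-level check like this complements infrastructure-level network policy rather than replacing it.

```python
from urllib.parse import urlsplit

# Hypothetical pre-approved destinations.
ALLOWED_HOSTS = {"api.internal.example", "payments.partner.example"}


def egress_allowed(url):
    """Allow only HTTPS requests whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    # .hostname drops any userinfo and port and is already lowercased.
    return parts.hostname is not None and parts.hostname in ALLOWED_HOSTS
```

Matching on the exact parsed hostname, rather than a substring or prefix of the URL, avoids lookalike destinations such as an allowlisted name used as a subdomain of an attacker's domain.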
Continuous Monitoring Is Different for Agent Workloads
Traditional application monitoring looks for errors, latency spikes, and resource exhaustion. For agents, you also need to monitor for behavioral drift — cases where the agent's actions diverge from its expected pattern without throwing an error.
An agent that starts calling a tool at 10x its normal rate, accessing data it hasn't touched before, or producing outputs that diverge significantly from its baseline might be compromised, misconfigured, or encountering a novel input distribution. None of those conditions necessarily produce an error code.
Build monitoring that tracks tool call frequency per agent, output distributions over time, and any first-use-of-resource events. Alert on anomalies, not just failures. This is where securing AI deployments in 2026 overlaps with traditional SOC operations — the signals are different, but the response process is the same.
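Two of those signals, call rate per agent and first use of a resource, can be sketched in a few lines. The `BehaviorMonitor` class, its baseline format, and the 10x multiplier are assumptions for illustration; output-distribution tracking is left out because it depends heavily on the workload.

```python
from collections import Counter


class BehaviorMonitor:
    """Flags unusual call rates and first-time resource access per agent."""

    def __init__(self, baseline, rate_multiplier=10.0, known_resources=()):
        self.baseline = baseline  # (agent, tool) -> expected calls per window
        self.rate_multiplier = rate_multiplier
        self.window_counts = Counter()
        self.seen = set(known_resources)  # (agent, resource) pairs seen before

    def record(self, agent, tool, resource):
        """Record one tool call and return any alerts it triggers."""
        alerts = []
        key = (agent, tool)
        self.window_counts[key] += 1
        # A tool with no baseline has an expected rate of 0, so any call alerts.
        expected = self.baseline.get(key, 0)
        if self.window_counts[key] > expected * self.rate_multiplier:
            alerts.append(f"rate: {agent} exceeded baseline for {tool}")
        if (agent, resource) not in self.seen:
            self.seen.add((agent, resource))
            alerts.append(f"first-use: {agent} accessed {resource}")
        return alerts

    def reset_window(self):
        """Call at the end of each monitoring window."""
        self.window_counts.clear()
```

None of these alerts corresponds to an error in the agent runtime, which is why they have to be computed from the tool-call stream rather than from exception logs.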
Where to Start If You're Inheriting an Existing Agent Deployment
If you've inherited an agent deployment without a security review, the priority order is: tool access audit first, memory isolation second, identity attribution third, egress controls fourth.
Tool access takes priority because it defines the blast radius. An agent with misconfigured memory is a data governance problem. An agent with over-scoped tool access is a breach waiting to happen.
For most enterprise environments, the practical gap isn't knowledge — it's process. AI agent security controls exist at the framework, infrastructure, and governance layers. The work is connecting them into a coherent policy that survives a personnel change, a framework upgrade, or an incident.
The frameworks will keep changing. The threat model will expand as agents get more capable. The organizations that build auditable, version-controlled, least-privilege agent configs now will be in a much better position when regulators and incident responders come asking questions.