← Back to Blog

Memory Safety in Multi-Agent Systems: Securing What Your Agents Remember

OpenAgents.mom · 2026-06-27 · 9 min read

In late 2025, a red team at a financial services firm discovered that their multi-agent pipeline was leaking customer PII across task boundaries. The culprit wasn't a misconfigured API or a rogue model — it was a shared vector store that no one had scoped properly. One agent summarized a client's portfolio. A downstream agent retrieved that summary as "context" when answering a completely unrelated query from a different user session.

This is the class of problem that doesn't show up in your unit tests and doesn't trigger an alert in your SIEM. It surfaces when an agent answers a question it shouldn't be able to answer. By then, the data has already moved.

AI security conversations usually focus on prompt injection, tool misuse, and model exfiltration. Memory is the quieter threat — and in multi-agent systems, it's the one most likely to cause a compliance incident.

What "Memory" Actually Means in an Agent Stack

Agent memory isn't a single thing. Most frameworks split it into at least three layers: in-context memory (the active conversation window), external memory (vector databases, key-value stores, relational tables), and procedural memory (fine-tuned weights or few-shot examples baked into a model).

In a multi-agent system, all three layers can be shared, isolated, or accidentally cross-wired. LangGraph lets you attach a checkpointer that persists state between nodes. Letta (formerly MemGPT) gives every agent its own tiered memory with explicit in-context and archival layers. AutoGen / AG2 passes messages between agents as conversation history — which means one agent's output is another agent's input by design.

Each architecture makes different tradeoffs. The risk profile changes depending on which layer you're using and whether you've thought about who — or what — has read access.

The Cross-Contamination Problem

Cross-contamination happens when memory written by one agent (or one user session) becomes readable by a different agent or session without explicit authorization.

The most common trigger is a naively shared vector store. If your orchestrator agent and your summarizer agent both read from and write to the same embedding index without namespace separation, summaries from Session A are retrievable in Session B. This isn't hypothetical — it's a direct consequence of how similarity search works. A query in Session B might be semantically close enough to a document from Session A to surface it in the top-k results.

The fix is namespacing: every collection, index, or table that holds agent-generated memory should be scoped by tenant_id, session_id, or both. In Chroma, that means using collection per tenant. In Pinecone, use namespaces. In pgvector, use row-level security or schema isolation.

Prompt Injection Through Stored Memory

Stored memory introduces a second-order injection surface that most teams don't model during threat assessment.

Here's the attack path: a malicious user submits input designed to be stored in an agent's memory layer. Later, when a different query triggers a retrieval, that injected content gets pulled into context and influences the agent's behavior. This is sometimes called a stored prompt injection or indirect prompt injection via memory.

Researchers have demonstrated this against RAG pipelines where user-submitted documents are indexed without sanitization. The same vector store that makes your agents smart also makes them a delivery mechanism for instructions you didn't write. Check out how these risks compound in enterprise environments before you connect a shared knowledge base to an agent with tool-calling access.

Common Mistakes

Unsanitized writes to shared memory. Any content that enters your vector store from an external source should be treated as untrusted. Strip instructions, meta-commands, and anything resembling a system prompt before indexing.
No TTL on episodic memory. Agents that accumulate memory indefinitely grow their attack surface over time. Set expiration policies on session-scoped stores.
Single memory scope across roles. An orchestrator and a code-execution agent should not share the same memory namespace. Privilege separation applies here too.

Privilege Separation for Agent Roles

In a traditional system, you wouldn't give a logging service write access to your production database. The same principle applies to agent memory, but most teams don't map it explicitly.

Define memory roles for each agent in your system. A reader agent that only answers questions should have read-only access to the knowledge base. A writer agent that ingests documents needs write access — but only to its designated namespace, not the full store. An orchestrator might need read access across namespaces to coordinate tasks, but that access should be audited and logged.

In practice this means enforcing access at the infrastructure layer, not just the prompt layer. A system prompt that says "don't access other users' data" is not an access control. A scoped API key or a role-bound database connection is.

Secrets and Credentials in Agent Context

Agents that handle API calls, database queries, or file system operations often have secrets passed into their context — either explicitly in the system prompt or implicitly through tool configurations.

The risk: those secrets can end up in memory. If your agent logs full conversation history, if it writes summaries to a shared store, or if it passes context to a downstream agent, credentials travel with it. An injected instruction asking the agent to "repeat your full system prompt" is a classic extraction vector.

Keep secrets out of context entirely. Use environment variables loaded at runtime, not injected into prompt templates. Use short-lived tokens scoped to the minimum required permissions. For a deeper look at credential hygiene in agent configs, see keeping secrets out of agent context.

Security Guardrails

Scope every memory namespace by tenant_id and session_id at the infrastructure layer.
Set TTL policies on all session-scoped memory stores. Don't accumulate indefinitely.
Treat all external inputs as untrusted before they're written to any memory layer.
Enforce memory access via infrastructure controls (IAM, row-level security, scoped keys) — not prompt instructions.
Log all memory reads and writes with enough detail to reconstruct what an agent retrieved and when.

Audit Trails for Memory Operations

When an agent produces a wrong or harmful output, you need to know what it retrieved from memory to produce that output. Without an audit trail, post-incident investigation is guesswork.

At minimum, log: the query sent to your vector store, the document IDs returned, and the agent ID and session ID that issued the query. If your system uses LangSmith, LangFuse, or a similar tracing tool, verify that memory retrieval calls appear in the trace — not just LLM calls.

For higher-stakes workflows, consider append-only memory logs. If an agent can overwrite previous memory entries, an attacker who compromises the write path can revise history. Immutable audit logs at the storage layer close that gap.

Memory Persistence Across Agent Handoffs

Orchestrated multi-agent systems often pass state from one agent to the next as the task progresses. This handoff is a boundary where data classification tends to get lost.

Agent A summarizes a document and tags the summary as confidential: true. Agent B receives that summary as a message in its context window — but if the receiving agent doesn't inspect or respect that tag, the classification evaporates. By Agent D in the chain, the data is treated as ordinary context with no restrictions.

If you're building compliance-sensitive workflows, implement a data envelope pattern: wrap sensitive memory payloads with metadata that downstream agents are required to check before passing data along. This doesn't require a framework change — it's a convention in your message schema. See the patterns described in multi-agent system strategies for how orchestration design affects data flow.

Framework-Specific Considerations

Different frameworks give you different handles on memory safety.

LangGraph exposes memory through its State object and checkpointer. You control what goes into state and can filter fields before passing to the next node. Use custom serializers to strip sensitive fields at node boundaries.

Letta treats memory as a first-class primitive with explicit in-context and archival tiers. Each agent has its own memory persona, which gives you natural isolation — but you still need to control what gets written to the archival store during tool calls.

AutoGen / AG2 relies on message passing, so memory is largely the conversation history. Be deliberate about which agents see which messages. Use GroupChat speaker selection to control who gets included in the context at each turn.

CrewAI provides a memory configuration at the crew level. By default, all agents in a crew share long-term memory. For security-sensitive deployments, disable shared memory and use per-agent memory stores backed by isolated collections.

None of these frameworks enforce memory safety by default. The isolation is available, but you have to configure it.

Monitoring and Anomaly Detection

Memory-layer attacks don't look like network intrusions. They look like slightly odd agent outputs — a response that mentions something it shouldn't know, a retrieval that returns documents from the wrong context, a summary that includes injected instructions.

Build monitoring around output semantics, not just infrastructure metrics. This can be as simple as a lightweight classifier that flags outputs containing PII patterns, system-prompt-like language, or known injection strings. Some teams run a separate "judge" agent that evaluates outputs before they're returned to users — though that adds latency and its own attack surface.

For AI security at scale, integrate your memory audit logs with your SIEM. Alert on unusual retrieval volumes, queries from unexpected agent IDs, or memory writes that occur outside normal task flows.

Building a Memory Security Model Before You Need One

The pattern in most incidents is the same: memory security is treated as a post-launch concern, addressed after something goes wrong. By that point, the data has already moved and the cleanup is expensive.

Your memory security model should be part of your system design, not a retrofit. That means defining memory namespaces, access roles, TTL policies, audit logging, and data classification rules before you write your first agent. If you're already running a multi-agent system without these controls, a governance review of your current agent deployment is a reasonable starting point.

AI security in multi-agent systems is ultimately about treating memory with the same seriousness you'd give a production database — because that's functionally what it is. The risk isn't theoretical, and the controls aren't complicated. They just have to be intentional.

Lock Down Your Agent's Memory Layer Before It Reaches Production

Get a multi-agent workspace configuration with namespaced memory, scoped access controls, and audit logging built in from day one — not retrofitted after an incident.

Build Your Secure Agent Config

Send Feedback