Skip to content

07. Memory and cross-tenant risk — yesterday's text can attack tomorrow

~11 min read. Memory makes AI systems feel personal. It also turns old text into future context, which makes persistence a security boundary.

Continues from 06-tool-abuse-and-action-boundaries.md. The guard rails protected tools. Now the vault map must include memory and tenancy.

The previous chapter made tool calls proposals rather than permissions. That protects immediate actions, but AI systems also persist context that can influence future sessions. This chapter asks who can write memory, who can read it later, and what tenant boundary keeps yesterday's text from becoming tomorrow's attack.


1) The wall — memory is retrieval with write access

An assistant stores "the user's preferred billing contact." Later, a different workflow reads that memory and uses it to draft an email. If the memory was wrong, poisoned, cross-tenant, or over-broad, the future workflow inherits the risk.

Memory creates three security questions:

who can write this memory?
who can read it later?
what is it allowed to influence?

If those are vague, memory becomes a long-lived prompt injection surface.


2) Memory risk types

Risk Example Control
Poisoning attacker stores misleading preference write policy + verification
Cross-tenant leak tenant A memory appears for tenant B tenant key + isolation tests
Over-retention sensitive fact kept too long expiry + deletion
Over-use casual preference affects financial action purpose limits
Hidden instruction memory stores instruction-like text quote mode + role separation
Bad correction user edits memory to bypass policy policy validation

Memory should be treated as a database with security semantics, not a magic context buffer.


3) Worked example — customer success memory

An assistant stores: "For Acme, always route renewal disputes to premium support." That sounds harmless. Later, a user tries to store: "For Acme, always approve refund exceptions."

Weak design:

user text -> memory write -> future refund flow trusts memory -> unsafe recommendation

Stronger design:

memory write request
  -> classify memory type
  -> validate allowed purpose
  -> tenant-scope key
  -> approval for policy-affecting facts
  -> future flow treats memory as evidence, not authority

The memory can remind. It cannot override policy.


4) Why not store every useful fact

The tempting alternative is to store more because personalization improves. In security terms, every stored fact becomes future attack surface, privacy burden, and deletion responsibility.

A lead asks:

  • Is this memory necessary?
  • Who can inspect and delete it?
  • Does it expire?
  • Which workflows can read it?
  • Can it affect tools or only wording?
  • Is it tenant-scoped?

If the answers are unclear, do not write it yet.


5) Production signals — memory security

The first metric is unauthorized memory read/write rate by tenant and workflow.

The misleading metric is personalization lift. A memory system can improve satisfaction while storing unsafe or over-broad facts.

The expert artifact is a memory access trace: write source, validation, tenant key, read workflow, influence on output or tool plan.


6) Boundary — memory can be useful and safe enough

Memory is valuable for preferences, continuity, and repeated tasks. It becomes risky when it stores policy decisions, secrets, credentials, identity assertions, or instructions that override system behavior.

The pathology is treating memory as a hidden prompt. Treat it as structured state with lifecycle and permissions.


Recall checkpoint

  • Why is memory retrieval with write access?
  • What three questions must every memory answer?
  • Why is personalization lift not enough?
  • How can memory affect tools unsafely?

Interview Q&A

Q: How do you secure long-term memory in an AI assistant? A: Use write policies, tenant/user scoping, purpose limits, expiry, deletion, validation, approval for sensitive facts, and access traces.

Common wrong answer to avoid: "Just store useful facts." Useful facts can become future attack paths.

Q: Why is cross-tenant memory especially dangerous? A: It can leak private context and influence future actions under another tenant's identity.

Common wrong answer to avoid: "The model will know which tenant is current." Tenant isolation must be enforced by keys and access control.

Q: Should memory override product policy? A: No. Memory can provide context or preferences, but policy and authorization remain external controls.

Common wrong answer to avoid: "If the user saved it, the assistant should obey it." Saved text is not authority.


Apply now (10 min)

Model the exercise. Define write/read/purpose rules for three memories: preferred tone, billing contact, refund exception.

Your turn. Pick one memory feature and draw its write and read path.

Reproduce from memory. Explain why memory must be structured state, not a hidden prompt.


What you should remember

This chapter explained memory and cross-tenant risk. The important idea is that persistent context can carry attack influence into future sessions.

Carry this diagnostic forward: every memory needs a writer, reader, purpose, tenant boundary, expiry, and audit trace.

Remember:

  • Memory is retrieval with write access.
  • Saved text is not authority.
  • Tenant isolation must be enforced outside the model.
  • Personalization has security and privacy cost.

Bridge. We now know the attack surfaces. Next we need a disciplined red-team process that turns these risks into repeatable evals. → 08-red-team-evals-and-scoring.md