05. PII detection and redaction¶

Scope bounds which records the agent reaches. PII discipline bounds what personal content lands in logs, prompts, responses, and storage once those records are read. Detection, redaction, hashing, and minimisation are the four moves that make personal data tractable.

A privacy engineer at a Mumbai SaaS company audits the agent platform's logs for an annual review. The audit log is well-structured: purpose, scope, actor, all in place. The audit log also contains, in plaintext, ninety thousand customer email addresses, twenty thousand phone numbers, three thousand Aadhaar identifiers, and a handful of credit card numbers caught in support transcripts. None of these were supposed to be in the audit log; they ended up there because the per-call audit recorded the full request payload "for investigability," and the application sometimes passed personal data in those payloads. The discipline above the audit was good; the audit itself was the leak surface.

This chapter describes the discipline that keeps that audit from being a breach in itself. Four moves (detect, redact, hash, minimise) are applied at every surface where personal data can land.

What PII is¶

Personally Identifiable Information is any field that can identify a specific person, alone or in combination. Categories:

Category	Examples	Treatment baseline
Direct identifiers	Name, email, phone, government ID, IP, biometric	Always redact in logs; tokenise where used
Indirect identifiers	Date of birth, address, postcode, employer	Redact in logs; combine cautiously
Sensitive attributes	Race, religion, health condition, sexual orientation, political view	Strict tier (regulated in most jurisdictions); minimise
Quasi-identifiers	Browser fingerprint, IP + timestamp, location patterns	Treat as identifiers when combined
Financial	Card number, bank account, salary	Tokenise; never log in plaintext
Behavioural	Click trails, message contents, purchase history	Audit-only retention; redact non-purpose-relevant fields

The chapter is concerned with how this content moves through the platform. The classification (chapter 02) labels the source fields; this chapter is about handling them in motion.

The four moves¶

Detect¶

You cannot redact what you cannot find. PII detection runs over data in motion — request payloads, response bodies, audit records, log lines, traces, samples for evals.

Three detection approaches, in increasing power:

Schema-driven. The field's tier label (chapter 02) declares it as PII. Detection is a lookup; redaction follows the rule.
Pattern-driven. Regular expressions for emails, phone numbers, IDs in standard formats. Catches PII appearing where it shouldn't (in a free-text response, in an error message).
Model-driven. A small classifier (or an LLM with a tight prompt) detects PII-shaped content in free text. More expensive; catches cases the patterns miss.

Most platforms run schema-driven for structured data and pattern-driven for free text, with model-driven as a sample for high-risk surfaces.
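The pattern-driven layer can be sketched as follows. This is a minimal illustration, not a production pattern library: the three regular expressions, the `PII_PATTERNS` name, and the `detect` function are assumptions made for this example, and real deployments need locale-aware, tested rules.

```python
import re

# Illustrative patterns only; a real library needs broader, tested rules.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 12 digits, optionally grouped 4-4-4 (the usual Aadhaar display shape).
    "aadhaar": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
    # Indian mobile numbers: optional +91, then 10 digits starting 6-9.
    "phone": re.compile(r"(?:\+91[ -]?)?\b[6-9]\d{9}\b"),
}


def detect(text):
    """Return (kind, matched_text) pairs for every pattern hit in text."""
    hits = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((kind, match.group()))
    return hits
```

Shape-based patterns produce false positives (any 12-digit order number looks like an Aadhaar number) and miss unusual formats, which is why the model-driven sample exists for high-risk surfaces.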

Redact¶

Once detected, the value is replaced with a marker. The markers preserve the shape of the data (so downstream consumers do not crash) and the fact that a value existed (so investigations can tell something was there).

Before: { "email": "ravi@example.com", "note": "His email is ravi@example.com" }
After:  { "email": "[REDACTED:email]", "note": "His email is [REDACTED:email]" }

Three redaction styles:

Replace with marker. [REDACTED:email]. Simple; the value is gone, and only its type remains.
Mask preserving format. r***@example.com. Preserves more shape; useful when the redaction must remain readable for some flows.
Tokenise. [token:abc123], where abc123 is reversible only by a service the application does not have direct access to. Useful when downstream needs to correlate the same person across records without knowing who.

Redaction is at write time, not at read time. Storing the raw value and redacting on display is the leak surface.
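The three styles can be compared side by side in a short sketch. The function names and the `TokenVault` class are hypothetical; in particular, the vault here is an in-memory stand-in for what would be a separate service that the application cannot query directly.

```python
import secrets


def marker(kind):
    """Style 1: replace the value with a typed marker."""
    return f"[REDACTED:{kind}]"


def mask_email(email):
    """Style 2: keep the first character and the domain, mask the rest."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


class TokenVault:
    """Style 3: hypothetical in-memory stand-in for a tokenisation service."""

    def __init__(self):
        self._by_value = {}
        self._by_token = {}

    def tokenise(self, value):
        # The same value always maps to the same random token,
        # so records can be correlated without exposing the value.
        if value not in self._by_value:
            token = f"[token:{secrets.token_hex(6)}]"
            self._by_value[value] = token
            self._by_token[token] = value
        return self._by_value[value]

    def resolve(self, token):
        # In a real deployment only the vault service can do this.
        return self._by_token[token]
```

Because the token is random rather than derived from the value, it cannot be reversed by guessing inputs; reversal requires access to the vault's mapping.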

Hash¶

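Hashing is a one-way transform that preserves correlation without preserving the value. Two records about the same email produce the same hash; the hash alone does not reveal the value, provided an attacker cannot simply guess candidate inputs and re-hash them (see salting below).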

email             hash                    purpose
ravi@example.com  sha256+salt: 4a8b7e...  correlate across audit records

Hashing is useful when:

The audit needs to identify "the same person" across records without storing the identity (e.g., Aadhaar in healthcare).
The eval set needs to maintain user-level cohorts without holding raw identifiers.
Analytics needs to count uniques without retaining identifiers.

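Salting is critical: an unsalted hash of a phone number is reversible, because the input space is small enough to enumerate (rainbow tables exist). The salt must also stay secret; if it leaks, the same enumeration works against the salted hashes. A per-platform secret salt makes the hash useful only within the platform's correlation needs.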
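One way to implement a correlation hash is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library, which is a swapped-in variant of the "sha256+salt" shown above rather than the platform's confirmed method; the `correlation_hash` name and the lowercase-and-strip normalisation are assumptions for this example.

```python
import hashlib
import hmac


def correlation_hash(value, key):
    """Keyed one-way hash: equal inputs give equal digests under one key."""
    # Normalise first so "Ravi@Example.com " and "ravi@example.com" correlate.
    normalised = value.strip().lower()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()
```

The key plays the role of the per-platform salt: it is loaded from a secret store, never written next to the hashes, and rotating it deliberately breaks correlation with older records.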

Minimise¶

The strongest PII protection is to not collect it in the first place. Minimisation is the discipline of asking, per field captured:

Do we need this for the operation?
Do we need it to be identifiable, or would aggregated form suffice?
Can we use a hash or pseudonym instead of the raw value?
How long do we need it?

Many systems carry PII out of habit or "for future analytics" without a current use. Minimisation removes those.
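Minimisation can be enforced mechanically with a per-purpose allowlist. The purposes and field names below are invented for illustration; the point is the deny-by-default shape, not the specific entries.

```python
# Hypothetical purposes and fields; anything not listed is dropped.
ALLOWED_FIELDS = {
    "send_receipt": {"order_id", "email", "amount"},
    "cohort_count": {"signup_month", "plan"},
}


def minimise(record, purpose):
    """Keep only the fields the stated purpose is allowed to carry."""
    allowed = ALLOWED_FIELDS.get(purpose, set())  # unknown purpose: keep nothing
    return {key: value for key, value in record.items() if key in allowed}
```

An allowlist fails safe: a newly added field is excluded until someone argues for it, whereas a blocklist would pass it through by default.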

Where to apply the four moves¶

The discipline applies at multiple surfaces. Each surface needs its own policy.

Prompts to the model¶

The most underappreciated PII surface. What you put into the prompt enters the model provider's infrastructure, may be cached at the provider, and may be sampled for the provider's training (unless contractually excluded).

Discipline:

Schema-driven: structured PII fields go into the prompt only if the operation requires them. The model gateway can enforce minimisation by stripping unnecessary fields.
Pattern-driven: free-text user messages may contain PII. Decide per workload whether to redact before the model call or pass the text through. Each has trade-offs (redaction may break the user's intent; passthrough exposes PII to the provider).
For regulated data: usually pass tokens or hashes rather than raw values; reverse-lookup happens only on the platform side after the model returns.
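The token round trip for prompts might look like the sketch below, shown for email addresses only. The placeholder format `<EMAIL_n>` and both function names are assumptions; the mapping stays on the platform side and is never sent to the provider.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def tokenise_prompt(text):
    """Swap each distinct email for a placeholder; return text and mapping."""
    mapping = {}  # placeholder -> original value, kept platform-side

    def swap(match):
        value = match.group()
        for token, seen in mapping.items():
            if seen == value:
                return token  # reuse the placeholder for a repeated value
        token = f"<EMAIL_{len(mapping) + 1}>"
        mapping[token] = value
        return token

    return EMAIL.sub(swap, text), mapping


def restore(text, mapping):
    """Resolve placeholders in the model's output before user display."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text
```

The model sees stable placeholders, so it can still reason about "the first address" versus "the second", while the raw values never leave the platform.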

Audit logs¶

The chapter-opening case. Audit logs capture purpose, scope, actor, and outcome. They should not capture raw PII in fields that are not the purpose of the audit.

Discipline:

Schema-driven: known PII fields are redacted in audit. The audit records the field's presence and its type, not its value.
Pattern-driven: a final-pass redactor over the audit payload before write.
For investigative needs: a high-tier "sample store" with stricter access controls holds the full content for a small fraction of calls; the per-call audit metadata points to the sample if it exists.

Redaction-at-write is the only correct policy. Redaction-at-read leaves the raw data in storage where it can leak.
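Recording "presence and type, not value" can be sketched as a transformation applied just before the audit write. The `audit_view` name and the shape of the `pii_fields` mapping (field name to PII type) are assumptions for this example.

```python
def audit_view(payload, pii_fields):
    """Return the payload as it should be written to the audit log."""
    view = {}
    for field, value in payload.items():
        if field in pii_fields:
            # Record that the field was there and what kind it was, never the value.
            view[field] = {"present": value is not None, "type": pii_fields[field]}
        else:
            view[field] = value
    return view
```

This covers only the schema-driven step; free-text fields in the result still need the final pattern-driven pass before the record is stored.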

Application logs¶

Application logs (debug logs, error logs, request logs) often capture more than the audit. They are also widely accessed. Same redaction discipline applies; the bar is sometimes higher because more engineers read application logs.
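For Python services, one place to apply this is a filter in the standard `logging` module, so that every record is scrubbed before a handler writes it. The sketch below handles email addresses only; the `RedactingFilter` name and the logger name `app` are assumptions.

```python
import logging
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactingFilter(logging.Filter):
    """Scrub email addresses from each record before it is emitted."""

    def filter(self, record):
        # Render the message first so PII passed via args is scrubbed too.
        message = record.getMessage()
        record.msg = EMAIL.sub("[REDACTED:email]", message)
        record.args = ()  # already merged into msg; avoid formatting twice
        return True  # keep the record, now redacted


# Attach to the handler so records from every logger that reaches it are covered.
handler = logging.StreamHandler()
handler.addFilter(RedactingFilter())
logging.getLogger("app").addHandler(handler)
```

Exception tracebacks are rendered later by the formatter, so this sketch does not scrub them; covering that path would need a custom formatter as well.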

Response surfaces¶

The agent's response back to the user is, ironically, the surface where some PII is intended to appear (the user's own data flowing back). But the response can also leak PII unintentionally — surfacing an internal field, echoing back content from a different record, or including PII in an error message.

Discipline:

Internal field names (addr1, lead_source) never appear in user-facing responses; the translator at the application boundary strips them.
Error messages from internal systems are translated to user-friendly text before display; the original error goes only to logs.
For multi-tenant or multi-user surfaces, output is verified against the active scope (chapter 04) — content from outside the scope should not be in the response.
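The boundary translator for the first two points could be sketched like this. The `PUBLIC_NAMES` mapping, the field names other than `addr1` and `lead_source`, and the wording of the user-facing message are all hypothetical.

```python
# Hypothetical mapping from internal field names to public ones.
# Fields without an entry (for example lead_source) never leave the boundary.
PUBLIC_NAMES = {"full_name": "name", "addr1": "address_line_1"}


def to_public(record):
    """Rename allowed fields and drop everything else."""
    return {PUBLIC_NAMES[key]: value for key, value in record.items() if key in PUBLIC_NAMES}


def to_public_error(exc, log):
    """Send the original error to the log sink; return generic text for the user."""
    log(repr(exc))  # the log sink applies its own redaction
    return "Something went wrong. Please try again."
```

Scope verification (the third point) is not shown here, since it depends on how the active scope is represented in chapter 04.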

Traces¶

Distributed traces capture span attributes that often include arguments, including PII. The tracing library's redaction policy is part of this discipline. Most modern tracing libraries support attribute redaction per pattern or per schema.
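Where the hook lives depends on the tracing library, so the sketch below is deliberately library-agnostic: it shows only the attribute scrubbing that such a hook would call. The `SENSITIVE_KEYS` list and the `scrub_attributes` name are assumptions.

```python
# Hypothetical key fragments that mark an attribute as personal data.
SENSITIVE_KEYS = ("email", "phone", "aadhaar", "card")


def scrub_attributes(attributes):
    """Return span attributes with values of sensitive-looking keys replaced."""
    clean = {}
    for key, value in attributes.items():
        if any(part in key.lower() for part in SENSITIVE_KEYS):
            clean[key] = "[REDACTED]"
        else:
            clean[key] = value
    return clean
```

Key-based scrubbing misses PII hidden in innocuously named attributes (a raw query string, for instance), so string values are still worth passing through a pattern scan.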

Eval and sample stores¶

Eval sets are small but often capture full prompt-response pairs. These need synthetic substitutes for production PII:

Real PII in eval sets is a leak waiting to happen.
Use synthetic data generators (faker-like) for personal fields when constructing eval cases.
For drift detection on real production traffic, use a separate sample store with strict access controls. It holds either redacted content or, where full capture is justified, unredacted content behind tighter access limits.
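A faker-like generator does not need an external dependency for simple cases. The sketch below uses only the standard library; the name list and the `synthetic_person` function are invented for illustration, and the addresses use `example.com`, a domain reserved for documentation, so they cannot belong to a real customer.

```python
import random

# Hypothetical name pool; extend as needed for variety.
FIRST_NAMES = ["asha", "ravi", "meera", "kabir"]


def synthetic_person(rng):
    """Build a fake person record from a seeded random.Random instance."""
    name = rng.choice(FIRST_NAMES)
    return {
        "name": name.title(),
        # example.com is reserved, so this can never be a deliverable address.
        "email": f"{name}{rng.randrange(1000)}@example.com",
    }
```

Passing a seeded `random.Random` keeps eval cases reproducible across runs, which matters when a regression has to be traced back to a specific case.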

The technical implementation¶

A reasonable redaction service:

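from hashlib import sha256


class Redactor:
    def __init__(self, schema_labels, pattern_library, salt):
        # schema_labels.fields_in(data) yields (field, tier) pairs (chapter 02 labels).
        # pattern_library.scrub(text) returns text with PII shapes replaced by markers.
        self.schema = schema_labels
        self.patterns = pattern_library
        self.salt = salt  # platform secret string; keep it out of source control

    def redact(self, data, context):
        data = dict(data)  # work on a copy so the caller's payload is not mutated

        # 1. Hash where correlation is needed. This runs first so the hash is
        #    computed from the raw value, not from a redaction marker.
        hashed = set(context.get("hash_fields", []))
        for field in hashed:
            if field in data:  # skip absent fields rather than hashing ""
                data[f"{field}_hash"] = self._hash(str(data[field]))
                data[field] = f"[REDACTED:{field}]"

        # 2. Schema-driven: redact known PII fields per their tier
        for field, tier in self.schema.fields_in(data):
            if field not in hashed and tier in ("sensitive", "regulated"):
                data[field] = self._redact_field(data[field], field, tier)

        # 3. Pattern-driven: scan free text for PII shapes
        for key in self._free_text_paths(data):
            data[key] = self.patterns.scrub(data[key])

        return data

    def _redact_field(self, value, field, tier):
        # Both tiers currently get the same marker; tier is kept as a hook
        # for stricter handling of regulated fields.
        return f"[REDACTED:{field}]"

    def _free_text_paths(self, data):
        # Top-level string values only; nested payloads need a recursive walk.
        return [key for key, value in data.items() if isinstance(value, str)]

    def _hash(self, value):
        # Salted SHA-256, truncated to 16 hex characters for compact audit records.
        return sha256((value + self.salt).encode()).hexdigest()[:16]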

The redactor is called on every write to logs, audit, and any output destined for storage. It is not optional and not opt-in; the platform's logging and audit pipelines call it by default.

Common mistakes¶

Redacting at read time. Storing the raw value and redacting on display. The first storage leak exposes everything.

Trusting the schema alone. Free-text fields can contain PII regardless of their schema label. The pattern-driven layer catches what the schema misses.

Skipping the prompt surface. PII in prompts goes to the provider. Audit and log redaction without prompt redaction leaves the most-shared surface unprotected.

Reversing a hash to "see what it was." If you can reverse it, an attacker can. Salt; use a one-way function; do not retain the reverse.

Eval sets with real PII. A common shortcut. The eval set is in source control, in test environments, in CI logs — anywhere it lands is a leak. Synthetic data is required.

Not testing the redactor. A redaction library that is not tested produces silent leaks. Pact-test it (module 19 chapter 10) with known PII patterns; assert the output is redacted.

How PII interacts with the other surfaces¶

Classification (chapter 02) — the tier drives the redaction policy.
Purpose (chapter 03) — some purposes need raw PII (verifying identity); others do not (counting cohort sizes).
Scope (chapter 04) — narrower scopes naturally limit PII exposure.
Audit (chapter 07) — the audit's per-call record applies the redaction discipline.
Retention (chapter 06) — even redacted metadata has a retention period; the original has a stricter one.
Incident response (chapter 11) — PII exposure is the high-severity incident class.

Interview Q&A¶

Q1. Why redact at write time, not at read time? Because read-time redaction leaves the raw value in storage. Any engineer with read access, any backup of the storage, any leak of the storage, exposes the raw value. Write-time redaction means the storage never sees the value; there is nothing to leak if access is compromised. Read-time redaction is a presentation policy, not a security policy. Wrong-answer notes: "performance" arguments for read-time redaction underweight the security cost.

Q2. The audit contains raw email addresses despite the redaction policy. Where did they get in? Several common paths. One: the application's request payload includes the user's free-text query, which can contain an email; the schema-driven redactor only knows about the structured email field. Two: the application logs an exception message that includes the email in the message body. Three: tracing spans include arguments with PII. The fix is a final-pass redactor over the entire audit payload (pattern-driven) plus a tracing-attribute redactor for spans. Wrong-answer notes: "we redact email fields" without the free-text pass misses the chapter-opening leak.

Q3. The team wants to put PII in the prompt to make the model's responses more personalised. How do you respond? The decision depends on the trust posture with the model provider and the regulatory regime. Three patterns. One: pass tokens/hashes instead of raw values; the model produces tokenised output; the platform resolves tokens to values before user display. Two: pass minimum-necessary PII; for example, first name only, not full name + email + phone. Three: get a contractual zero-retention agreement from the provider and pass selectively. Skipping the question entirely is the failure mode. Wrong-answer notes: "just pass it" without considering the provider's training/retention policy is the unconsidered leak.

Q4. You see an eval set in source control with real customer emails. What is the remediation? First: treat it as at least a near-miss and route it through incident response (chapter 11), including any decision on notifying the affected customers. Second: scrub the eval set by generating synthetic substitutes for the PII and verifying no real values remain; a new commit alone leaves the old values in repository history, so the history needs cleaning too. Third: enforce by CI: a check that scans the eval directory for PII patterns and fails the build if found. Fourth: pact-test the redactor on a known-PII input and assert the output is clean. The eval set should never contain real PII; the discipline is automatic substitution at the source, not vigilance after the fact. Wrong-answer notes: "we'll just be careful" produces the recurrence.
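The CI check from Q4 can be sketched as a small scanner. This version looks for email addresses only and treats the reserved `example.*` domains as synthetic; the function name, the allowlist, and the directory layout are assumptions.

```python
import re
from pathlib import Path

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Domains reserved for documentation; addresses here are treated as synthetic.
ALLOWED_DOMAINS = {"example.com", "example.org", "example.net"}


def find_real_emails(root):
    """Return (path, line_number) for each email outside the allowed domains."""
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in EMAIL.finditer(line):
                domain = match.group().rsplit("@", 1)[1].lower()
                if domain not in ALLOWED_DOMAINS:
                    # Report the location only; never echo the value into CI logs.
                    findings.append((str(path), lineno))
    return findings
```

A CI job would call `find_real_emails` on the eval directory and fail the build when the returned list is non-empty. Reporting locations rather than values keeps the check from becoming a leak surface itself.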

What to do differently after reading this¶

Apply the four moves (detect, redact, hash, minimise) at every surface where PII can land.
Redact at write time, always.
Pattern-scan free-text fields and tracing spans for PII; do not rely on schema alone.
Synthetic-only PII in eval sets; CI check enforces.
Test the redactor with known PII patterns; pact-test on every change.

Bridge. PII handling addresses what content lives in storage. The next discipline is how long it lives. Regulators, contracts, and customers impose limits; the discipline is retention with lawful deletion on schedule. The next chapter is retention and jurisdiction. → 06-retention-and-jurisdiction.md