05. Filesystem and network policy¶

~8 min read. The resource layer bounds how much a tool consumes. The policy layer bounds where it reaches — filesystem paths, network endpoints, system calls. Most production tool incidents land here: the tool reaches further than its purpose required.

Continues from 04-resource-limits.md. This chapter develops the policy layer. Recurring concepts: filesystem policy, network policy, syscall policy, allowlist, denylist, policy envelope, least privilege.

The policy envelope is the per-tool description of what the tool may read, write, and call. Everything outside the envelope is denied by default.

The three policy dimensions¶

Dimension	What it covers	Enforcement
Filesystem	What paths the tool can read, write, execute	chroot, bind mounts, overlays, OCI mounts
Network	Which endpoints the tool can reach	iptables, network policy, egress proxy
Syscalls	Which kernel syscalls the tool can make	seccomp, AppArmor, SELinux

Together they form the policy envelope. Default-deny is the discipline: allowlists describe what is permitted, and explicit denylists catch known-dangerous targets that an overly broad allowlist entry would otherwise admit.
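The interplay of default-deny, allowlist, and denylist can be sketched as data. This is a minimal illustration, not a real platform API: `Verdict`, `Dimension`, and `PolicyEnvelope` are hypothetical names, and the shell-style patterns are used only for brevity (filesystem paths would need canonicalising before any such match).

```python
from dataclasses import dataclass, field
from enum import Enum
from fnmatch import fnmatchcase


class Verdict(Enum):
    ALLOW = "allow"
    DENY_EXPLICIT = "deny-explicit"  # matched the denylist
    DENY_DEFAULT = "deny-default"    # matched nothing at all


@dataclass(frozen=True)
class Dimension:
    """One dimension of a hypothetical envelope: filesystem, network, or syscalls."""
    allow: tuple = ()
    deny: tuple = ()

    def check(self, item: str) -> Verdict:
        # The denylist is consulted first, so it wins over a broad allow pattern.
        if any(fnmatchcase(item, pattern) for pattern in self.deny):
            return Verdict.DENY_EXPLICIT
        if any(fnmatchcase(item, pattern) for pattern in self.allow):
            return Verdict.ALLOW
        # Default deny: no match means no access.
        return Verdict.DENY_DEFAULT


@dataclass(frozen=True)
class PolicyEnvelope:
    tool: str
    filesystem: Dimension = field(default_factory=Dimension)
    network: Dimension = field(default_factory=Dimension)
    syscalls: Dimension = field(default_factory=Dimension)
```

An envelope constructed with no arguments beyond the tool name denies everything, which is the intended starting state.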

Filesystem policy¶

Filesystem policy gaps are the most common source of tool incidents; the Chennai data deletion was a filesystem policy failure. The pattern:

Per-call workspace. Each call gets a fresh, isolated directory: /workspace/<tenant>/<call-id>. Read-write. Deleted on call completion.
Read-only mounts for inputs. Reference data the tool needs (libraries, models, datasets) is mounted read-only.
Explicit allowlist. Any additional paths must be explicitly declared in the tool's manifest.
Default deny everything else. The host filesystem, other tenants' data, system files — all denied.

The mechanism varies: chroot for simple cases, bind mounts in containers, OCI mount configurations in microVMs. The principle is the same.

A common error: mounting /data or /mnt at the tool level "for convenience." The convenience is paid in incidents.
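The lifecycle of a per-call workspace can be sketched as a context manager. This is a sketch under stated assumptions: `per_call_workspace` is a hypothetical helper, and it manages only the directory's creation and deletion. The isolation itself still has to come from the mount mechanism (chroot, bind mount, or OCI mount) that exposes this directory and nothing else.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def per_call_workspace(root: Path, tenant: str, call_id: str):
    """Create <root>/<tenant>/<call-id>, yield it, and delete it afterwards."""
    for part in (tenant, call_id):
        # Reject identifiers that could themselves escape the root.
        if not part or part in (".", "..") or "/" in part or "\\" in part:
            raise ValueError(f"unsafe path component: {part!r}")
    workspace = root / tenant / call_id
    # exist_ok=False: a reused call-id is an error, never a shared directory.
    workspace.mkdir(parents=True, exist_ok=False, mode=0o700)
    try:
        yield workspace
    finally:
        # Runs on success and on failure, so no state outlives the call.
        shutil.rmtree(workspace, ignore_errors=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        with per_call_workspace(Path(root), "tenant-a", "call-001") as ws:
            (ws / "out.csv").write_text("a,b\n1,2\n")
        print(ws.exists())  # False: the workspace is gone once the call ends
```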

Network policy¶

Network access is the second-largest blast radius. A tool with broad network access can exfiltrate data, reach internal services it should not, or be used as a pivot point.

The pattern:

Default-deny outbound. The tool can reach nothing by default.
Explicit endpoint allowlist. The tool's manifest declares the endpoints (host:port or hostname pattern) it needs to reach.
Egress proxy. All outbound traffic goes through a proxy that enforces the allowlist; direct connections are blocked at the network layer.
No instance metadata service. The cloud provider's metadata endpoint (169.254.169.254) is blocked, because an unblocked endpoint is a common credential exfiltration vector.
DNS restrictions. DNS resolution is restricted to the allowlist; arbitrary DNS exfiltration is blocked.

The egress proxy is the load-bearing piece. It logs every call attempt — allowed or denied — for audit and forensics.
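The decision core of such a proxy, including the log-everything behaviour, might look like the following. It is a minimal sketch: `EgressPolicy` and `EgressRecord` are hypothetical names, the allowlist is keyed by exact (host, port) pairs for simplicity, and a real proxy would also handle connection setup, TLS, and DNS.

```python
import json
import time
from dataclasses import asdict, dataclass

METADATA_ADDRESSES = frozenset({"169.254.169.254"})


@dataclass(frozen=True)
class EgressRecord:
    ts: float
    tool: str
    host: str
    port: int
    allowed: bool
    reason: str


class EgressPolicy:
    """Hypothetical decision core of an egress proxy; not a real proxy API."""

    def __init__(self, allowlist):
        self._allowlist = allowlist  # tool name -> {(host, port), ...}
        self.audit = []

    def decide(self, tool: str, host: str, port: int) -> bool:
        host = host.lower().rstrip(".")  # normalise case and trailing dot
        if host in METADATA_ADDRESSES:
            allowed, reason = False, "metadata-service"
        elif (host, port) in self._allowlist.get(tool, set()):
            allowed, reason = True, "allowlist"
        else:
            allowed, reason = False, "default-deny"
        # Record every attempt, allowed or denied, for audit and forensics.
        self.audit.append(EgressRecord(time.time(), tool, host, port, allowed, reason))
        return allowed

    def audit_lines(self):
        """Serialise the audit trail as one JSON object per attempt."""
        return [json.dumps(asdict(record), sort_keys=True) for record in self.audit]
```

Recording the reason alongside the verdict keeps "denied by default" distinguishable from "denied because it targeted the metadata service" when the logs are reviewed.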

Syscall policy¶

Syscall policy is the lowest layer. Even if a tool has an unexpected escape route, restricting syscalls limits what it can do.

The pattern:

seccomp profile. A curated allowlist of syscalls the tool may make. Anything else is denied, typically by failing the call with an error or by killing the process.
AppArmor or SELinux profile. Mandatory access control rules on top of seccomp.
Capability dropping. Linux capabilities (CAP_NET_RAW, CAP_SYS_ADMIN, etc.) are dropped to the minimum the tool needs.

A useful starting point: the runc/Docker default seccomp profile, then tighten per tool. Most tools need none of mount, umount, pivot_root, kexec, init_module, delete_module; deny them all unless a specific tool demonstrably needs one.
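Tightening a base profile can be sketched as set arithmetic that emits a Docker-style seccomp profile (a `defaultAction` plus a list of allowed syscall names). The helper `tighten` and the `ALWAYS_DENY` set are hypothetical; the kernel spellings `umount2`, `kexec_load`, `kexec_file_load`, and `finit_module` are added here as an assumption, because the short names used in the text do not all match kernel syscall names on every architecture.

```python
from typing import Iterable

# Syscalls the chapter lists as rarely needed, plus their kernel spellings.
# Verify the names against the target kernel and architecture before use.
ALWAYS_DENY = frozenset({
    "mount", "umount", "umount2", "pivot_root",
    "kexec", "kexec_load", "kexec_file_load",
    "init_module", "finit_module", "delete_module",
})


def tighten(base_allowed: Iterable[str], opt_ins: Iterable[str] = ()) -> dict:
    """Build a profile: base allowlist minus ALWAYS_DENY, plus per-tool opt-ins."""
    allowed = (set(base_allowed) - ALWAYS_DENY) | set(opt_ins)
    return {
        # Anything not listed below fails with an error instead of running.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [{"names": sorted(allowed), "action": "SCMP_ACT_ALLOW"}],
    }
```

The returned dictionary can be serialised with `json.dumps` and supplied to the container runtime as a custom profile. Opt-ins are applied after the subtraction so that an exception for one tool is explicit and reviewable.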

A worked example — the data analysis tool¶

The Hyderabad fintech's Python execution tool has the following policy envelope:

Filesystem.
- /workspace/<tenant>/<call-id>: read-write, per-call.
- /opt/python: read-only (Python runtime and libraries).
- /opt/data/references: read-only (allowed reference datasets).
- Everything else: denied.

Network.
- analytics-db.internal:5432: allowed.
- s3.internal:443/datasets/<tenant>: allowed (with credential layer enforcing tenant scope).
- Everything else: denied, including 169.254.169.254.

Syscalls.
- Curated allowlist: standard userland syscalls; no mount, umount, pivot_root, kexec, init_module, delete_module, ptrace, keyctl, bpf.
- AppArmor profile: no execution of binaries outside /opt/python and /workspace.

A shutil.rmtree('/data/workspace/temp') call now hits the filesystem policy: /data is outside the envelope, so the path is denied. The tool returns an error that the agent sees; the model can be re-prompted, or the call retried with a corrected path. The incident from chapter 01 is structurally prevented.
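The filesystem part of this envelope can be expressed as a small default-deny check. This is an illustrative sketch only: the tenant and call-id values are placeholders, `access_allowed` is a hypothetical function, and it assumes the path has already been canonicalised. In production the enforcement is done by the mount configuration, not by application code.

```python
from pathlib import PurePosixPath

# Filesystem part of the worked example; "acme" and "call-42" are
# placeholder values for <tenant> and <call-id>.
MOUNTS = {
    PurePosixPath("/workspace/acme/call-42"): "rw",
    PurePosixPath("/opt/python"): "ro",
    PurePosixPath("/opt/data/references"): "ro",
}


def access_allowed(path: str, write: bool) -> bool:
    """Default-deny check of an already-canonical absolute path."""
    target = PurePosixPath(path)
    for mount, mode in MOUNTS.items():
        if target == mount or mount in target.parents:
            # Reads are fine on any declared mount; writes need "rw".
            return mode == "rw" or not write
    return False  # not under any declared mount


print(access_allowed("/data/workspace/temp", write=True))  # False
```

Matching on path components rather than string prefixes means a sibling such as /opt/pythonX does not inherit the /opt/python entry.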

Path traversal defence¶

A common attack pattern: the tool's input contains a path like ../../../etc/passwd. The policy must resolve paths to canonical form before checking the allowlist.

The pattern:

Resolve symlinks before checking. A symlink in the workspace that points outside is followed and then checked; if outside the allowed paths, denied.
Canonicalise paths. ../ and ./ are resolved before checking.
Reject ambiguous paths. Paths with unusual characters or encoding tricks (%2e%2e) are rejected before resolution.

Most language and OS interfaces provide canonicalisation; the discipline is to use it consistently.
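The three steps above can be sketched with the standard library's `os.path.realpath`, which collapses ./ and ../ and follows symlinks. `resolve_inside` is a hypothetical helper, and rejecting every percent-encoded sequence is a deliberately strict assumption that would also refuse some legitimate filenames. The check does not remove the race between checking and opening a path; opening the returned canonical path, inside a mount that exposes only the workspace, narrows it.

```python
import os
import re
import tempfile
from pathlib import Path

_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")


def resolve_inside(workspace: Path, user_path: str) -> Path:
    """Return the canonical path for user_path, or raise if it leaves workspace."""
    # 1. Reject ambiguous input before any resolution is attempted.
    if "\x00" in user_path or _ENCODED.search(user_path):
        raise PermissionError(f"ambiguous path rejected: {user_path!r}")
    root = Path(os.path.realpath(workspace))
    # 2. Canonicalise: realpath collapses ./ and ../ and follows symlinks.
    candidate = Path(os.path.realpath(root / user_path))
    # 3. Only now compare the result against the allowed root.
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes workspace: {user_path!r}")
    return candidate
```

An absolute input such as /etc/passwd replaces the workspace prefix when joined, so it is caught by the same containment check as a relative traversal.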

Network policy escapes¶

Common ways to bypass network policy:

DNS-based exfiltration. The tool encodes data in DNS queries; DNS resolution is allowed even if direct outbound is blocked.
HTTP proxy abuse. The egress proxy is itself a routable target; the tool sends crafted requests through it.
IPv6 fall-through. Policy covers IPv4 but not IPv6; the tool connects via IPv6.
Time-of-check vs. time-of-use. A hostname resolves to one IP at check time and another at connect time.

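Defences, in the same order: restrict DNS resolution to allowlisted hostnames; make the egress proxy the only outbound path and have it check every request against the allowlist, including requests addressed to the proxy itself; deny IPv6 or restrict it symmetrically with IPv4; resolve each hostname once at the proxy and connect to that pinned address.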
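Pinning resolution at the proxy can be sketched as "resolve once, validate every answer, connect to the validated IP". The function names are hypothetical, and denying IPv6 outright is an assumption made for this sketch; a platform that permits IPv6 would apply the same checks symmetrically instead.

```python
import ipaddress
import socket
from typing import Callable, List, Set


def system_resolver(host: str) -> List[str]:
    """Resolve host to IPv4 and IPv6 address strings using the OS resolver."""
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def pin_address(host: str, allowed_hosts: Set[str],
                resolve: Callable[[str], List[str]] = system_resolver) -> str:
    """Resolve an allowlisted host once and return the single IP to connect to."""
    if host not in allowed_hosts:
        raise PermissionError(f"host not on allowlist: {host}")
    addresses = [ipaddress.ip_address(raw) for raw in resolve(host)]
    for ip in addresses:
        # Covers 127.0.0.0/8, ::1, 169.254.0.0/16 (metadata service), fe80::/10.
        if ip.is_loopback or ip.is_link_local or ip.is_unspecified:
            raise PermissionError(f"{host} resolved to forbidden address {ip}")
    # Assumption: IPv6 egress is denied on this platform, so only IPv4 is used.
    ipv4 = [ip for ip in addresses if ip.version == 4]
    if not ipv4:
        raise PermissionError(f"no permitted address for {host}")
    # The caller connects to this IP, never to the hostname, so a second
    # lookup cannot return a different answer between check and use.
    return str(ipv4[0])
```

The resolver is injectable so the decision logic can be exercised without real DNS.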

Operational signals¶

Healthy. Every tool has an explicit policy envelope. Policy denials are logged and reviewed. New tools require policy review as part of production-readiness.

First degrading metric. Policy denial rate climbing. The policy is too tight, the tool's behaviour has changed, or an adversarial pattern is emerging; only investigation tells these apart.

Misleading metric. Number of policy rules. Many rules can be redundant or stale; the metrics to watch are policy coverage (every tool has an explicit policy) and the denial rate trend.

Expert graph. Per-tool policy denial rate, syscall profile coverage, network egress proxy log volume and denial rate.
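The per-tool denial rate and its trend can be computed from policy decision logs with a few lines. This is a sketch under stated assumptions: events are taken to be (tool, allowed) pairs, and the 0.05 threshold is an arbitrary illustrative value, not a recommendation.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def denial_rates(events: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Map tool name -> fraction of policy decisions that were denials."""
    total = Counter()
    denied = Counter()
    for tool, allowed in events:
        total[tool] += 1
        if not allowed:
            denied[tool] += 1
    return {tool: denied[tool] / total[tool] for tool in total}


def climbing(previous: Dict[str, float], current: Dict[str, float],
             threshold: float = 0.05) -> List[str]:
    """Tools whose denial rate rose by more than threshold between two windows."""
    return sorted(
        tool for tool, rate in current.items()
        if rate - previous.get(tool, 0.0) > threshold
    )
```

Comparing two windows per tool, rather than one platform-wide number, keeps a single noisy tool from hiding a change in another.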

Boundary of applicability¶

Strong fit. Tools with explicit, narrow needs for filesystem and network access. The full policy envelope is justified.

Pathology. Tools whose needs are too broad for a tight policy (e.g., a general-purpose "search the internet" tool). These warrant other defences (heavier isolation, more aggressive monitoring, narrower credentials) rather than a wider policy.

Scale limit. Very large platforms have many tools; the policy management becomes a system in itself. Pattern: shared policy templates with per-tool deviations; central audit.
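The template-plus-deviations pattern might be sketched as a merge that records every deviation for central audit. All keys and values in `TEMPLATE` are illustrative, and `apply_deviations` is a hypothetical helper rather than part of any real policy tool.

```python
from copy import deepcopy

# Hypothetical shared template; every key and value is illustrative.
TEMPLATE = {
    "filesystem": {"/workspace/<tenant>/<call-id>": "rw"},
    "network": [],  # default-deny outbound
    "syscalls_denied": ["mount", "pivot_root", "ptrace"],
}


def apply_deviations(template: dict, deviations: dict):
    """Return (policy, audit notes); each deviation from the template is noted."""
    policy = deepcopy(template)  # never mutate the shared template
    notes = []
    for path, mode in deviations.get("filesystem", {}).items():
        policy["filesystem"][path] = mode
        notes.append(f"filesystem: added {path} ({mode})")
    for endpoint in deviations.get("network", []):
        policy["network"].append(endpoint)
        notes.append(f"network: added {endpoint}")
    for name in deviations.get("syscalls_allowed", []):
        if name in policy["syscalls_denied"]:
            policy["syscalls_denied"].remove(name)
            notes.append(f"syscalls: re-allowed {name}")
    return policy, notes
```

The notes list is what a reviewer reads: a tool with no deviations produces an empty list, and every widening of the template shows up as one line.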

Failure-prone assumption¶

The seductive wrong belief: the model will only call paths and endpoints it knows about. Adversarial prompts can produce unexpected paths or endpoints; hallucinations produce plausible-looking ones; misconfigurations produce incorrect ones. The correct belief: policy is the structural defence against any path or endpoint outside the envelope, regardless of cause.

Where this appears in production¶

A data SaaS had a tool with /data mounted broadly; an incident wiped customer data; rebuilt with per-call workspace.
A coding assistant uses chroot per call; tools cannot reach outside their workspace.
A telecom AI has an egress proxy logging every tool network call; security team monitors for anomalies.
A fintech blocks 169.254.169.254 at the firewall; credential exfiltration via metadata service is structurally impossible.
A healthcare AI has filesystem policy reviewed quarterly by security; coverage is verified.
A retail AI has DNS restricted to allowlist; arbitrary DNS exfiltration is blocked.
A consumer chatbot had a tool that ran under a wide-open policy; tightening it broke nothing and bounded the blast radius.
A travel platform has per-tool seccomp profiles generated from a template plus tool-specific opt-ins.
A legal AI has AppArmor profiles for tools that access privileged data; mandatory access control augments seccomp.
A government AI has policy audited by external security; bypass paths are tested.
A logistics AI has the egress proxy as the only allowed outbound; direct outbound is blocked at the network layer.
A search-ops AI had a tool whose syscall profile included mount; tightened to deny; no functionality lost.
A B2B SaaS publishes a policy template; tool authors fill in deviations; the platform reviews.
A media AI caught path traversal attempts in policy denial logs; alert wired for repeated denials.
A document AI has tools with narrow read-only mounts; write access is per-call workspace only.
A staffing AI denies IPv6 outbound symmetrically with IPv4; the fall-through vector is closed.
An ad-tech AI has the egress proxy enforce hostname allowlist with DNS pinning at the proxy.
A real-estate AI has policy enforced at multiple layers (network, seccomp, AppArmor); defence-in-depth.
A medical AI has policy as a compliance artefact; regulators audit it.
A small SaaS has only filesystem policy; network policy is deferred; the deferral becomes the next incident.

Recall / checkpoint¶

Name the three policy dimensions.
What is the "per-call workspace" pattern?
Why is blocking the instance metadata service load-bearing?
How does path traversal defence work?
What are common network policy escapes, and how are they defended?
What is the typical syscall policy starting point?
Why is "the model only calls expected paths" a failure-prone assumption?

Interview Q&A¶

Q1. The team has filesystem policy via container mounts but no network policy. Walk through the gap. The tool can reach any network endpoint, including the cloud provider's metadata service (credential exfiltration), other internal services it should not (lateral movement), or arbitrary external endpoints (data exfiltration). The fix is to add network policy: default-deny outbound, explicit endpoint allowlist, egress proxy for enforcement, instance metadata service blocked, DNS restricted. Without it, even with tight filesystem policy, the tool has a wide blast radius. Common wrong answer to avoid: "filesystem is the main thing"; network is comparably load-bearing.

Q2. Walk through the per-call workspace pattern. Each tool call gets a fresh isolated directory: /workspace/<tenant>/<call-id>. Read-write within the directory; everything outside denied. On call completion, the directory is deleted; no state persists. The pattern bounds the tool's filesystem reach to a known location and prevents cross-call leakage. Implementation: bind mount per call in containers, per-VM filesystem in microVMs, virtual filesystem in language sandboxes. Common wrong answer to avoid: "tools share a workspace" — produces cross-tenant leak and cross-call interference.

Q3. Why block the instance metadata service? The metadata service (169.254.169.254 on AWS, GCP, and Azure) returns temporary credentials for the identity attached to the instance, along with the instance ID, region, and other metadata. A compromised tool can fetch these and act with the host's cloud identity. Defence: block 169.254.169.254 at the firewall or proxy, plus (on AWS) require IMDSv2 with session tokens, plus run tools with their own scoped credentials (chapter 06). Multiple layers because the consequences of a metadata-service exfiltration are severe. Common wrong answer to avoid: "tools shouldn't access metadata"; "should not" is not enforcement, blocking is.

Q4. Walk through path traversal defence. The tool's input contains a path. Before checking the allowlist, the path is canonicalised: resolve ../ and ./ traversals, follow symlinks to their target, reject paths with unusual encoding. After canonicalisation, the path is checked against the allowlist. The discipline is to always canonicalise; many CVEs come from checks performed on uncanonicalised paths. Common wrong answer to avoid: "we'll scan for .." — naive scanning misses encoded traversals.

Q5. The egress proxy is the only allowed outbound path. Why does this matter? Direct outbound bypasses policy. If a tool can open arbitrary sockets, it can reach any IP regardless of hostname-based allowlists. The egress proxy is the chokepoint: the tool's TCP stack only sees the proxy (via routing or namespace configuration); the proxy enforces the allowlist (resolving hostnames, denying non-allowlist endpoints). Without the proxy, hostname-based policy is defeated by direct-IP connections. Common wrong answer to avoid: "hostname allowlist is enough" — only if direct outbound is structurally blocked.

Q6. How tight should syscall policy be? As tight as the tool's actual needs. Either start from a deny-everything baseline and add syscalls until the tool works, or start from the runtime default and remove what the tool does not need. Syscalls most tools do not need: mount, umount, pivot_root, kexec, init_module, delete_module, ptrace, keyctl, bpf, setns. Denying these costs nothing for most tools and closes known escape vectors. The pattern is tool-specific; some tools legitimately need ptrace (debuggers); most do not. Common wrong answer to avoid: "use the runtime default"; defaults are conservative but still broad, and per-tool tightening pays off.

Design / debug exercise (10 minutes)¶

Modelled example. Walk through the worked example (the Hyderabad data analysis tool's policy envelope). Verify each policy dimension has explicit allowlist; identify any path or endpoint that could be tighter.

Your turn. Pick one tool. Write its policy envelope: filesystem paths, network endpoints, syscall allowlist. Identify any policy that is broader than the tool's actual needs.

Reproduce from memory. Write the policy envelope pattern (per-call workspace, network default-deny, syscall starting point) from memory. The signal of internalisation is that you can design an envelope for a hypothetical tool quickly.

Operational memory¶

This chapter explained the policy layer: filesystem, network, and syscall policies that compose into a per-tool envelope. The important idea is that policy is least-privilege by default — everything outside the explicit allowlist is denied — and policy is the structural defence against any path or endpoint outside the tool's purpose.

You learned to design per-call workspaces, default-deny network with egress proxy, narrow syscall profiles, and path traversal defence. That solves the opening failure because the tool's reach is now structurally bounded.

Carry this diagnostic forward: when a tool's policy envelope is broader than its purpose requires, you have found the team's next sandbox tightening.

Remember:

Three dimensions: filesystem, network, syscalls.
Per-call workspace pattern; deny everywhere else.
Block the instance metadata service; restrict DNS.
Canonicalise paths before checking allowlist.
Egress proxy is the chokepoint that makes hostname allowlist enforceable.

Bridge. Policy bounds where a tool reaches. Credentials bound what the tool can do where it reaches. The next chapter is the credential layer — scoped tokens, broker patterns, rotation. → 06-secrets-and-credentials-in-tools.md