02. Contract anatomy: the six fields, and what each one prevents
The reframe is in place. A tool is a versioned API boundary called by a non-deterministic client. This chapter answers the immediate next question: what fields does that contract actually carry, and what does each field stop from going wrong?
A platform engineer at a Pune travel-tech company sets aside an afternoon to catalogue every tool the agent platform exposes. There are forty-three of them. Sixteen wrap internal services. Twenty-one wrap vendor APIs. Six wrap database queries directly. The engineer starts a comparison table: name, parameters, return shape, side effects, owner. After three days the table still cannot be finished. Eleven of the forty-three tools have no recorded owner. Nineteen have no schema beyond a Python signature. Twenty-eight have no documented side effect. None has a version. The engineer realises the inventory exercise is the first time anyone has tried to write down what these tools are. The reason is not laziness. The reason is that the platform never agreed on the minimum set of fields a tool contract must carry. People wrote whatever felt useful. Different teams wrote different things. The result is a catalogue that cannot be operated.
The remedy is unromantic. Six fields. Every tool contract carries them. Each field prevents a specific class of incident. The chapter walks them in the order you would write them if you sat down to draft a new tool today.
The six fields
Stamp this list. It is the minimum schema of a tool contract. Anything less is debt. Anything more is fine.
| # | Field | One-line definition | Incident class it prevents |
|---|---|---|---|
| 1 | Identity | Name, version, owner, purpose | Confusion about which tool this is, who owns it, and what version the call was built against |
| 2 | Class | Read / write-idempotent / write-non-idempotent / irreversible / human-gated | Treating a wire transfer like a read |
| 3 | Schema | Typed parameters, return shape, descriptions, examples | Model invents arguments the underlying system rejects |
| 4 | Side effects | What changes in the world, where, and how long the change persists | Surprise mutations, unexpected fan-out, undeclared writes |
| 5 | Error contract | Structured failure shapes the model can branch on | Model parrots a Python traceback to a customer |
| 6 | Operational metadata | Idempotency policy, scope binding, rate limits, audit fields, SLOs | Retries duplicate writes, god-keys leak, calls disappear from logs |
The rest of this module zooms into each field. This chapter teaches you to recognise all six in one pass, so the later chapters do not feel like detached topics.
Field 1: Identity
The identity block is the smallest amount of metadata that lets a contract be referenced, versioned, owned, and deprecated.
The minimum:
    name: issue_refund          # unique within the agent platform
    version: 2.1.0              # semver; the call sends this on every invocation
    owner: payments-platform    # team that owns the underlying system
    purpose: |
      Refund a previously captured payment.
      Use this when the customer has been charged and now must be made whole.
name is what the model sees in its tool list. It must be stable across versions — that is the whole point of having a version field separately. Renaming a tool is a contract break and goes through the deprecation process (chapter 08).
version is the contract version. It is semver because callers — the agent platform, and through it the model — need to reason about compatibility. A minor bump (2.1.0 → 2.2.0) means new optional fields or new optional behaviours; existing calls still work. A major bump (2.x → 3.0.0) means a breaking change and triggers a dual-run window. The call carries the version it was built against; the contract layer can refuse calls older than a documented floor.
owner is a team handle, not a person. The team owning the underlying system — not the agent platform — owns the contract. The agent platform owns the wiring; the platform team owning Salesforce or payments-svc owns the meaning. When something breaks at 03:00, this is who gets paged.
purpose is one paragraph, written for two readers at once: the model (which reads it as a tool description) and a human reviewer (who decides whether the tool is wired correctly). It must be unambiguous about when the tool is appropriate, not just what it does. Module 01 chapter 03 covered this — descriptions are how the model picks the right tool from a confusing list.
Anti-pattern. Identity is often the most under-specified field because it does not change runtime behaviour. The result is tools without owners, version numbers stuck at "1.0.0" forever, and purpose strings copy-pasted from internal docs. When a postmortem asks "who decided this tool would do that?", there is no defensible answer.
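The version rules described for the identity block (minor bumps stay compatible, major bumps break, calls below a documented floor are refused) can be sketched in a few lines. This is a minimal illustration, not the platform's actual gate: the function names are hypothetical, pre-release tags are ignored, and the rule that a call may not be newer than the contract is an assumption added for the example.

```python
from typing import Tuple


def parse_semver(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def accepts_call(contract_version: str, call_version: str, floor: str) -> bool:
    """Decide whether a call built against call_version may run.

    Hypothetical gate: refuse calls below the floor, refuse a different
    major version, and (assumption) refuse calls newer than the contract.
    """
    contract = parse_semver(contract_version)
    call = parse_semver(call_version)
    if call < parse_semver(floor):
        return False  # older than the documented floor
    if call[0] != contract[0]:
        return False  # a major mismatch is a breaking change
    return call <= contract
```

With the contract at 2.1.0 and a floor of 2.0.0, a call built against 2.0.3 is accepted, while calls built against 1.9.0 or 3.0.0 are refused. A dual-run window would need a second contract version served in parallel, which this sketch does not model.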
Field 2: Class
Class is the most consequential field. Every governance decision downstream (approval gates, retry behaviour, audit retention, who can deploy) is keyed off it. Chapter 03 builds the classes in detail; here we only fix the slot in the contract.
Minimum slot:
    class: write-non-idempotent
    class_rationale: |
      Refunds change real money state in the payments system and emit
      customer-visible notifications. Cannot be made idempotent at this
      layer because the upstream system permits multiple partial refunds
      per payment within a 30-day window.
class is one of a small enum:
- read: fetches data, has no side effects on the production system or the user
- write-idempotent: changes state, but the same key produces the same result
- write-non-idempotent: changes state, and repeating the call duplicates the change
- irreversible: a write that cannot be undone via the same tool or any other
- human-gated: requires an out-of-band human approval before execution
class_rationale is a sentence or two explaining why the class is what it is. This is not documentation theatre. The rationale forces the contract author to defend the choice and lets a reviewer challenge it. Without rationale, classes drift toward "read" because read is the easiest to wire.
Anti-pattern. A tool that reads-then-writes inside one call (a "get-or-create"). It is two classes packed into one boundary. The right move is to split the contract: one read-class call and one write-class call. If the upstream API forces them together, the contract still records it as the stricter of the two.
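The "stricter of the two" rule for a combined read-then-write call can be made mechanical once the enum has an order. The sketch below assumes a strictness ordering that follows the order in which the classes are listed above; that ordering, and in particular placing human-gated at the top, is an assumption for illustration rather than something this chapter defines.

```python
from enum import IntEnum


class ToolClass(IntEnum):
    # Assumed strictness order: a higher value means stricter governance.
    READ = 0
    WRITE_IDEMPOTENT = 1
    WRITE_NON_IDEMPOTENT = 2
    IRREVERSIBLE = 3
    HUMAN_GATED = 4


def parse_class(value: str) -> ToolClass:
    """Map the contract's string form (e.g. 'write-idempotent') to the enum."""
    return ToolClass[value.upper().replace("-", "_")]


def stricter(*classes: ToolClass) -> ToolClass:
    """Return the strictest class among the operations a call performs."""
    return max(classes)


# A get-or-create call reads and may write, so it is recorded as the write.
get_or_create = stricter(parse_class("read"), parse_class("write-non-idempotent"))
```

An unknown class string raises KeyError in parse_class, which is the desired behaviour for an enum: a contract cannot silently carry a class the platform does not recognise.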
Field 3: Schema
Schema is the typed parameter and return surface. JSON Schema (or its equivalent in your stack) is the lingua franca because both the model platform and validators speak it. Module 01 chapter 03 covered schema design from the model's side — parameter naming, enums over free-text, optional vs required, examples. This chapter fixes the contract slot.
Minimum slot:
    schema:
      parameters:
        type: object
        additionalProperties: false   # critical; see anti-pattern below
        required: [payment_id, amount_minor, currency, reason_code]
        properties:
          payment_id:
            type: string
            pattern: "^pay_[A-Za-z0-9]{16}$"
            description: |
              ID of the payment to refund. Must be a payment that has been
              captured and is within 30 days of capture. Find this on the
              customer's order detail page.
          amount_minor:
            type: integer
            minimum: 1
            description: |
              Refund amount in minor currency units (paise for INR, cents for USD).
              Must not exceed the original payment amount minus any prior refunds.
          currency:
            type: string
            enum: [INR, USD, EUR, GBP]
          reason_code:
            type: string
            enum: [customer_request, fraud, duplicate, item_not_received, other]
          idempotency_key:
            type: string
            pattern: "^[A-Za-z0-9_-]{16,64}$"
            description: |
              Caller-generated unique key. Re-sending the same key within
              24 hours returns the original result without issuing a new refund.
      returns:
        type: object
        required: [refund_id, status, amount_minor, currency, created_at]
        properties:
          refund_id: { type: string }
          status: { type: string, enum: [pending, succeeded, failed] }
          amount_minor: { type: integer }
          currency: { type: string }
          created_at: { type: string, format: date-time }
Three rules carry most of the value.
additionalProperties: false on input. Without this, the model can pass extra fields and the contract has no way to refuse them. This is how silent drift starts: someone adds metadata: {custom_tag: "..."} because the upstream system accepts it, no one updates the contract, and the next reviewer cannot tell whether the extra field is supported or smuggled.
Descriptions are the model's documentation. The model reads the schema on every call, because the tool definitions are included in its context. Descriptions that say what to put in the field, where the value comes from, and what constraints apply are how the model picks correct arguments.
Return shape is part of the contract. If you only schema-check the input, the model can be given any return shape the underlying system feels like producing. The model then breaks downstream because it expected status and got state. Schema-check the return.
Anti-pattern. Schemas that use type: string, description: "..." for everything. The model will fill them with anything plausible. Use enums, patterns, integer types with min/max, and date-time formats wherever possible. Every untyped field is a bet that the model will guess right.
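To make the three rules concrete, here is a hand-rolled checker for a small subset of JSON Schema (required, additionalProperties: false, type, enum, minimum, pattern). It is a sketch for illustration only; a real contract layer would more likely use a full JSON Schema validator. The names validate and REFUND_PARAMS are hypothetical, and descriptions are omitted from the dictionary for brevity.

```python
import re
from typing import Any, Dict, List


def validate(schema: Dict[str, Any], payload: Dict[str, Any]) -> List[str]:
    """Return a list of violations; an empty list means the payload passes."""
    problems: List[str] = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in payload:
            problems.append(f"missing required field: {name}")
    for name, value in payload.items():
        rule = props.get(name)
        if rule is None:
            # This branch is what additionalProperties: false buys you.
            if schema.get("additionalProperties") is False:
                problems.append(f"undeclared field: {name}")
            continue
        expected = rule.get("type")
        if expected == "integer" and (isinstance(value, bool) or not isinstance(value, int)):
            problems.append(f"{name}: expected integer")
        elif expected == "string" and not isinstance(value, str):
            problems.append(f"{name}: expected string")
        elif "enum" in rule and value not in rule["enum"]:
            problems.append(f"{name}: not one of {rule['enum']}")
        elif "minimum" in rule and isinstance(value, (int, float)) and value < rule["minimum"]:
            problems.append(f"{name}: below minimum {rule['minimum']}")
        elif "pattern" in rule and isinstance(value, str) and not re.search(rule["pattern"], value):
            problems.append(f"{name}: does not match {rule['pattern']}")
    return problems


REFUND_PARAMS = {
    "type": "object",
    "additionalProperties": False,
    "required": ["payment_id", "amount_minor", "currency", "reason_code"],
    "properties": {
        "payment_id": {"type": "string", "pattern": r"^pay_[A-Za-z0-9]{16}$"},
        "amount_minor": {"type": "integer", "minimum": 1},
        "currency": {"type": "string", "enum": ["INR", "USD", "EUR", "GBP"]},
        "reason_code": {"type": "string", "enum": [
            "customer_request", "fraud", "duplicate", "item_not_received", "other"]},
        "idempotency_key": {"type": "string", "pattern": r"^[A-Za-z0-9_-]{16,64}$"},
    },
}
```

Passing a payload with an extra metadata field returns an "undeclared field" violation instead of forwarding it upstream. The same function can be pointed at the returns schema, so that the return shape is checked as well as the input.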
Field 4: Side effects
Side effects are what changes in the world when the call succeeds. This field exists because schema alone cannot tell a reviewer or a model what the call really does.
Minimum slot:
    side_effects:
      - kind: state_change
        system: payments-svc
        object: payment
        mutation: |
          Marks the payment as refunded (partial or full) and decreases the
          refundable balance by amount_minor.
      - kind: state_change
        system: payments-svc
        object: refund
        mutation: |
          Creates a new refund record linked to the payment.
      - kind: notification
        channel: customer_email
        template: refund_initiated
        triggered_when: status == 'succeeded'
      - kind: ledger_entry
        system: finance-ledger
        note: |
          Async: within 5 minutes, finance-ledger consumes the
          payments-svc event and writes a corresponding refund ledger entry.
    reversibility: |
      Refunds cannot be reversed via this tool. To reverse, a manual
      adjustment via finance-ops is required.
    cost_signal: |
      Direct cost: refund fees per provider terms (typically zero for cards,
      fixed fee for wallets). Indirect cost: customer support time if amount
      is incorrect.
Five kinds of side effect cover almost everything:
- state_change: a row, document, or object is created, updated, or deleted in a named system
- notification: a human-visible message goes out (email, SMS, push, in-app)
- ledger_entry: a record is written to a financial or audit log that another system depends on
- fan_out: the call triggers downstream events that other services consume
- external_call: the call talks to an external vendor (Stripe, Twilio, Salesforce)
The reversibility field is what flips the class to irreversible if the answer is "no". Reviewers should explicitly require this field, because forgetting to think about reversibility is how irreversible tools get classed as ordinary writes.
The cost_signal field connects to module 01 chapter 08 (cost and latency). Each call has a price; the contract is where that price is recorded.
Anti-pattern. Listing only the obvious state change and missing the fan-out. A tool that "creates a lead" can also enqueue a marketing-automation cascade, schedule a sales-rep follow-up, and increment a quota dashboard. If the contract does not list these, the on-call who sees a sudden spike in marketing emails will not know which tool caused it.
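Once side effects are declared, the on-call question "which tool touches this system or channel?" becomes a lookup. The sketch below builds a reverse index from declared side effects to tool names. It assumes contracts are already loaded as dictionaries shaped like the YAML above; the function name, the create_lead contract, and its crm and marketing_email targets are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def index_side_effects(contracts: List[dict]) -> Dict[str, Set[str]]:
    """Map each declared system or channel to the tools that affect it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for contract in contracts:
        tool = contract["identity"]["name"]
        for effect in contract.get("side_effects", []):
            # An effect names either a system (state, ledger) or a channel.
            target = effect.get("system") or effect.get("channel")
            if target:
                index[target].add(tool)
    return dict(index)


contracts = [
    {"identity": {"name": "issue_refund"},
     "side_effects": [
         {"kind": "state_change", "system": "payments-svc"},
         {"kind": "notification", "channel": "customer_email"},
         {"kind": "ledger_entry", "system": "finance-ledger"}]},
    {"identity": {"name": "create_lead"},  # hypothetical second tool
     "side_effects": [
         {"kind": "state_change", "system": "crm"},
         {"kind": "fan_out", "channel": "marketing_email"}]},
]
effects_by_target = index_side_effects(contracts)
```

The index is only as good as the declarations: a fan-out that is missing from the contract is also missing from the lookup, which is exactly the failure the anti-pattern describes.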
Field 5: Error contract
The error contract is the shape the response takes when the call fails. Chapter 05 builds it in detail; here we record the slot.
Minimum slot:
    errors:
      - code: PAYMENT_NOT_FOUND
        retriable: false
        human_hint: |
          The payment ID does not exist. Ask the customer to provide their
          order number; you can fetch the payment ID from the order.
        model_action: |
          Do not retry. Use lookup_payment_by_order tool to find the correct ID.
      - code: AMOUNT_EXCEEDS_REFUNDABLE
        retriable: false
        human_hint: |
          The requested refund amount exceeds the refundable balance on this
          payment (accounting for prior refunds).
        model_action: |
          Use get_payment tool to read the current refundable_balance,
          then re-call this tool with amount_minor <= refundable_balance.
      - code: UPSTREAM_TIMEOUT
        retriable: true
        retry_policy:
          strategy: exponential_backoff
          base_ms: 500
          max_attempts: 3
          idempotency_required: true
        model_action: |
          The contract layer will retry automatically. If retries fail,
          surface to the user that the refund is pending verification.
      - code: PROVIDER_REJECTED
        retriable: false
        human_hint: |
          The card network or wallet rejected the refund (e.g., closed account).
        model_action: |
          Surface to the user. Offer alternative refund paths if available
          (bank transfer via initiate_bank_refund).
Each error has four parts:
- code: a stable identifier the model can branch on
- retriable: explicit yes/no; not "maybe"
- human_hint: what a human reading the error would think; helps the model write better user-facing messages
- model_action: what the model should do next; this is the most underused field in real systems
A retriable error must carry a retry_policy with explicit strategy and bounds. A non-retriable error must carry model_action. Skipping either is how a model ends up retrying a PAYMENT_NOT_FOUND thirty times.
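The two rules above are easy to enforce at review time. The following sketch lints a list of error entries shaped like the YAML slot; the function name is hypothetical, and "explicit bounds" is interpreted here as the presence of max_attempts, which is an assumption.

```python
from typing import List


def lint_errors(errors: List[dict]) -> List[str]:
    """Check each error entry against the retry_policy / model_action rules."""
    problems: List[str] = []
    for entry in errors:
        code = entry.get("code", "<missing code>")
        retriable = entry.get("retriable")
        if not isinstance(retriable, bool):
            problems.append(f"{code}: retriable must be an explicit true/false")
            continue
        if retriable:
            policy = entry.get("retry_policy") or {}
            if "strategy" not in policy or "max_attempts" not in policy:
                problems.append(f"{code}: retriable error needs retry_policy "
                                "with strategy and max_attempts")
        elif not entry.get("model_action"):
            problems.append(f"{code}: non-retriable error needs model_action")
    return problems
```

Run against the refund contract's errors block, this returns an empty list. Removing retry_policy from UPSTREAM_TIMEOUT, or model_action from PAYMENT_NOT_FOUND, produces one finding each.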
Anti-pattern. Errors that pass through the underlying system's exception class. salesforce.exceptions.SalesforceMalformedRequest: REQUIRED_FIELD_MISSING: lead_source is not an error contract. It is debug output. The model has no idea whether to retry or which corrective action applies.
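One way to avoid the pass-through is a translation table at the contract boundary: upstream exceptions go in, contract-shaped errors come out, and the raw message never reaches the model. The sketch below is illustrative only. The two exception classes stand in for whatever the upstream client raises, and the INTERNAL_ERROR fallback code is hypothetical; neither comes from the refund contract above.

```python
from typing import Dict, Type


class UpstreamNotFound(Exception):
    """Hypothetical stand-in for an upstream 'no such payment' exception."""


class UpstreamTimeout(Exception):
    """Hypothetical stand-in for an upstream timeout exception."""


CONTRACT_ERRORS: Dict[Type[Exception], dict] = {
    UpstreamNotFound: {
        "code": "PAYMENT_NOT_FOUND",
        "retriable": False,
        "model_action": "Do not retry. Use lookup_payment_by_order tool "
                        "to find the correct ID.",
    },
    UpstreamTimeout: {
        "code": "UPSTREAM_TIMEOUT",
        "retriable": True,
        "model_action": "The contract layer will retry automatically.",
    },
}

# Assumed catch-all so that unmapped exceptions still yield a contract shape.
FALLBACK = {
    "code": "INTERNAL_ERROR",
    "retriable": False,
    "model_action": "Do not retry. Tell the user the request could not be completed.",
}


def to_contract_error(exc: Exception) -> dict:
    """Translate an exception into a contract error; never leak str(exc)."""
    for exc_type, shape in CONTRACT_ERRORS.items():
        if isinstance(exc, exc_type):
            return dict(shape)
    return dict(FALLBACK)
```

The original exception should still be logged for engineers; the point is only that the model receives the stable code and the next step, not the debug output.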
Field 6: Operational metadata
This field holds the remaining operational slots, which the rest of this module designs in detail.
Minimum slot:
    operational:
      idempotency:
        required: true
        key_field: idempotency_key
        dedup_window: 24h            # → chapter 04
      scope:
        required_scope: "payments:refund:write"
        tenant_binding: true         # → chapter 06
      rate_limits:
        per_agent_per_minute: 30
        per_tenant_per_hour: 1000
      observability:
        audit_retention: 365d
        pii_fields:                  # → chapter 11
          - amount_minor             # business-sensitive, not strictly PII
        redact_in_logs: []
        trace_propagation: w3c
      sla:
        p95_latency_ms: 2000         # → chapter 09 model_vendor_strategy / cost budgets
        availability: "99.5%"
This field exists so that operational properties — the ones that decide what happens when the system is under load or on fire — are written down where the rest of the contract lives, not scattered across runbooks and ops dashboards.
Anti-pattern. Treating operational metadata as "the SRE team will figure it out." The team who knows what rate limit makes sense is the team that owns the underlying system. The contract is the joint artefact where those numbers live.
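As one example of how an operational slot turns into behaviour, the idempotency entry (key_field plus a 24h dedup_window) could be enforced roughly as follows. This is a single-process, in-memory sketch under stated assumptions: the class name is hypothetical, it is not safe for concurrent callers, and a production contract layer would need a shared store. Chapter 04 covers the real design.

```python
import time
from typing import Any, Callable, Dict, Tuple


class DedupStore:
    """Remember results per idempotency key for a fixed window."""

    def __init__(self, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._window = window_seconds
        self._clock = clock  # injectable so the window can be tested
        self._results: Dict[str, Tuple[float, Any]] = {}

    def run_once(self, key: str, action: Callable[[], Any]) -> Any:
        """Run action unless the same key was seen inside the window."""
        now = self._clock()
        cached = self._results.get(key)
        if cached is not None and now - cached[0] < self._window:
            return cached[1]  # replay the original result, no new write
        result = action()
        self._results[key] = (now, result)
        return result
```

A retry that re-sends the same key inside the window gets the original refund result back; the upstream write happens once. After the window expires the key is treated as new, which is why the window length belongs in the contract rather than in someone's head.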
What a full contract looks like
Putting all six fields together produces a single YAML document (or its equivalent in your stack's format) that is the whole contract for one tool. The full version of the refund tool, condensed:
    identity:
      name: issue_refund
      version: 2.1.0
      owner: payments-platform
      purpose: |
        Refund a previously captured payment. Use this when the customer
        has been charged and now must be made whole.
    class: write-non-idempotent
    class_rationale: |
      Refunds change real money state and emit customer-visible notifications.
    schema:
      parameters: { ... as above ... }
      returns: { ... as above ... }
    side_effects:
      - { ... as above ... }
    reversibility: |
      Refunds cannot be reversed via this tool.
    errors:
      - { ... as above ... }
    operational:
      idempotency: { ... }
      scope: { ... }
      rate_limits: { ... }
      observability: { ... }
      sla: { ... }
A contract that fits this shape is reviewable, versionable, ownable, and operable. A contract that is missing two of these six fields cannot be operated; you will find out which two when the postmortem asks the question that field would have answered.
How to recognise a usable contract in the wild
Walk into a code review for a new tool. Ask these six questions in order:
- Identity — what is this called, what version is it, who owns it, when is it the right tool to pick?
- Class — is this a read, a write, or an irreversible action, and why?
- Schema — what are the parameters, with which types and constraints, and what does it return?
- Side effects — what changes in the world when this succeeds, including the fan-out you do not see?
- Errors — what are the failure shapes, which are retriable, and what should the model do next?
- Operational — what is the idempotency key, what scope does it require, what rate limit, what audit fields?
If any question gets a verbal answer ("we know roughly, it's in someone's head"), the contract does not exist yet. The artefact is the answer.
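The six-question review has a mechanical first pass: check that every slot exists before a human reads what is in it. A minimal sketch, assuming the contract has been loaded into a dictionary shaped like the full YAML example; the function and constant names are hypothetical.

```python
from typing import List

REQUIRED_SLOTS = ("identity", "class", "schema", "side_effects", "errors", "operational")
IDENTITY_KEYS = ("name", "version", "owner", "purpose")


def missing_slots(contract: dict) -> List[str]:
    """List the top-level slots and identity keys that are absent or empty."""
    missing = [slot for slot in REQUIRED_SLOTS if not contract.get(slot)]
    identity = contract.get("identity")
    if identity:
        # Only drill into identity when the slot itself is present.
        missing += [f"identity.{key}" for key in IDENTITY_KEYS if not identity.get(key)]
    return missing


draft = {
    "identity": {"name": "issue_refund", "version": "2.1.0"},
    "class": "write-non-idempotent",
    "schema": {"parameters": {"type": "object"}},
}
gaps = missing_slots(draft)
```

For the draft above, the gaps are the three missing slots plus the two missing identity keys. Passing this check only proves the slots are filled in; whether the class is right or the side effects are complete still needs a reviewer.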
Interview Q&A
Q1. A teammate writes a new tool as a 30-line Python function and asks for review. What is your first question?
"Where is the contract?" The Python function is the call site, not the contract. Then walk the six fields: identity (name/version/owner/purpose), class, schema (typed, additionalProperties: false), side effects (especially fan-out and reversibility), error shapes (retriable/non-retriable, model_action), operational metadata. Any missing slot is a question, not necessarily a blocker. Wrong-answer notes: jumping straight to "does the schema validate?" misses identity and class, which decide governance.
Q2. What is the role of additionalProperties: false in a tool schema?
It refuses inputs the contract did not declare. Without it, the model can pass extra fields the underlying system happens to accept, and over time those extra fields become load-bearing without ever appearing in the contract. The contract stops being authoritative and you lose the ability to detect drift. Wrong-answer notes: "to validate input" is too vague. The specific value is preventing silent extension.
Q3. Why is owner a team handle, not a person?
Because the contract outlives any individual. The team owning the underlying system owns the meaning of the call — what it does, when it should be used, what its errors mean — and that team rotates members, runs on-call, and manages deprecation. A person handle is unreachable when that person changes teams; a team handle is durable. Wrong-answer notes: "to spread responsibility" is missing the durability point.
Q4. The product team says "we cannot list reversibility because some refunds are partially reversible via a bank transfer through a different system." What do you do?
Record exactly that in the reversibility field. The field exists to make the answer explicit, not to force "yes" or "no". "Reversible via initiate_bank_refund tool within 7 days for INR transactions only" is a valid value. If the model needs to perform the reversal, the contract should also link to the reversing tool. Wrong-answer notes: "mark it irreversible" loses real capability; "leave the field empty" loses the answer entirely.
What to do differently after reading this
- Use a single template for all new tool contracts. Six top-level slots. Refuse to merge a tool that is missing any of them.
- Maintain a contract registry — even if it is a directory of YAML files — separate from agent code. The registry is the source of truth; the runtime loads from it.
- For each existing tool, run the six-question review. Write down which slots are missing. Prioritise filling them by class: irreversible and write-non-idempotent first, then writes, then reads.
- When a tool grows a new requirement (a new field, a new error case), the change goes to the contract first, the runtime second.
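The prioritisation step in the list above (irreversible and write-non-idempotent first, then writes, then reads) can be expressed as a sort key over the review backlog. This is a sketch only: the names are hypothetical, and ranking human-gated and unknown classes in the first tier is an assumption, since the list does not place them.

```python
from typing import Dict, List

# Lower number means review sooner.
REVIEW_PRIORITY: Dict[str, int] = {
    "irreversible": 0,
    "write-non-idempotent": 0,
    "human-gated": 0,        # assumption: treated as high blast radius
    "write-idempotent": 1,
    "read": 2,
}


def review_order(tools: List[dict]) -> List[dict]:
    """Sort tools for contract review; unknown classes go to the front."""
    return sorted(tools, key=lambda tool: REVIEW_PRIORITY.get(tool["class"], 0))


backlog = [
    {"name": "get_payment", "class": "read"},
    {"name": "update_address", "class": "write-idempotent"},  # hypothetical tool
    {"name": "issue_refund", "class": "write-non-idempotent"},
]
ordered_names = [tool["name"] for tool in review_order(backlog)]
```

Because Python's sort is stable, tools within the same tier keep their original order, so a second criterion such as number of missing slots can be applied by pre-sorting the backlog.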
Bridge. Of the six fields, one decides every downstream governance question: approval gates, retry behaviour, audit retention, deploy permission. That field is the class. The next chapter builds the authority classes from the blast radius they cover, and shows what changes in the rest of the contract depending on which class you pick. → 03-read-write-irreversible-classes.md