Skip to content

07. Boundary and tradeoff review — choosing on a market that won't hold still

~20 min read. The bank's decision is made, defended, and shipped. Then, three months later, a framework you ruled out gets acquired by your cloud, a SaaS vendor flips to outcome-based pricing that halves your projected bill, and a runtime ships the one gate-passing feature that would have flipped the whole evaluation. Nothing about the decision was wrong. The market just moved underneath it.

Built on every file before it. You can now choose by the boundary, score the capability rubric, price the 100× curve, keep the escape hatch open, and sieve the procurement gate. This file does something different: it steps back from how to decide to how to keep a decision honest in a market that reshuffles every quarter. The pressure here is ambiguity — the facts that the earlier files treated as stable are the ones most likely to change. The recurring idea is decide on what's slow-moving, hold loosely what's fast-moving, and schedule the re-evaluation before you need it.


What every earlier file quietly assumed

Each earlier file produced a number or a verdict and treated its inputs as fixed. The rubric scored capability as of today. The cost curve projected the bill at today's prices. The gate sieve eliminated a platform because, right now, it can't sign with your key. All of that is correct — and all of it has a half-life. A capability score decays the moment a runtime ships a feature. A cost projection decays the moment a SaaS vendor switches from per-seat to outcome-based pricing. A gate verdict decays the moment a vendor earns a residency region or a BAA it lacked last quarter.

So the honest question is not "which platform is best." We retired that one in file 00. The honest question is "which parts of my decision rest on facts that will still be true in six months, and which parts rest on facts that will have churned?" A senior evaluator separates the two on purpose. The slow-moving facts — your data's gravity, your team's size, your regulatory boundary, who owns the model call — anchor the decision. The fast-moving facts — exact prices, this quarter's model menu, which vendor demos best — are inputs you hold loosely and revisit on a schedule. Confuse the two and you either thrash (rebuilding every quarter on the latest benchmark) or ossify (running a 2024 decision in a 2026 market).

What this file solves

A team makes a defensible platform decision, ships it, and then watches the market invalidate three of its inputs within two quarters — a price change, an acquisition, a new gate-passing feature — and panics, either thrashing toward the new shiny thing or refusing to revisit a decision that has genuinely gone stale. This file gives you a way to tell durable inputs from volatile ones, both sides of the live debates (frameworks vs managed runtimes vs SaaS verticals), the vendor claims worth distrusting, and a concrete six-month re-evaluation trigger list. The first move is to tag every input in your decision as slow-moving or fast-moving at decision time, because the inputs you tag fast-moving are exactly the ones you'll set a review date against instead of betting the architecture on.

Why "we chose, we're done" is the expensive failure

The tempting posture after a hard decision is finality: the rubric is scored, procurement signed off, the bridge is burned, move on. It feels like discipline. It is the opposite. A platform decision is not a one-time act; it is a position you hold in a moving market, and a position you never re-examine is a position that silently drifts from optimal to wrong. The bank that picked AgentCore did not buy permanent correctness. It bought a current best fit under current facts — and at least three of those facts (the cost curve's assumed prices, the rubric's capability scores, the rival's gate-failures) are the kind that move.

The naive repair is the opposite extreme: re-evaluate constantly, chase every benchmark, re-score the rubric each time a vendor ships. That thrashes. You burn the team's attention re-deciding instead of building, and you never accumulate the operational depth on one platform that actually makes an agent good. So the real problem is not that the decision was wrong or right; it is that a static decision and a constant re-decision both ignore the structure of the change — some inputs churn fast, some barely move. So how do you revisit without thrashing? You decide on the slow-moving anchors, you tag the fast-moving inputs at decision time, and you bind the re-evaluation to specific triggers and a calendar date rather than to your anxiety.

The re-evaluation rule. Anchor the decision on slow-moving facts (data gravity, team size, regulatory boundary, model-call ownership) and hold fast-moving facts (prices, model menus, this quarter's feature gaps) loosely. Schedule a re-evaluation against named triggers, not against the news cycle — revisit when an anchor changes or a tagged trigger fires, not every time a vendor ships.

Why this rule exists. The primitive is that decision inputs have different half-lives, and a single verdict freezes them all at one instant. The constraint is human attention: you cannot re-decide continuously and still build, but you also cannot run a stale decision in a market growing ~40% a year. The resolution is to sort inputs by half-life — the long-half-life facts carry the decision, the short-half-life facts get a review date — so the decision is stable where stability is cheap and revisable where revision is necessary.


1) Slow-moving anchors vs fast-moving inputs — sorting your decision by half-life

Every input to a platform decision sits somewhere on a spectrum of how fast it changes. Sort them at decision time, because the sort decides what you bet the architecture on and what you merely schedule a review against.

SLOW-MOVING ANCHORS                          FAST-MOVING INPUTS
(bet the architecture on these)              (hold loosely, set a review date)
──────────────────────────────              ──────────────────────────────
where your data physically lives             exact per-token / per-action prices
your team's size & ops capacity              this quarter's model menu
your regulatory boundary                     which vendor demos best today
who owns the model call (control)            a specific feature gap (may close)
your traffic's order of magnitude            vendor funding / acquisition status
the escape-hatch architecture you built      benchmark leaderboard positions

The left column changes on the timescale of a re-org or a regulatory shift — quarters to years. The bank's data is on AWS; that will not flip next month. Its team is six people; that won't double silently. Its regulator requires a single residency boundary; that is a multi-year fact. These anchored the AgentCore decision and they are why the decision is stable.

The right column changes on the timescale of a product release — weeks to a quarter. Prices move (Salesforce shipped three pricing models for Agentforce in roughly eighteen months — per-conversation, then per-action Flex Credits, then per-user seats). Model menus grow monthly. A feature gap that scored a rival down at 2 on the rubric can close in a single GA announcement. Betting your architecture on any of these is betting on a snapshot of a movie.

Teacher voice. The discipline is not "predict the future." Nobody on the bank's team knows what Vertex's memory pricing will be in a year. The discipline is knowing which of your inputs you don't need to predict because they barely move, and which you must schedule a look at because they move fast. You anchor on the slow ones and you put a calendar reminder on the fast ones. That is the whole technique.


2) Picture first — the half-life sort and the review trigger

   decision time
   ┌─────────────────────────────────────────────┐
   │  sort every input by half-life              │
   └───────────────┬─────────────────┬───────────┘
                   │                 │
          slow-moving           fast-moving
          (anchors)             (volatile)
                   │                 │
                   ▼                 ▼
        ┌──────────────────┐  ┌──────────────────────┐
        │ bet architecture │  │ tag with a TRIGGER:   │
        │ on these — they  │  │  • price drop > 30%   │
        │ carry the verdict│  │  • anchor changed     │
        └──────────────────┘  │  • gate now passable  │
                   │          │  • 100× curve crossed │
                   │          └───────────┬───────────┘
                   │                      │
                   └──────────┬───────────┘
                  ┌───────────────────────────┐
                  │ re-evaluate WHEN a trigger │
                  │ fires OR every ~6 months,  │
                  │ NOT every vendor release   │
                  └───────────────────────────┘

The decisive fork is the half-life sort at the top. Inputs that sort slow go into the architecture and don't get a review date — revisiting them is expensive and pointless because they don't move. Inputs that sort fast never go into the architecture; they get a trigger and a calendar date. The whole point of the diagram is that re-evaluation is bound to triggers, not to the steady drip of vendor announcements. The bank reviews when its data moves clouds, when a price drops by a third, or every six months — not when Vertex ships a new model.


3) The bank revisits — six months after picking AgentCore

Walk the running example forward. The bank shipped both agents on AgentCore with an owned LangGraph orchestration. Six months pass. The platform team runs the scheduled re-evaluation. Two paths present themselves — and the contrast is the lesson.

Attempt A — the thrash: re-score everything against the latest releases

The eager engineer re-runs the full rubric from scratch against every platform's newest capabilities. In six months Vertex shipped cheaper memory, a SaaS vertical (Sierra-style) flipped to pure outcome-based pricing that looks cheaper at the bank's FAQ volume, and OpenAI's AgentKit added enterprise workspace agents. The re-score churns: three platforms now look competitive, the engineer drafts a migration proposal, and the team spends a sprint debating a rebuild.

This is the trap. The re-score treated every input as live, including the anchors. But the anchors didn't move — the bank's data is still on AWS, the team is still six people, the regulator still wants a single residency boundary, the bank still owns the model call. Re-scoring the slow-moving facts wastes the review and manufactures false optionality. Not a "the market changed so we must move" situation; a "we re-litigated facts that didn't change" situation.

Attempt B — the disciplined review: check the triggers, hold the anchors

The lead runs the review against the tagged triggers only.

SCHEDULED RE-EVALUATION — month 6, AgentCore incumbent

ANCHORS (re-confirm, don't re-score):
  data on AWS?              still yes      → AgentCore residency gate still single-boundary
  team size?                still 6        → still can't own LangGraph ops solo (no self-host)
  regulator: single bndry?  still yes      → cross-cloud still fails the residency gate
  bank owns model call?     still yes      → audit/liability anchor intact
  → ANCHORS UNCHANGED. The structural decision still holds.

TRIGGERS (the fast-moving inputs we tagged at decision time):
  T1 price drop > 30%?      Sierra-style outcome pricing looks ~35% cheaper at FAQ volume
                            → INVESTIGATE: but it's a SaaS vertical → fails model-call gate
                            → outcome pricing is real, but the gate still eliminates it. NO MOVE.
  T2 gate now passable?     did any cross-cloud runtime earn single-boundary AWS residency?
                            → NO. residency anchor holds. NO MOVE.
  T3 100× curve crossed?    FAQ volume now 3× the pilot, on track for 100× in 9 months
                            → ACT: renegotiate committed-use / provisioned throughput NOW,
                              before the curve bends (file 05). This is a real action.
  T4 escape hatch intact?   are prompts/tools/data still portable, logs still exported?
                            → audit drift: 1 new tool wired AWS-specific. FIX: re-abstract it.

VERDICT: stay on AgentCore. Two real actions (renegotiate pricing tier, re-abstract one
         tool), zero migration. The review took an afternoon, not a sprint.

The disciplined review confirms the anchors in minutes (they didn't move), checks the four tagged triggers, and produces two cheap actions and zero migration. The Sierra-style outcome pricing — genuinely attractive on cost — is killed by the same model-call gate from file 06, which is an anchor, not a price. The one thing that genuinely changed (FAQ volume climbing toward the 100× curve) produces a pricing renegotiation, not a platform switch. The escape hatch caught a small drift — one tool wired AWS-specific — which gets re-abstracted cheaply because it was caught at six months instead of at exit.

Mini-FAQ. "If the outcome pricing is 35% cheaper, isn't refusing it just stubbornness?" No — it's the gate doing its job. Outcome-based pricing on a SaaS vertical means the vendor owns the model call, which fails the bank's audit and model-call gates from file 06. A 35% cost saving doesn't buy back a gate failure any more than a high rubric score did. The price is a fast-moving input; the gate is a slow-moving anchor. The anchor wins. For an unregulated team the same 35% might absolutely justify the move — which is the point: the sort depends on which facts are anchors for you.


4) Why a trigger list and not continuous monitoring — under this workload

The plausible alternative to a six-month review with triggers is continuous monitoring: dashboard every vendor's prices, features, and benchmarks, alert on any change, re-decide in real time. Why the coarse periodic review instead of the fine-grained always-on one?

Because the cost of re-deciding dominates the benefit of deciding sooner, for a platform decision specifically. Switching agentic platforms is a quarter-scale project (the exit-cost math from file 04). A signal that arrives a month earlier buys you nothing when acting on it takes a quarter and you only act when an anchor moves. Continuous monitoring optimizes the latency of a decision whose latency doesn't matter, while taxing the scarce resource that does matter — the team's attention. The six-person bank team that watches a vendor-features dashboard daily is six people not building agents.

The trigger list is the right structure because platform inputs are lumpy and slow-acting: the changes worth reacting to (a residency region earned, a 30%+ price move, your own volume crossing an order of magnitude) are rare and large, not frequent and small. You want a low-frequency, high-specificity check — does any named, decision-flipping trigger fire? — not a high-frequency, low-specificity stream of every release note. Continuous monitoring would be correct if switching were cheap and inputs changed smoothly; for agentic platforms switching is expensive and inputs change in rare jumps, so periodic-with-triggers fits the workload.


5) The property that decides how often to revisit: switching cost relative to input volatility

The single dimension that sets your re-evaluation cadence is the ratio of switching cost to input volatility. It tells you whether to anchor hard and review rarely, or stay loose and review often.

              switching cost LOW          switching cost HIGH
            ────────────────────────    ────────────────────────
inputs      revisit freely; the         ANCHOR HARD, review on
volatile    cost of moving is low so    triggers only — moving is
(prices,    chase the better deal       a quarter's work, so don't
features)   (e.g. model API choice)     thrash on volatile inputs
                                         ◄── the bank's agentic platform

inputs      barely revisit; nothing     decide once, carefully, and
stable      changes and moving is        hold for years — the rare
(data       cheap, so it rarely          change is the only trigger
gravity)    matters either way          (e.g. core data platform)

The bank sits in the top-right: switching cost is high (a quarter to migrate) and the volatile inputs (prices, features) churn fast. That combination is exactly the one that demands hard anchoring and trigger-bound review — because reacting to every volatile change would mean a perpetual quarter-long migration, while ignoring a moved anchor would mean running a stale decision. The model-vendor choice one layer down (module 12) often sits top-left instead: swapping a model behind an abstracted call is cheap, so you can chase the better-priced model freely. Same market, different switching cost, opposite cadence. This is why the escape hatch from file 04 matters so much here — it lowers your switching cost, which moves you leftward and lets you revisit more freely without thrashing.


6) A claim walked through deeply — "outcome-based pricing means you only pay for value"

Take one contested vendor claim apart, because its shape recurs across every pitch on this market. Sierra and similar SaaS verticals market pure outcome-based pricing: you pay only when the agent resolves an issue without a human. The claim sounds unbeatable — no resolution, no charge.

THE CLAIM:  "outcome-based pricing — pay only for successful resolutions"

WHERE IT'S TRUE:
  at low-to-mid volume with messy traffic, you don't pay for the agent's
  failures, and the vendor is incentivized to actually resolve. Real benefit.

WHERE IT BENDS (the parts the pitch omits):
  1. WHO DEFINES "RESOLUTION"? the vendor does. a deflection the customer
     re-contacts about an hour later may still count as resolved → you pay
     for resolutions that didn't resolve.
  2. THE 100× CURVE (file 05): at high resolution-rate and high volume,
     per-outcome can cost MORE than owned-orchestration token cost. the
     bank's 70%-FAQ traffic resolves cheaply on a small model in owned
     orchestration; paying a per-resolution fee on that volume is a markup.
  3. THE MODEL-CALL GATE (file 06): outcome pricing comes bundled with the
     vendor owning the model call → fails the bank's audit + liability gates.
  4. LOCK-IN (file 04): outcome pricing deepens data gravity — your
     resolution history, the thing that proves the pricing fair, lives in
     the vendor's cloud.

VERDICT: the claim is true for the workload it's pitched against (mid-volume,
  unregulated, no owned model call). It bends hard against the bank's
  workload. The pricing model is not "better" — it's better FOR A DIFFERENT
  USE CASE. read the claim against YOUR anchors, not the vendor's slide.

The failure shape is universal: a vendor claim is true for the workload it was designed against and silently false for yours. The pitch never states the workload it assumes. Not a dishonest claim; a claim with an unstated workload. The discipline is to re-attach the workload — at what volume, under what regulation, with whose model call — and check it against your anchors. Outcome pricing is genuinely good for a mid-volume unregulated support agent. For a regulated bank that owns its model call and runs high FAQ volume, it's a markup wrapped in a gate failure.


7) Cost movement — what the review cadence buys and what it costs

Posture Effort What it catches What it misses / costs When it bites
Decide once, never revisit zero ongoing nothing every moved anchor and fired trigger when an anchor moves and you don't notice for a year
Continuous monitoring high ongoing every change instantly drowns the team in low-value signals; taxes attention when switching cost is high and you act on noise
6-month review + triggers a day every 6 months + alerts on named triggers moved anchors, decision-flipping changes sub-threshold drift between reviews (caught next cycle) almost never; this is the fit for high-switching-cost platforms
Trigger-only, no calendar minimal named triggers the slow drift no single trigger names (escape-hatch erosion) when drift accumulates below every trigger

The movement: the periodic-plus-trigger posture costs a day twice a year plus a few alerts, and it catches the two things that matter — a moved anchor and a decision-flipping trigger — while accepting that small drift waits for the next cycle. The subsystem that pays is the lead's attention, twice a year, in bounded chunks. The reason a calendar date sits alongside the triggers is the last row's failure: escape-hatch erosion (a tool wired platform-specific, a log that stopped exporting) is slow drift that no single trigger names, so only a periodic look catches it before it compounds into real lock-in.


8) Operational signals — is the re-evaluation discipline healthy?

Healthy. Every decision input was tagged slow or fast at decision time, the anchors are re-confirmed (not re-scored) on a calendar cadence, triggers are named and alerting, and a review produces a small number of concrete actions (renegotiate a tier, re-abstract a tool) far more often than it produces a migration. Reviews are short because anchors rarely move.

First metric that degrades. The fraction of a review spent re-scoring anchors. When the team re-runs the full rubric against unchanged data-gravity and team-size facts, the review is thrashing — re-litigating the slow-moving inputs that should have been confirmed in minutes. Watch for review duration creeping from an afternoon toward a sprint.

Misleading metric people watch. The vendor benchmark leaderboard and the "who shipped what this quarter" feed. These move constantly and feel like signal, but almost none of it touches an anchor; reacting to leaderboard churn is the continuous-monitoring trap. A platform climbing a benchmark is not a trigger unless it crosses one of your named, decision-flipping thresholds.

The signal an experienced lead checks first. "Did any anchor move since the last review — data location, team size, regulatory boundary, model-call ownership?" If no anchor moved, the structural decision holds and the review collapses to checking the fast-moving triggers. A lead reads the anchors first and the vendor news last, because anchors flip decisions and news rarely does.


9) Boundary of applicability — when scheduled re-evaluation helps and when it doesn't

Strong fit. High-switching-cost platform decisions in a fast-moving market with high-volatility inputs — exactly the agentic-platform choice. When migrating costs a quarter and prices/features churn monthly, anchoring hard and reviewing on triggers is the only posture that avoids both thrash and ossification. The bank is the textbook case.

Where the discipline becomes pathological. Low-switching-cost decisions, where the trigger-and-anchor machinery is overkill. If swapping is a config change — choosing which model API to call behind an abstraction — you don't need a six-month review and a trigger list; you just switch when a better option appears. Imposing the bank's heavyweight review cadence on a cheap-to-reverse choice is process theater that slows a decision that wanted to be fast.

Scale / regime that invalidates the intuition. "Decide once and you're done" holds only in a static market with stable inputs — which the agentic-platform market emphatically is not (≈40% CAGR, three Agentforce pricing models in eighteen months, frameworks merging and being acquired). The intuition that breaks most often is treating a platform pick as permanent; in a market this fast, permanence is how a good 2024 decision becomes a bad 2026 reality. The opposite intuition — "always chase the latest" — breaks just as hard when switching costs a quarter. Both extremes ignore the half-life sort.


10) The wrong model to drop: "the right platform exists and we just need to find it"

The seductive wrong idea — the one this whole module has been chipping at — is that there is a correct platform, discoverable by enough evaluation, and once found the decision is over. This file delivers the final correction: there is no correct platform, and even the best current fit is a position in a moving market, not a permanent answer. The market grew ~40% this year, a major SaaS vendor shipped three pricing models in eighteen months, frameworks are being acquired and merged, and runtimes ship gate-flipping features every quarter. A decision frozen against that motion rots.

Replace "find the right platform" with "make the best current decision on slow-moving anchors, hold the fast-moving inputs loosely, and schedule the look that keeps the decision honest." The bank didn't find a permanently-correct platform; it found the best fit under its anchors and built an escape hatch and a review cadence so that when the market moves — and it will — the decision can move with it cheaply, or hold deliberately, but never drift unnoticed.


11) Other failure shapes in a moving market

  • Decision ossification — running a stale pick because "we already decided," long after an anchor moved (data migrated clouds, team grew, regulation changed).
  • Benchmark thrash — re-deciding every time a platform tops a leaderboard, burning the team's attention and never building operational depth on one platform.
  • Unstated-workload claims — taking a vendor claim ("only pay for outcomes," "200+ models") at face value without re-attaching the workload it silently assumes.
  • Escape-hatch erosion — small platform-specific choices accreting between reviews (a tool wired to the cloud's networking, a log that stopped exporting) until the hatch is quietly welded shut.
  • Anchor re-litigation — spending a review re-scoring the slow-moving facts that didn't change, instead of confirming them in minutes and checking the triggers.
  • Pilot-price extrapolation — projecting the bill on pilot-era pricing when the vendor's pricing model itself is the volatile input most likely to change (Agentforce's three models).
  • Acquisition surprise — a framework or SaaS vendor acquired by a cloud (or each other), changing its gravity, pricing, or roadmap overnight, with no trigger watching for it.
  • Feature-gap permanence — ruling a platform out forever on a gap that closes in one GA announcement, instead of tagging it as a fast-moving trigger to re-check.

12) Pattern transfer — where "decide on a moving target" recurs

  • Cost & scaling (file 05) — the 100× curve is itself a fast-moving input: prices change, so the projected bill has a half-life, and crossing an order of magnitude in volume is a named trigger that reopens the cost decision without reopening the platform decision.
  • Lock-in & the escape hatch (file 04) — the escape hatch is what lowers switching cost, which moves you leftward in section 5's grid and lets you revisit more freely; escape-hatch erosion is the slow drift that the calendar review (not any single trigger) exists to catch.
  • Procurement gates (file 06) — gate verdicts have a half-life too: a vendor can earn a residency region or a BAA it lacked, turning a gate-failure into a gate-pass; "gate now passable?" is a named re-evaluation trigger.
  • Model vendor strategy (module 12) — the same half-life sort applies one layer down, but with low switching cost (abstracted model calls), so the cadence flips to "revisit freely" — same market, different switching cost, opposite discipline.
  • Build-vs-buy (file 01) — the original boundary choice rests on anchors (control, gravity, team size); this file is that decision's maintenance contract — the boundary holds until an anchor moves.

13) The re-evaluation audit — five yes/no questions

  1. Was every decision input tagged slow-moving (anchor) or fast-moving (volatile) at decision time, and is the list written down?
  2. Are the re-evaluation triggers named and specific — a price move over a threshold, an anchor change, a gate now passable, a volume order-of-magnitude crossed — rather than "when something changes"?
  3. Does the scheduled review confirm anchors in minutes rather than re-scoring them, and spend its time on the triggers?
  4. Is there an escape hatch keeping switching cost low enough that revisiting is a real option, and is the review checking the hatch for erosion?
  5. When a vendor claim arrives (outcome pricing, model-menu size), do you re-attach the unstated workload and check it against your anchors before believing it?

If question 1 is "no," you can't tell thrash from diligence, because you never separated the inputs that should move the decision from the ones that shouldn't.


Where this appears in production

The market motion that ages a decision: - Salesforce Agentforce's three pricing models — per-conversation, then per-action Flex Credits, then per-user seats in roughly eighteen months; any cost projection pinned to one model had a half-life of months, the textbook fast-moving input. - Sierra's outcome-based pricing — pay only for resolutions; genuinely attractive for mid-volume unregulated support, but bundles vendor-owned model calls that fail a regulated bank's audit gate. - Microsoft's Agent Framework GA — AutoGen and Semantic Kernel merged into one framework with production SLAs; a framework consolidation that changes the landscape map a prior decision was scored against. - AWS Strands + AgentCore — a framework-plus-runtime path that tightens AWS gravity; for an AWS-native shop it lowers switching cost within the cloud and raises it across clouds. - LangGraph's scale adoption — tens of millions of monthly downloads with durable execution and checkpointing; the kind of momentum that turns a "risky young framework" trigger into a non-issue between reviews. - OpenAI AgentKit / workspace agents — enterprise agents plugging into Slack and Salesforce, with introductory free pricing converting to credit-based; an entrant whose pricing is explicitly a moving target. - Sierra's 2026 acquisitions — voice, enterprise-AI, and workflow startups acquired in months; an acquisition-surprise trigger that can change a vendor's gravity and roadmap overnight.

Anchors that held while the market churned: - A bank's data gravity on AWS — unchanged across reviews, keeping the residency gate single-boundary and the AgentCore decision structurally stable. - A six-person team's ops capacity — unchanged, keeping self-hosted LangGraph off the table regardless of how its capability scores climbed. - A regulator's single-residency-boundary rule — a multi-year anchor that no quarterly vendor feature could satisfy from another cloud. - Model-call ownership — the bank owning its inference call, the anchor that killed every outcome-priced SaaS pitch no matter how cheap.

Where ignoring the half-life sort bit teams: - A team that re-scored the full rubric quarterly — thrashed on volatile features, drafted migrations it never executed, and never built depth on one platform. - A team that never revisited — ran a 2024 SaaS pick into a 2026 market, missing an outcome-pricing model that would have halved its bill on an unregulated workload. - A team that ruled out a runtime on a feature gap — never re-checked, and the gap had closed two quarters earlier in a GA release. - A team blindsided by an acquisition — its chosen framework was acquired by a rival cloud, changing the gravity it had anchored on, with no trigger watching.


Pause and recall

  1. What distinguishes a slow-moving anchor from a fast-moving input? Give two examples of each from the bank.
  2. Why is "we chose, we're done" an expensive failure in this market, and why is "re-decide constantly" equally bad?
  3. What ratio sets your re-evaluation cadence, and where does the bank's agentic-platform choice sit in that grid?
  4. The bank's six-month review found Sierra-style outcome pricing ~35% cheaper. Why didn't it move?
  5. Why a periodic review with named triggers instead of continuous monitoring, for this specific workload?
  6. What is an "unstated-workload claim," and how do you defuse one?
  7. Why does the review need a calendar date and triggers, not triggers alone?
  8. Why is there no "right platform," and what replaces the search for one?

Interview Q&A

Q1. You picked an agentic platform six months ago. The market has moved. How do you decide whether to revisit? A. Sort the decision's inputs by half-life. Slow-moving anchors — data gravity, team size, regulatory boundary, who owns the model call — carry the decision and rarely move; I confirm them in minutes. Fast-moving inputs — prices, model menus, feature gaps — I tagged at decision time with named triggers. I re-evaluate when an anchor moves or a trigger fires, or on a six-month calendar to catch slow drift like escape-hatch erosion — not every time a vendor ships. A review usually produces small actions (renegotiate a tier, re-abstract a tool), not a migration. Common wrong answer to avoid: "Re-run the full rubric against the latest releases." That re-litigates the unchanged anchors and thrashes; confirm anchors, check triggers.

Q2. A SaaS vendor offers outcome-based pricing that's 35% cheaper than your owned orchestration. Why might you decline? A. Because price is a fast-moving input and the model-call gate is a slow-moving anchor. Outcome pricing on a SaaS vertical means the vendor owns the model call, which fails our audit and liability gates for a regulated workload — and a 35% saving doesn't buy back a gate failure any more than a high rubric score does. The claim is also true for a different workload: mid-volume, unregulated, no owned model call. For that team it's a great deal. For ours, it's a markup wrapped in a gate failure, so the anchor wins. Common wrong answer to avoid: "35% cheaper, switch." Cost is volatile and compensatory; the gate is binary and non-compensatory — re-attach the unstated workload first.

Q3. Why not continuously monitor every vendor's prices and features and re-decide in real time? A. Because switching agentic platforms costs a quarter, so a signal arriving a month earlier buys nothing — you only act when an anchor moves, and acting takes a quarter regardless. Continuous monitoring optimizes a latency that doesn't matter while taxing the resource that does: the team's attention. Platform inputs are lumpy and slow — the changes worth reacting to are rare and large, so a low-frequency, high-specificity trigger check fits better than a high-frequency stream of release notes. Common wrong answer to avoid: "More monitoring is always safer." When switching is expensive and inputs jump rarely, continuous monitoring is just attention tax on noise.

Q4. What sets how often you should re-evaluate a platform decision? A. The ratio of switching cost to input volatility. High switching cost with volatile inputs — the agentic-platform case — means anchor hard and review on triggers, because reacting to every volatile change would mean a perpetual quarter-long migration. Low switching cost — like swapping a model behind an abstraction — means revisit freely. Same market, different switching cost, opposite cadence. The escape hatch matters precisely because it lowers switching cost and lets you revisit more freely without thrashing. Common wrong answer to avoid: "Review on a fixed schedule regardless." Cadence should follow switching cost and volatility, not a calendar alone.

Q5. A teammate wants to rule out a runtime forever because it can't make in-VPC calls today. Push back. A. A feature gap is a fast-moving input, not an anchor. Runtimes ship gate-flipping features every quarter; ruling a platform out forever on a gap that may close in one GA announcement is feature-gap permanence. Tag it as a trigger — "re-check in-VPC support next review" — and rule it out for now, not forever. Contrast that with an anchor like our data's cloud, which won't change next quarter and genuinely does rule platforms out structurally. Common wrong answer to avoid: "If it can't do it today, it's dead to us." That confuses a volatile feature gap with a structural anchor; tag it and re-check.

Q6. Walk the bank's whole decision and say which parts were anchors and which were the market's noise. (cumulative) A. File 01 chose the runtime family on anchors — control, gravity, team size. File 02 narrowed by gravity (data on AWS — an anchor). File 03's rubric scored capability, much of which is volatile (model menus, feature parity). File 05's cost curve used prices, the most volatile input of all. File 06's gates rest on anchors (residency, model-call ownership) but a gate verdict can flip if a vendor earns a region. So the structural decision — AWS-native runtime, owned model call — rests on anchors and holds; the rubric scores and cost projection rest on volatile inputs and get re-checked on triggers. The escape hatch (file 04) keeps switching cost low so the volatile parts can move the decision cheaply if they ever cross a trigger. Common wrong answer to avoid: "Every file's output is equally permanent." The gates and gravity are anchors; the rubric scores and cost projection are snapshots with a half-life.


Design/debug exercise (10 min)

Step 1 — Model it. Sort the bank's decision inputs by half-life and tag the volatile ones with a trigger:

INPUT                        HALF-LIFE   ROLE      TRIGGER (if volatile)
data on AWS                  years       anchor    —
team size (6)                year+       anchor    —
regulator: single boundary   years       anchor    —
bank owns model call         years       anchor    —
per-token / per-action price  weeks      volatile  price move > 30%
model menu size              monthly     volatile  (rarely decision-flipping)
rival's in-VPC gap           quarterly   volatile  gate now passable?
FAQ traffic volume           quarterly   volatile  crosses an order of magnitude
escape-hatch integrity       slow drift  volatile  calendar review (no single trigger)
→ anchors carry the decision; volatiles get triggers + a 6-month calendar look

Step 2 — Your turn. Run the six-month re-evaluation on the bank's internal-ops agent (KYC, compliance memos). Its anchors are stricter — residency may now demand the sovereign region (file 04), data sensitivity is higher. Re-confirm its anchors, then check whether any trigger fired (a sovereign-region requirement landing is an anchor change — does it flip the platform?). Decide on actions. Then run the same sort-and-trigger review on a platform decision from your own backlog: tag every input slow or fast, name the triggers, and set the calendar date.

Step 3 — Reproduce from memory. Redraw the half-life sort (slow anchors vs fast volatiles → bet vs trigger) and the switching-cost / volatility grid cold, then connect it to file 04: the escape hatch lowers switching cost, which moves you leftward in the grid and lets you revisit freely without thrashing. If you can explain why the bank stayed on AgentCore despite 35%-cheaper outcome pricing — anchor beats price — you own this chapter, and the module.


Operational memory

This chapter stepped back from how to decide to how to keep a decision honest in a market that won't hold still. The important idea is that decision inputs have different half-lives: slow-moving anchors (data gravity, team size, regulatory boundary, who owns the model call) carry the decision and barely move, while fast-moving inputs (prices, model menus, feature gaps) churn monthly and must be held loosely. A static decision freezes the volatile inputs at one instant and rots; a constant re-decision thrashes on inputs that switching cost makes too expensive to chase. The resolution is to anchor hard, tag the volatile inputs with named triggers, and schedule the look.

You learned to sort every input by half-life at decision time, re-confirm anchors in minutes rather than re-scoring them, bind re-evaluation to named triggers plus a six-month calendar date, and re-attach the unstated workload to any vendor claim before believing it. That solves the opening failure — the market invalidating three inputs in two quarters — because the bank's review confirmed its anchors held, checked four triggers, and produced two cheap actions (renegotiate a pricing tier as volume climbed toward the 100× curve, re-abstract one tool that had drifted platform-specific) and zero migration. The 35%-cheaper outcome pricing was real and still declined, because a price is volatile and the model-call gate is an anchor.

Carry this diagnostic forward: when the market moves, ask first "did an anchor move?" If no anchor moved, the structural decision holds and the review collapses to checking triggers. Refuse to re-litigate slow-moving facts, and refuse to believe a vendor claim until you've re-attached the workload it silently assumes.

Remember:

  • Sort every decision input by half-life: slow anchors carry the decision, fast inputs get triggers.
  • Anchors are data gravity, team size, regulatory boundary, model-call ownership; they rarely move.
  • Re-evaluate on named triggers (price move > 30%, anchor change, gate now passable, volume crosses 10×) plus a six-month calendar look for slow drift.
  • Cadence follows the ratio of switching cost to input volatility; the escape hatch lowers switching cost and frees you to revisit.
  • A vendor claim is true for its unstated workload; re-attach the workload and check it against your anchors before believing it.
  • There is no right platform — only the best current fit on your anchors, held honest by a scheduled review.

Bridge. We close the module where it began: there is no permanently-right platform, only a best current decision anchored on slow-moving facts and kept honest by a scheduled look. That same build-vs-buy judgment — what to own, what to rent, what to configure, and when to revisit — now turns inward. So far we've pointed it at the agents the business ships to customers. The next module points it at your own engineering function: when should GenAI write your code, review your diffs, and run inside your SDLC, and where do the same anchors (control, gravity, the cost of leaving, the gates that override the rubric) decide build-vs-buy for the tools your engineers use every day? → ../23_genai_for_sdlc/00-first-principles.md