07. Wiring the AI into the systems the business actually runs on¶

~17 min read. The model can hear, think, speak, and assist. But a call that doesn't authenticate the caller, doesn't pull the account, doesn't write a disposition, and transfers to a human who knows nothing — never happened, as far as the business is concerned. This chapter is the seam where the AI stops being a voice and starts being a system of record.

Built on 06-post-call-analytics.md. Analytics can only grade what the records show — so the records have to exist, be correct, and hang off the right account. This chapter wires the AI into the CRM and the CTI layer that every earlier chapter quietly assumed: the integration seam named in chapter 01, the disposition that makes a call real, and the warm vs cold transfer baton that decides whether the caller has to re-explain everything.

Note: this is where the running example finally closes its loop. The bot from chapters 02–04 authenticated a caller, the human in chapter 05 was assisted, and chapter 06 graded the result — but all of that depended on a CRM lookup mid-call, a screen pop to the human, and a disposition written back. This chapter is the plumbing that made the previous five chapters possible, examined head-on.

What the live and offline layers assumed but never built¶

Every chapter so far quietly borrowed from the CRM and never paid it back. Chapter 04's bot answered "what's my balance?" — but a balance lives in a billing system the bot had to query, against a caller it had to authenticate first. Chapter 05's assist surfaced "this customer has 2 prior unresolved contacts" — that came from the contact history in the CRM. Chapter 06 graded "did the agent follow the dispute script" against a disposition — a structured record someone had to write. None of those facts are in the audio. They live in Salesforce or Zendesk, behind an API, keyed to an account the AI must first identify.

So the AI is not a standalone voice. It is one more participant in a system that already has a customer record, a case, a contact history, an identity-verification policy, and a wrap-up form — all predating the AI by years. The work of this chapter is the seam: how the AI authenticates the caller mid-call, pulls the account so it and the human can both see it, and writes a clean structured outcome back so the call counts. Get this wrong and you get exactly chapter 00's opening disaster — a fluent bot that "couldn't pull up the account because it never authenticated the caller against the CRM," and a human who "answered a cold call: no transcript, no account, no reason for the transfer."

The pressures here are not latency (mostly) — they are coordination and integration. The AI must stay in sync with systems it doesn't own, over APIs that fail, with identity it must verify before it trusts, and a transfer that must carry state to a human who is a separate process entirely.

By the end you can name the three jobs of the CTI layer (screen pop, call control, agent state) and explain how the AI authenticates and looks up an account mid-call without blowing the turn budget or leaking data, how a disposition gets written back, and why the warm-transfer baton is the single highest-leverage integration in the whole stack.

What this file solves¶

A voice AI can answer fluently and still strand every caller because it never wired into the systems of record: it can't verify who's calling, can't see the account, can't hand a human the context, and can't write the outcome — so the business has a transcript of a call that, on paper, accomplished nothing. This file shows how the AI authenticates the caller against the CRM mid-call, pops the right account and case onto the human's screen, packs the warm-transfer baton (transcript, auth state, intent, account), and writes a structured disposition back to Salesforce or Zendesk — so the call is identified, actioned, handed off cleanly, and recorded.

Why "let the AI just call the CRM API" strands callers and leaks data¶

The obvious build: give the AI a set of tools — lookup_account, get_balance, take_payment, transfer — point them at the CRM's REST API, and let the model call them as the conversation flows. Every piece works in isolation. A smart team ships this in a week.

It fails on contact with a real call, three ways. First, identity: the bot calls get_balance for a phone number, but a phone number is not an authenticated identity — anyone can spoof caller ID, and now the bot has read a stranger their victim's balance. Second, the handoff: the bot decides to transfer, calls transfer("tier2"), and the call lands on a human with nothing — no transcript, no account, no reason — because "transfer" in telephony is just bridging a call leg; it carries no application state unless you explicitly attach it. The human says "how can I help you?" and the caller, who just spent four minutes explaining, snaps. Third, the record: the call ends, the bot has no concept of "wrap-up," so no disposition is written — and to analytics (chapter 06), to the next agent, and to the audit, the call is a void.

So the real problem is not "the AI needs more tools." It is that a CRM/CTI integration is not a set of API calls — it is a stateful contract: who is verified, what's on screen, what gets carried across a transfer, and what gets written back. A tool that returns account data is useless if the caller isn't verified; a transfer that doesn't carry state produces a cold handoff; a call without a disposition didn't happen. How does the AI participate in that contract — verifying before it trusts, carrying state across the seam, and recording the outcome — instead of just firing isolated calls?

That question shapes the whole chapter. Authenticate before you expose account data. Treat the transfer as a baton handoff, not a call-leg bridge. Treat the disposition as the call's only durable artifact. This is what CTI was built for: Salesforce Service Cloud Voice pops the account on call connect, carries call context through Omni-Channel routing, and (with Copilot) drafts the disposition the agent confirms — the same shape, productized.

Rule: verify before you expose, carry state across the seam, record before you hang up¶

The load-bearing rule of CRM/CTI integration: the AI must authenticate the caller before exposing any account data, carry full context across every transfer (warm, not cold), and write a structured disposition before the call ends — because account data without verified identity is a breach, a transfer without state is a cold handoff, and a call without a disposition is a void. Integration correctness is a state contract, not a pile of API calls.

Why this rule exists. The primitive is that the CRM is the system of record and the call is a transient process touching it. The constraints are three: identity must gate data (an unverified phone number is not a person), a transfer crosses a process boundary that drops application state by default, and the call's only durable trace is what you write back. Each clause of the rule plugs one gap — verify gates the data, carry-state defeats the cold handoff, record defeats the void. Skip any one and you reproduce chapter 00's opening failure.

1) The CTI layer — screen pop, call control, agent state, and where the AI plugs in¶

CTI (computer-telephony integration) is the bridge between the phone system and the CRM. It does three jobs, and the AI uses all three.

        TELEPHONY (ch 02)            CTI LAYER                CRM (system of record)
   ┌──────────────────────┐   ┌────────────────────────┐   ┌─────────────────────────┐
   │ call arrives, caller  │   │ 1. SCREEN POP          │   │ Account / Contact        │
   │ ID + IVR digits + DNIS │──▶│    identify → open the │──▶│ Case / Ticket            │
   │                       │   │    matching record      │   │ Contact history          │
   │                       │   │ 2. CALL CONTROL        │   │ Disposition fields       │
   │                       │◀──│    answer/hold/xfer/    │◀──│ (Call Resolution,        │
   │                       │   │    conference via API   │   │  Description, outcome)   │
   │                       │   │ 3. AGENT STATE         │   │                          │
   │                       │   │    Available/On-Call/   │   │                          │
   │                       │   │    After-Call-Work      │   │                          │
   └──────────────────────┘   └────────────────────────┘   └─────────────────────────┘
              ▲                          ▲
              │                          │ the AI is just another "agent"
              └──────────  AI VOICE AGENT ── on this layer ──┘

Screen pop: on call connect, CTI matches the caller (by ANI, IVR-entered account number, or an authenticated token) and opens the matching record. For a human, this is the account appearing on screen before they say hello. For the AI, the "pop" is the same lookup: the AI fetches the account into its working context so it can answer grounded questions instead of guessing.

Call control: answer, hold, transfer, conference, all exposed as API verbs the AI calls instead of pressing buttons.

Agent state: the AI, like a human, has a status (On-Call, After-Call-Work) so routing (chapter 01's ACD) knows it's busy and doesn't send it a second call mid-wrap-up.

The key reframing: the AI is just another agent on the CTI layer. It pops records, controls the call, and reports state through the same APIs a human desktop uses. This is why Service Cloud Voice and the Talk Partner Edition CTI in Zendesk can host a bot or a human behind the same screen-pop and call-control plumbing — the integration contract is identical.

For the billing line, the screen pop is the account behind "what's my balance?" — but the AI must not read that balance until the next section's gate is passed.
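The agent-state job can be sketched as a small state machine. This is a minimal illustration, not any vendor's API: `AgentSeat`, `ALLOWED`, and `pick_agent` are hypothetical names, and the three states are the ones shown in the diagram above. It shows why a bot in After-Call-Work is not offered a second call.

```python
from enum import Enum
from typing import List, Optional


class AgentState(Enum):
    AVAILABLE = "Available"
    ON_CALL = "On-Call"
    AFTER_CALL_WORK = "After-Call-Work"


# Allowed transitions for one seat (human or AI); an assumption for this sketch.
ALLOWED = {
    AgentState.AVAILABLE: {AgentState.ON_CALL},
    AgentState.ON_CALL: {AgentState.AFTER_CALL_WORK},
    AgentState.AFTER_CALL_WORK: {AgentState.AVAILABLE},
}


class AgentSeat:
    """One seat on the CTI layer; routing treats bots and humans alike."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = AgentState.AVAILABLE

    def move_to(self, new_state: AgentState) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state


def pick_agent(seats: List[AgentSeat]) -> Optional[AgentSeat]:
    """Return the first seat that can take a call, or None if all are busy."""
    for seat in seats:
        if seat.state is AgentState.AVAILABLE:
            return seat
    return None


bot = AgentSeat("ai-voice-agent")
bot.move_to(AgentState.ON_CALL)          # call connected
bot.move_to(AgentState.AFTER_CALL_WORK)  # call ended, disposition being written
busy_result = pick_agent([bot])          # no seat offered during wrap-up
bot.move_to(AgentState.AVAILABLE)
free_result = pick_agent([bot])          # the bot is routable again
```

A real ACD has more states and skills-based routing; the point here is only that the AI reports state through the same contract a human desktop would.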

2) Picture: the AI as a clerk at a counter, not a voice in a vacuum¶

The mental model that keeps integration honest: the AI is a clerk standing at the CRM counter. Before it hands over account details, it checks ID. While it works, the customer's file is open on the counter in front of it. When it passes the customer to a colleague, it slides the whole open file across — not an empty desk. And before it closes, it stamps the file with what happened.

                       ┌─────────────────────────────────────┐
   CALLER  ──speaks──▶ │   AI CLERK at the CRM counter        │
                       │                                      │
                       │   1. "Verify you first" ── ID gate ──┼──▶ auth check
                       │   2. file open on counter ───────────┼──▶ account/case
                       │   3. slide WHOLE file to colleague ──┼──▶ warm transfer
                       │   4. stamp the file before closing ──┼──▶ disposition
                       └─────────────────────────────────────┘

   Cold handoff = sliding an EMPTY desk to the colleague (caller re-explains)
   No disposition = the file is never stamped (the visit didn't happen)

Contrast the two failure shapes the clerk avoids. The cold transfer is sliding an empty desk — the colleague has the customer but not the file, so the customer re-explains from scratch (chapter 00's one-star handoff). The void is never stamping the file — the visit leaves no trace, so analytics (chapter 06) can't grade it and the next call starts blind. Both are integration failures, not model failures. The model spoke fine; the clerk forgot to check ID, slide the file, or stamp it.

3) The running example: the billing call's full integration loop¶

Thread the whole call through the seam, end to end. The duplicate-charge caller from chapters 02–05 — here is everything that touched the CRM.

Attempt A — fire tools, no contract¶

The bot answers, hears "what's my balance," calls get_balance(ani=+1555...), and reads it out. The caller then says "and I want to dispute a charge." The bot decides this is complex and calls transfer("tier2"). The call bridges to a human who sees an empty screen and says "Hi, how can I help?" The caller, who was never actually authenticated (a matching phone number proves nothing) and has already explained the dispute, has to start over. The call ends; no disposition is written. Result: a possible data exposure (balance read to an unverified caller), a cold transfer, and a void record. Three breaches of the one rule.

Attempt B — verify, carry, record¶

The bot answers and pops the account by ANI as a candidate match — not yet trusted. Before reading any balance, it runs the auth gate (section 4): "to verify your identity, what are the last four of the card on file and your billing ZIP?" Only on a pass does it expose the balance. The caller says "I want to dispute the duplicate charge on the third and pay the rest." The bot handles the payment via the PCI-safe path (chapter 08), but the dispute is genuinely complex, so it transfers — and the transfer is warm: it packs the baton (transcript so far, auth=verified, account ID, intent="duplicate-charge dispute", the $59 amount) and attaches it to the call so it screen-pops on the tier-2 human's desktop. The human opens to the account, the transcript, and "verified caller, duplicate-charge dispute, $59" — and says "I see the duplicate $59 charge on the third, let me sort that out." When the call ends, the AI drafts the disposition — outcome code BILLING_DISPUTE_CREDIT, summary, follow-up flag — into the Voice Call record's Call Resolution and Description fields, which the human confirms.

The hard part hiding here: the baton. A telephony transfer carries a call leg, not application state. To make it warm, you attach context to a thing that survives the transfer — the CTI's call object, a contact-attributes payload, or a CRM case the human's screen pops to. Get the baton right and the human starts where the AI left off; drop it and you're back in Attempt A. This is the warm vs cold transfer pressure from chapter 01, now made concrete: warm = the file slides across; cold = the empty desk.
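One way to picture the baton is as a small payload attached to the call object before the leg is moved. The sketch below is an in-memory stand-in under stated assumptions: `Baton`, `CallObject`, `warm_transfer`, and `screen_pop` are hypothetical names, not a real CTI SDK, and the field values are taken from the billing example above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Baton:
    call_id: str
    account_id: str
    auth_level: str          # e.g. "verified" or "unverified"
    intent: str
    amount_cents: int
    transcript: list = field(default_factory=list)


class CallObject:
    """Stand-in for the CTI call object; its attributes survive the transfer."""

    def __init__(self, call_id: str) -> None:
        self.call_id = call_id
        self.attributes = {}
        self.queue = "ai"


def warm_transfer(call: CallObject, baton: Baton, queue: str) -> None:
    # Attach the context first, then move the call leg.
    call.attributes["baton"] = json.dumps(asdict(baton))
    call.queue = queue


def screen_pop(call: CallObject) -> dict:
    """What the human desktop would render on answer; {} means a cold transfer."""
    raw = call.attributes.get("baton")
    return json.loads(raw) if raw else {}


call = CallObject("call-42")
cold = screen_pop(call)  # nothing attached yet: the empty desk

baton = Baton(
    call_id="call-42",
    account_id="ACC-1001",              # hypothetical account ID
    auth_level="verified",
    intent="duplicate-charge dispute",
    amount_cents=5900,
    transcript=["caller: I want to dispute the duplicate charge on the third"],
)
warm_transfer(call, baton, queue="tier2")
popped = screen_pop(call)  # the whole file slides across
```

Platforms often cap the size of such payloads, so a production baton may carry a reference to the transcript rather than the full text.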

4) Authenticating the caller mid-call without blowing the turn budget or the blast radius¶

The seam everyone underestimates: identity. A caller ID is a claim, not a verified identity — it's trivially spoofable, and a household phone may have five people on it. Exposing account data to an unverified ANI is the breach in Attempt A.

The naive fix is to ask a pile of security questions, but that fights the turn budget (chapter 04), since every question is a turn, and it widens the blast radius (chapter 01): the more an unverified caller can probe, the worse a successful social-engineering attack becomes. So authentication is a graded gate, matched to what the caller wants to do.

Knowledge-based (last-four, ZIP, DOB)¶

Helps: works on any phone, no enrollment.

Hurts: the exact data attackers harvest from breaches; slow (multiple turns); fails for legitimate callers who forget.

Use when: low-sensitivity actions (check balance), as a fallback, or where stronger methods aren't enrolled.

One-time passcode (SMS/email to the number on file)¶

Helps: proves possession of an enrolled channel; fast; hard to social-engineer.

Hurts: requires the channel on file to be current; adds a turn and an out-of-band dependency.

Use when: account changes, payments, anything raising blast radius.

Voice biometrics (passive, on the enrolled voiceprint)¶

Helps: passive (no extra turns — verifies as the caller talks naturally), strong, low friction.

Hurts: enrollment cost, false-reject on illness/noise, and a deepfake surface that's growing fast (chapter 09).

Use when: high call volume where friction matters and the voiceprint is enrolled.

The discipline: gate the data behind the right strength for the action's blast radius. Reading a balance after a knowledge-based check (last-four plus ZIP on top of the ANI match) is fine; changing the address or taking a payment demands an OTP or biometric. The AI must refuse to expose account data until the gate for that action passes. That is the verify-before-expose clause of the rule. For the billing line, balance needs last-four + ZIP; the payment escalates to OTP; and on a failed gate the bot never reads the balance. It offers to retry or transfers to a human.

Teacher voice. Authentication is not a login step you do once at the top. It is a gate per action, sized to the action's blast radius. The same call can read a balance after a weak check and still demand a strong check before it moves money. Treat identity as a level the caller climbs, not a door they walk through.
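The per-action gate can be sketched as a lookup from action to required level. This is a minimal sketch: the level names, the action names, and the choice to treat OTP and voice biometrics as one "strong" tier are assumptions made for illustration, following the billing-line policy described above.

```python
from enum import IntEnum


class AuthLevel(IntEnum):
    NONE = 0     # ANI match only: a claim, not an identity
    KBA = 1      # knowledge-based: last-four + ZIP
    STRONG = 2   # OTP or voice biometric


# Required level per action (illustrative policy for the billing line).
REQUIRED = {
    "read_balance": AuthLevel.KBA,
    "take_payment": AuthLevel.STRONG,
    "change_address": AuthLevel.STRONG,
}


def gate(action: str, current: AuthLevel) -> str:
    """Return "allow" or "step_up" for this action at the caller's current level."""
    # Unknown actions fall back to the strongest gate rather than the weakest.
    required = REQUIRED.get(action, AuthLevel.STRONG)
    return "allow" if current >= required else "step_up"


after_ani = gate("read_balance", AuthLevel.NONE)        # ANI alone is not enough
after_kba = gate("read_balance", AuthLevel.KBA)         # balance may be read
pay_after_kba = gate("take_payment", AuthLevel.KBA)     # must climb to OTP/biometric
pay_after_otp = gate("take_payment", AuthLevel.STRONG)
```

Using an ordered level rather than a boolean "logged in" flag is what lets the same call read a balance after a weak check and still demand a strong one before moving money.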

5) The property that changes the design: the CRM is a system you don't own and that fails¶

The dimension teams get wrong: they treat the CRM API as if it's local, reliable, and instantaneous. It is none of those. It's a remote system you don't control, with rate limits, latency spikes, partial outages, and its own change windows. Every lookup the AI makes mid-call is a network call that can be slow, fail, or return stale data — inside the turn budget.

   Assumed:  AI ── instant, always-up ──▶ CRM        (works in the demo)
   Reality:  AI ── 200ms p50 / 2s p99 / sometimes 503 ──▶ CRM

   A get_balance that takes 2s eats the turn budget → dead air.
   A get_balance that 503s mid-call → the bot has nothing to say.

This asymmetry should change the design two ways. First, decouple the lookup from the turn: fetch the account on call-connect (the screen pop) and cache it, so when the caller asks for the balance the AI answers from cached context instead of a fresh call inside the turn budget. Pre-fetch on connect, don't fetch on question. Second, design the failure path: if the CRM is down or slow, the AI must degrade gracefully — "I'm having trouble pulling your account, let me connect you to someone who can help" (a warm transfer with whatever context exists) — not stall in dead air or invent a balance. The CRM being a fallible remote dependency is why the fallback-to-human path (chapter 04) and the warm baton (section 3) are not optional extras; they're the failure handler for the integration seam.

For the billing line: the account is popped and cached on connect, so "what's my balance" is answered locally; but a write (logging the dispute) that fails is queued and retried, never silently dropped — because a lost disposition is a void.
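Pre-fetch on connect plus a designed failure path might look like the following sketch. `fetch_account` is a stand-in that simulates a remote CRM with configurable latency and failure; the 0.2-second pre-fetch deadline, the class name, and the return strings are all assumptions for illustration.

```python
import asyncio
from typing import Optional


class CrmUnavailable(Exception):
    pass


async def fetch_account(account_id: str, delay_s: float, fail: bool = False) -> dict:
    """Stand-in for a remote CRM call with configurable latency and failure."""
    await asyncio.sleep(delay_s)
    if fail:
        raise CrmUnavailable("503")
    return {"account_id": account_id, "balance_cents": 12345}


class CallSession:
    def __init__(self) -> None:
        self.account: Optional[dict] = None

    async def on_connect(self, account_id: str, delay_s: float, fail: bool = False) -> None:
        # Pre-fetch at call start, off the turn-budget path, with a deadline.
        try:
            self.account = await asyncio.wait_for(
                fetch_account(account_id, delay_s, fail), timeout=0.2
            )
        except (asyncio.TimeoutError, CrmUnavailable):
            self.account = None  # remember the miss; never invent data

    def answer_balance(self) -> str:
        # Answered from cached context; no CRM call inside the turn.
        if self.account is None:
            return "transfer_warm"  # degrade gracefully instead of stalling
        return f"balance:{self.account['balance_cents']}"


async def demo() -> tuple:
    fast, slow, down = CallSession(), CallSession(), CallSession()
    await fast.on_connect("ACC-1001", delay_s=0.01)
    await slow.on_connect("ACC-1001", delay_s=1.0)             # exceeds the deadline
    await down.on_connect("ACC-1001", delay_s=0.01, fail=True)  # simulated 503
    return fast.answer_balance(), slow.answer_balance(), down.answer_balance()


results = asyncio.run(demo())
```

The sketch leaves out cache freshness; a stale cached balance is its own failure mode and would need a refresh or a version check before any write.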

6) One failure walked through: the double-charge from the retried disposition write¶

Incident: customers on the billing line occasionally see two dispute cases opened for one call, or a payment logged twice. The AI's logic looks correct. The CRM logs look correct — each write succeeded.

The chain: the AI writes the disposition to the CRM at call end. The CRM is briefly slow, the write takes 4 seconds, the AI's HTTP client times out at 3 seconds and retries. The first write actually succeeded server-side; the retry creates a second record. Now there are two dispute cases (or two payment logs) for one call. The AI did everything "right" — it retried a timed-out call, which is correct behavior for a failed call. The bug is that the call wasn't failed; it was slow, and the write was not idempotent.

The root cause is not a bad retry policy; retrying timeouts is correct. It's that a non-idempotent write retried after a timeout duplicates, because a timeout means "unknown," not "failed." The fix: make the write idempotent. Attach an idempotency key (the call ID + a write sequence) so the CRM treats a retry of the same logical write as a no-op, returning the existing record instead of creating a new one. Salesforce supports this through external-ID upserts and duplicate rules, and many other CRMs offer comparable mechanisms. This is the same at-least-once-delivery-meets-non-idempotent-handler failure that produces duplicate messages in any distributed system (chapter 08's audit trail, and the streaming platform ahead): the integration seam is a distributed-systems problem wearing a CRM costume.

Mini-FAQ. "Why not just stop retrying to avoid duplicates?" Because then a genuinely failed write silently drops the disposition — a void record, the other half of the rule. You need both: retry (so failures don't drop) and idempotency (so retries don't duplicate). Dropping either reintroduces a failure mode.
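The incident and its fix can be reproduced with an in-memory stand-in. `SlowCrm` is hypothetical: it simulates a CRM whose first response is lost to a client timeout even though the write landed. The key format (call ID plus a write sequence) follows the text; everything else is an assumption for illustration.

```python
class SlowCrm:
    """In-memory stand-in: the first write lands, but the client sees a timeout."""

    def __init__(self) -> None:
        self.records = {}         # idempotent store, keyed by idempotency key
        self.naive_records = []   # non-idempotent store
        self.calls = 0

    def insert(self, payload: dict) -> None:
        # Non-idempotent: every call creates a record, even a retry.
        self.calls += 1
        self.naive_records.append(payload)
        if self.calls == 1:
            raise TimeoutError("client gave up; the write still landed")

    def upsert(self, key: str, payload: dict) -> dict:
        # Idempotent: the same key returns the existing record.
        self.calls += 1
        record = self.records.setdefault(key, payload)
        if self.calls == 1:
            raise TimeoutError("client gave up; the write still landed")
        return record


def write_with_retry(write, attempts: int = 3):
    """Retry on timeout, because a timeout means "unknown", not "failed"."""
    for attempt in range(attempts):
        try:
            return write()
        except TimeoutError:
            if attempt == attempts - 1:
                raise


disposition = {"call_id": "call-42", "outcome": "BILLING_DISPUTE_CREDIT"}

naive = SlowCrm()
write_with_retry(lambda: naive.insert(disposition))
naive_count = len(naive.naive_records)   # duplicate created by the retry

keyed = SlowCrm()
key = f"{disposition['call_id']}:1"      # call ID + write sequence
write_with_retry(lambda: keyed.upsert(key, disposition))
keyed_count = len(keyed.records)         # retry was a no-op
```

Both runs retry once; only the keyed write leaves exactly one record behind.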

7) Cost and effort movement: what the integration buys, what it costs¶

Effects of doing the CRM/CTI seam properly versus the naive "fire tools" build (illustrative; varies by stack):

What it does	What it buys	What it costs	Who absorbs the cost
Pre-fetch account on connect (screen pop)	balance answered from cache, no in-turn CRM call	a wasted lookup on abandoned calls	CRM API quota
Auth gate sized to blast radius	data exposed only to verified callers	extra turns; some legit-caller friction	the turn budget + caller patience
Warm-transfer baton	human starts where AI left off; no re-explain	engineering the context payload + screen pop	integration code, CTI config
Idempotent disposition write	no duplicate cases/payments on retry	idempotency keys + upsert plumbing	a little write complexity
Disposition / wrap-up write-back	the call counts; analytics + audit can see it	a write per call; review burden (ch 05)	CRM storage, agent review

The pressure evolution: wiring the seam relieves the void-and-cold-handoff failures (every call is now identified, handed off warm, and recorded) but creates a coordination dependency on systems you don't own — the AI's correctness now hinges on a remote CRM's availability, latency, and idempotency semantics, absorbed by caching, retries-with-idempotency, and the graceful-degradation fallback. The auth gate relieves the breach risk but creates turn-budget and friction pressure, absorbed by the caller and by smart method selection (passive biometrics cost no turns). You're trading "fluent but stranding" for "correct but coupled."

8) Signals that the integration seam is the problem¶

Healthy: account pops before the AI answers grounded questions, auth passes for legit callers in one or two turns, transfers land on humans who don't ask the caller to repeat themselves, every completed call has exactly one disposition, and write-success rate to the CRM is ~100%.

First metric to degrade: transfer re-explain rate — the share of transferred calls where the human re-asks the caller for information the AI already had. When the baton breaks (a code change drops a context field, a screen-pop misfires), this rises before anyone notices, because the call still completes; it just completes painfully. AHT on transferred calls and post-transfer CSAT degrade with it.

Misleading metric people watch: CRM API success rate. It can be 100% — every write succeeded — while you're silently creating duplicate records (section 6). "All writes succeeded" says nothing about whether each logical call produced exactly one record.

First graph an expert opens: dispositions-per-call (should be exactly 1.0; a drift above 1.0 means duplicate writes, below means voids), and CRM-lookup latency p99 against the turn budget (a p99 creeping toward the budget means dead air is coming). The second graph: auth pass rate split by method and by legit-vs-fraud outcome — a falling legit-caller pass rate means the gate is too aggressive and you're transferring good callers; a rising fraud pass rate means it's too weak.
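A dispositions-per-call check might be computed as below. This is a sketch with made-up call IDs; `disposition_health` is a hypothetical helper. It also reports voids and duplicates separately, because one duplicate and one void in the same window average back to 1.0 and would hide each other in the headline ratio.

```python
from collections import Counter


def disposition_health(completed_calls: list, disposition_call_ids: list) -> dict:
    """Ratio of dispositions to completed calls, plus the offending call IDs."""
    counts = Counter(disposition_call_ids)
    voids = [c for c in completed_calls if counts[c] == 0]
    duplicates = [c for c in completed_calls if counts[c] > 1]
    total = sum(counts[c] for c in completed_calls)
    ratio = total / len(completed_calls) if completed_calls else 0.0
    return {"per_call": ratio, "voids": voids, "duplicates": duplicates}


# Illustrative data: c2 was written twice, c3 was never written.
report = disposition_health(
    completed_calls=["c1", "c2", "c3", "c4"],
    disposition_call_ids=["c1", "c2", "c2", "c4"],
)
```

In this example the ratio is exactly 1.0 even though one call is duplicated and another is void, which is why the per-call counts are worth alerting on alongside the average.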

9) Boundary: where tight CRM integration shines, where it becomes brittle¶

Tight CRM/CTI integration shines on transactional, account-bound calls — billing, orders, account changes, support tickets — where the value is the account context: who's calling, what they own, what happened before, and recording what happened next. Here the screen pop, the baton, and the disposition are the whole job, and a clean seam is the difference between a bot that operates the business and one that just talks.

It becomes brittle when the AI is coupled to too many systems, each with its own auth, latency, and failure mode. A call that must touch billing, the order system, the loyalty platform, and a third-party warranty API is at best as reliable as the flakiest of them (and usually less, since any one of them failing stalls the call), and each is a fresh way to stall mid-turn. The scale limit that inverts intuition: more integrations make the bot more capable but less reliable, because each system added widens what the bot can do and widens the surface of remote failures inside the turn budget. Past a few synchronous dependencies, you must move lookups off the critical path (pre-fetch, cache, async-with-fallback) or the bot's reliability degrades faster than its capability grows. The instinct to "just integrate one more system" is exactly backwards under latency pressure.
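The arithmetic behind "more capable but less reliable" can be sketched as follows. It assumes that every dependency is called synchronously on each call and that their failures are independent; the 99.9% figure is illustrative, not a measurement from any real system.

```python
import math


def combined_availability(availabilities: list) -> float:
    """Availability of a call that needs every dependency to respond.

    Assumes independent failures and synchronous use of each dependency.
    """
    return math.prod(availabilities)


one_system = combined_availability([0.999])
four_systems = combined_availability([0.999] * 4)   # billing, orders, loyalty, warranty

# Share of calls expected to hit at least one failing dependency.
failure_share_one = 1 - one_system
failure_share_four = 1 - four_systems
```

Under these assumptions, four dependencies roughly quadruple the share of calls that hit a failure, which is the case for moving lookups off the critical path rather than adding another synchronous call.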

10) Wrong assumption: "the transfer hands off the call, so the context goes with it"¶

The seductive idea: when the AI transfers to a human, the call goes to the human, so naturally the context goes too — the human can see what happened. This is false and it's the single most common contact-center-AI failure. A telephony transfer bridges an audio leg. It carries no transcript, no account, no auth state, no intent — nothing the application knows — unless you explicitly attach and pop it. The default transfer is cold. The context does not ride along for free.

Replace it with: a transfer carries only the audio unless you build a baton; warm transfer is engineered, cold transfer is the default. This reframing changes where you spend effort: the warm-transfer baton (transcript + account + auth + intent attached to the call object and screen-popped to the human) is not a polish feature — it's the load-bearing integration that turns "the bot couldn't handle it" from a one-star re-explain into a smooth escalation. It's also why chapter 05's assist panel could show the human the transcript and account: that context arrived because the baton carried it, not because the transfer magically did.

11) Other ways the integration seam bites¶

Cold transfer — context not attached; human starts blind; caller re-explains (chapter 00's disaster).
Duplicate records — non-idempotent write retried after a timeout; two cases or two payments per call (section 6).
Void disposition — call ends with no write; analytics (ch 06) and audit can't see it; the call "didn't happen."
Unverified data exposure — balance/account read to a spoofed ANI before the auth gate; a reportable breach.
Stale screen pop — cached account is out of date; the AI answers from a stale balance the customer already paid.
CRM-down dead air — a lookup stalls inside the turn budget; the bot freezes instead of degrading to a warm transfer.
Wrong-record pop — ANI matches the wrong account (shared phone, ported number); the AI discusses someone else's bill.
Disposition rubber-stamp — agent confirms an AI-drafted wrap-up without reading; wrong outcome poisons analytics (ch 05, ch 06).
Over-coupling — too many synchronous integrations; one flaky third-party API stalls every call that touches it (section 9).

12) Pattern transfer¶

The warm-transfer baton is process handoff with state, not just a connection — structurally identical to passing request context across a service boundary (trace ID, auth token, session) so the next service doesn't start blind. A cold transfer is dropping the context object at the RPC boundary; the baton is propagating it. Same failure geometry as losing the trace context across an async hop.
Idempotent disposition writes are at-least-once delivery meeting a non-idempotent handler — the duplicate-case bug (section 6) is the same shape as a message queue redelivering a message to a handler that isn't idempotent. The fix is the same: an idempotency key so retries are no-ops. This recurs in chapter 08's audit log and again in the streaming platform ahead.
Pre-fetch-and-cache the account is moving work off the critical path — same as warming a cache or prefetching before the latency-sensitive operation, exactly the chapter 04 instinct of hiding latency by not doing the slow thing inside the deadline. The turn budget and the CRM's p99 are the two clocks you're keeping apart.

13) Design test¶

Does the AI authenticate the caller — with strength matched to the action's blast radius — before exposing any account data?
Is every transfer warm: does the human's screen pop to the account, transcript, auth state, and intent, so the caller never re-explains?
Is every disposition write idempotent (keyed) and retried, so a slow CRM produces neither a duplicate nor a void?
Are account lookups pre-fetched on connect and cached, off the turn-budget critical path, with a graceful fallback when the CRM is slow or down?
Does every completed call produce exactly one disposition, and do you track dispositions-per-call to catch duplicates and voids?

Where this appears in production¶

CTI and screen pop

Salesforce Service Cloud Voice — unifies telephony and CRM; the standard Voice Call object screen-pops on connect with IVR, caller, lifecycle, transcript, and disposition data; Omni-Channel routes the call with context attached.
Salesforce Open CTI — the legacy screen-pop/call-control API, end-of-life 28 Feb 2028; new builds use Service Cloud Voice's contact-center plumbing instead.
Zendesk Talk Partner Edition (CTI): "pop a ticket" actions open the matching ticket to the agent on answer or after a transfer; recording_url attaches the recording to a voice comment.
Amazon Connect + CRM (Open CTI / connectors) — contact attributes carry context through routing and pop the matching record via a Lightning component.
Genesys Cloud agent desktop — screen pop and call control unified with the CRM record for the agent (or bot) handling the contact.
NovelVox / Tenfold / Talkdesk connectors — CTI adapters that pop the CRM record and map agent state for Cisco, Avaya, Genesys behind Salesforce/Zendesk.

Auth, baton, and write-back

Voice biometrics (Pindrop, Nuance Gatekeeper) — passive voiceprint verification that authenticates as the caller talks, spending no turns.
OTP step-up (SMS/email to channel on file) — proves possession before high-blast-radius actions like payments or address changes.
Knowledge-based verification (last-four, ZIP, DOB) — the weak fallback gate for low-sensitivity lookups.
Contact-attributes / call-object payload — the warm-transfer baton: transcript, account ID, auth state, and intent attached to the call so it survives the transfer.
Salesforce external-ID upsert / duplicate rules — idempotent disposition writes keyed on the call ID so a retried write doesn't create a second case.
Service Cloud Voice Copilot disposition + summary — auto-populates Call Resolution and Description fields with a draft the agent confirms (chapter 05's auto-summary, written to the record).
Zendesk ticket disposition / macros — the structured outcome and wrap-up written to the ticket so the contact counts.
Disposition / wrap-up codes — the structured outcome that makes the call visible to analytics (ch 06) and audit (ch 08).
Pre-fetch-on-connect caching — the account loaded at call start and cached so balance questions are answered off the turn-budget critical path.

Recall¶

Why is "give the AI CRM tools and let it call them" not enough — what three gaps does it leave?
What three things must the AI do, per the chapter's rule, and which failure does each prevent?
Why is a caller ID not an authenticated identity, and how is the auth gate sized to the action?
Why does a telephony transfer not carry context by default, and what is the warm-transfer baton?
How does a successful CRM write still produce duplicate records, and what fixes it?
Why must account lookups be pre-fetched on connect rather than fetched when the caller asks?
Why does adding more system integrations make a bot more capable but less reliable?

Interview Q&A¶

Q1. Your voice bot answers fluently but transferred callers complain they have to repeat everything. Where's the bug?

The transfer is cold: the bot bridges the audio leg but attaches no context, so the human starts blind. A telephony transfer carries no application state by default; you have to engineer a warm-transfer baton: attach the transcript, account ID, auth state, and intent to the call object so the human's screen pops to it. The re-explain rate is the metric that surfaces this, and it stays hidden because the call still completes; it just completes painfully.

Common wrong answer to avoid: "the transfer routes the call, so the context goes with it." That's exactly the false assumption; the transfer carries audio only, and the baton is built, not free.

Q2. Should the AI fetch the account when the caller asks for their balance, or earlier? Earlier. Pre-fetch on call connect (the screen pop) and cache it, so the balance is answered from local context, not from a fresh CRM call inside the turn budget. The CRM is a remote system with a slow p99 and occasional outages; a lookup triggered by the question can stall into dead air. Pre-fetching moves the slow work off the latency-critical path, the same hide-the-latency instinct as the turn-budget chapter. Design the fallback too: if the CRM is down, degrade to a warm transfer; don't freeze or invent a balance. Common wrong answer to avoid: "fetch on demand, it's simpler." On-demand lookups put a fallible 2s-p99 dependency inside an 800ms turn budget; that's how you get dead air mid-call.
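
One way to sketch the pre-fetch in Q2, assuming a thread pool is acceptable for the background lookup. `AccountPrefetch`, `fetch_account`, the `"balance"` field, and the 0.2 s wait are all hypothetical placeholders; the real CRM client and the real turn budget come from your own stack.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Optional, Tuple


class AccountPrefetch:
    """Start the CRM lookup at connect; answer later from the finished result."""

    def __init__(self, fetch_account: Callable[[str], dict],
                 executor: ThreadPoolExecutor) -> None:
        self._fetch = fetch_account          # hypothetical CRM client call
        self._executor = executor
        self._pending: Dict[str, Future] = {}

    def on_connect(self, call_id: str, account_key: str) -> None:
        # Kick off the slow lookup now, outside any conversational turn.
        # Pre-fetching does not bypass the auth gate; it only warms the cache.
        self._pending[call_id] = self._executor.submit(self._fetch, account_key)

    def balance_reply(self, call_id: str,
                      wait_s: float = 0.2) -> Tuple[str, Optional[float]]:
        future = self._pending.get(call_id)
        if future is None:
            return ("warm_transfer", None)
        try:
            # Usually already done; wait only a small slice of the turn budget.
            account = future.result(timeout=wait_s)
        except Exception:
            # Slow or failed CRM: degrade, never freeze or invent a balance.
            return ("warm_transfer", None)
        return ("answer", account["balance"])
```

The design choice is that the caller's question never starts a CRM call. It only collects a result that was requested at connect time, and anything other than a timely success becomes a warm transfer.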

Q3. How should the AI authenticate a caller, and is one check at the top enough? No, one check at the top is not enough. Authenticate per action, sized to the action's blast radius: a weak gate (last-four + ZIP) for reading a balance, a stronger one (OTP or voice biometrics) before moving money or changing the account. A caller ID is a spoofable claim, not an identity, so the AI must refuse to expose account data until the gate for that action passes. Passive voice biometrics are attractive because they verify as the caller talks and cost no turns. Common wrong answer to avoid: "verify once at the start, then trust the caller for the whole call." Blast radius rises mid-call (a balance check versus a payment), so the gate must rise with it.
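
The per-action gate in Q3 can be sketched as a lookup from action to required auth strength. The level names, the action names, and the `gate` function are hypothetical; the mapping itself is policy that your identity-verification rules would define.

```python
from enum import IntEnum


class AuthLevel(IntEnum):
    NONE = 0      # caller ID only: a claim, not an identity
    BASIC = 1     # e.g. last-four + ZIP
    STEP_UP = 2   # e.g. OTP or voice biometrics


# Assumed policy table: the required strength grows with the blast radius.
REQUIRED_LEVEL = {
    "read_balance": AuthLevel.BASIC,
    "make_payment": AuthLevel.STEP_UP,
    "change_address": AuthLevel.STEP_UP,
}


def gate(action: str, current: AuthLevel) -> str:
    """Return "allow", or the challenge the caller must pass first."""
    # Unknown actions fall back to the strongest gate rather than the weakest.
    required = REQUIRED_LEVEL.get(action, AuthLevel.STEP_UP)
    if current >= required:
        return "allow"
    return f"challenge:{required.name}"
```

For example, `gate("read_balance", AuthLevel.BASIC)` returns `"allow"`, while `gate("make_payment", AuthLevel.BASIC)` returns `"challenge:STEP_UP"`: the same caller, in the same call, is challenged again when the action changes.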

Q4. CRM API success rate is 100% but you're seeing duplicate dispute cases. Explain it. A timeout is "unknown," not "failed." The write succeeded server-side but took longer than the client timeout, the client retried, and a non-idempotent write created a second record — every write "succeeded," so the success-rate dashboard is green while dispositions-per-call drifts above 1.0. The fix is an idempotency key (call ID + sequence, via external-ID upsert) so the retry is a no-op. You keep the retry (so genuine failures don't drop into voids) and add idempotency (so retries don't duplicate). Common wrong answer to avoid: "stop retrying to prevent duplicates" — then a genuinely failed write silently drops the disposition, producing a void; you need both retry and idempotency.
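
A small simulation of Q4, assuming the CRM offers an upsert keyed on an external ID. `FakeCrm` is an in-memory stand-in written for this sketch, not a real client, and the `call_id:sequence` key format is one possible convention.

```python
from typing import Dict


class FakeCrm:
    """In-memory stand-in: the write lands, but the first acks arrive too late."""

    def __init__(self, timeouts_before_ack: int = 1) -> None:
        self.records: Dict[str, dict] = {}
        self.writes_received = 0
        self._timeouts_left = timeouts_before_ack

    def upsert(self, external_id: str, fields: dict) -> str:
        self.writes_received += 1
        self.records[external_id] = fields     # the write succeeds server-side
        if self._timeouts_left > 0:
            self._timeouts_left -= 1
            raise TimeoutError("client gave up before the ack arrived")
        return external_id


def write_disposition(crm: FakeCrm, call_id: str, sequence: int,
                      fields: dict, max_attempts: int = 3) -> str:
    """Retry on timeout, always with the same idempotency key."""
    key = f"{call_id}:{sequence}"
    for _ in range(max_attempts):
        try:
            return crm.upsert(key, fields)
        except TimeoutError:
            continue   # outcome unknown, so retry; the key makes it a no-op
    raise RuntimeError(f"disposition {key} not confirmed; queue for follow-up")
```

With `FakeCrm(timeouts_before_ack=1)`, one call to `write_disposition` makes the server receive two writes yet hold one record. Swap the keyed upsert for a plain insert and the same retry would leave two.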

Q5. Why is "just integrate one more backend system" risky for a voice bot? Each synchronous integration widens capability but adds a remote dependency with its own latency and failure modes inside the turn budget — the bot becomes only as reliable as its flakiest backend, and reliability degrades faster than capability grows. Past a few dependencies you must move lookups off the critical path: pre-fetch, cache, or go async with a graceful fallback. The instinct to keep adding synchronous integrations is backwards under latency pressure. Common wrong answer to avoid: "more integrations just make it more capable" — capability rises but so does the surface of in-turn remote failures; reliability is gated by the worst dependency.
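
The arithmetic behind Q5 can be sketched under one loud assumption: the dependencies fail independently, which real backends often do not. The function name and the probabilities used below are illustrative, not measured figures.

```python
import math
from typing import Sequence


def in_budget_probability(per_dependency: Sequence[float]) -> float:
    """Chance that every synchronous lookup in a turn answers in time.

    Each entry is the probability that one dependency responds inside the
    turn budget. Independence is assumed, so the probabilities multiply.
    """
    return math.prod(per_dependency)
```

Five dependencies that each answer in time 99% of the time give about 0.951 for the turn as a whole. Adding a single 90% dependency to two 99% ones gives about 0.882, which is below the weakest link on its own. The product can never exceed its smallest factor, which is why the flakiest backend sets the ceiling.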

Q6. The AI-drafted disposition is live, but analytics (chapter 06) shows worsening data quality. How are these connected? The disposition is the call's only durable artifact and the input to analytics. If agents rubber-stamp AI-drafted dispositions without reading (chapter 05's review-burden failure), wrong outcomes flow straight into the CRM and then into chapter 06's analytics as confident garbage. The write-back seam and the analytics layer are the same data crossing a boundary: a wrong disposition here is a wrong analytics input there. Track the disposition edit rate; near-zero means rubber-stamping, not perfection. Common wrong answer to avoid: "the analytics model regressed" — the analytics may be fine; the source dispositions degraded because the human review step collapsed.
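
A sketch of the disposition edit rate from Q6, assuming you log both the AI draft and the text the agent finally saved. The function name and the whitespace-insensitive exact comparison are assumptions; a production metric might compare structured fields instead.

```python
from typing import Iterable, Optional, Tuple


def disposition_edit_rate(pairs: Iterable[Tuple[str, str]]) -> Optional[float]:
    """Fraction of dispositions where the agent changed the AI draft.

    Each pair is (ai_draft, agent_final). Returns None when there is no data,
    so an empty window is not mistaken for a 0% edit rate.
    """
    pairs = list(pairs)
    if not pairs:
        return None
    edited = sum(1 for draft, final in pairs if draft.strip() != final.strip())
    return edited / len(pairs)
```

Returning `None` for an empty window matters here because the signal of interest is a rate near zero, and "no calls" must not look like "no edits".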

Q7. (Cumulative) A caller's payment failed to log, but the dispute case was created twice and the balance was read to a spoofed number. Which parts of the rule broke? All three clauses. Reading the balance to a spoofed ANI broke verify-before-expose (no auth gate sized to the lookup). The dispute case created twice broke the record clause via a non-idempotent retried write (section 6). The payment that failed to log is the other half of the record clause — a write that wasn't retried, producing a void. One call managed to breach identity, duplicate a record, and void a record: the integration contract wasn't a contract at all, just isolated tool calls (Attempt A). Common wrong answer to avoid: "these are three unrelated bugs" — they're three faces of the one rule (verify, carry, record); the root cause is treating integration as API calls instead of a state contract.

Design/debug exercise (10 min)¶

Step 1 — Modeled example. Walk the billing call's full loop (section 3, Attempt B): pop-as-candidate on connect → auth gate (last-four + ZIP for balance, step-up OTP for payment) → answer from cache → warm-transfer baton (transcript + account + auth + intent + $59) screen-popped to tier-2 → idempotent disposition write at end. For each step, write the one failure if it's skipped (skip the baton → cold transfer; skip idempotency → duplicate case).

Step 2 — Your turn. A different account-bound call: "I want to change the address on my account and add a line." Design the integration: what auth strength does each action demand (address change vs adding a line), what gets pre-fetched on connect, what does the warm baton carry if this escalates to a human, and what disposition is written. Note where a CRM outage forces a graceful fallback and what the bot says.

Step 3 — Reproduce from memory. Redraw the "clerk at the CRM counter" diagram (section 2) cold — ID gate, open file, slide the whole file, stamp before closing — and label the cold-transfer and void failure shapes. Then connect to chapter 05: show that the assist panel could display the transcript and account only because the warm baton carried them across the transfer.

Operational memory¶

This chapter explained why a fluent voice bot can still strand every caller and leave the business with nothing: it answered, but it never verified who was calling, never carried context to the human, and never wrote the outcome — so the call exposed data it shouldn't, handed off cold, and left a void. The important idea is that CRM/CTI integration is a state contract — verify before you expose, carry state across the transfer, record before you hang up — not a pile of isolated API calls.

You learned to pre-fetch and pop the account on connect, gate data behind an auth check sized to the action's blast radius, pack a warm-transfer baton (transcript, account, auth, intent) that survives the transfer and screen-pops to the human, and write an idempotent disposition so a slow CRM produces neither a duplicate nor a void. That solves chapter 00's opening disaster because the failure was never the model — it was the missing seam to the systems of record.

Carry this diagnostic forward: when transferred callers re-explain, the baton broke — check the transfer payload before the model. When records duplicate while every write "succeeds," it's a non-idempotent retry of a slow (not failed) write — add an idempotency key. When the bot freezes mid-call, a CRM lookup stalled inside the turn budget — move it off the critical path and add a fallback.

Remember:

Integration is a state contract: verify before you expose, carry state across the seam, record before you hang up.
A caller ID is a claim, not an identity; gate data with auth strength matched to the action's blast radius.
A transfer carries audio only — warm transfer is engineered (the baton), cold transfer is the default.
Retry writes and make them idempotent; a timeout is "unknown," so retry-without-idempotency duplicates.
The CRM is a fallible remote system; pre-fetch on connect, cache, and degrade gracefully when it's slow or down.

Bridge. Every clause of the integration contract leaned on something we kept deferring: the AI authenticated callers, read account data, attached transcripts to transfers, and wrote dispositions — but what is the AI legally allowed to hear, store, and record? The moment the caller reads a card number aloud, or the call crosses into a two-party-consent state, the constraint stops being correctness and becomes law. Compliance, recording consent, and PCI scope are the next seam — and they retroactively constrain everything we just built. → 08-compliance-recording-and-pii.md