13. Honest admission — what async Python still does not make easy

~16 min read. Async Python is powerful, but several pain points remain genuinely hard even for strong teams.

Built on the ELI5 in 00-eli5.md. The kitchen lane — our scheduling picture — is useful, but real kitchens still face hidden complexity, noisy alarms, and awkward tools.


The colored function problem is still real

Look at the first problem plainly. Once one layer becomes async, the layers that call it often must become async too. That is the colored function problem. Calling an async function from plain sync code only creates a coroutine object. Nothing runs until an event loop drives it, so sync callers need an adapter.

sync code ──→ async dependency
    └── mismatch appears

See. This leaks through architecture. A sync library may force threadpools. An async dependency may push await upward through several service layers. The color spreads. That is not just beginner pain. It is a real design constraint.

Worked example. Suppose a sync billing service wants usage data from an async LLM client. Now we either push await upward through the billing path, or add an adapter layer with threads or queues. The mismatch is real. Naming conventions do not make it disappear.
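
A minimal sketch of the adapter option, assuming the billing code runs at a true sync edge with no event loop already running in its thread. The names fetch_usage and bill_customer, the token count, and the price are hypothetical stand-ins, not a real client API.

```python
import asyncio


async def fetch_usage(customer_id: str) -> int:
    """Hypothetical async LLM client call; the sleep stands in for network I/O."""
    await asyncio.sleep(0)
    return len(customer_id) * 100  # made-up token count for illustration


def bill_customer(customer_id: str, price_per_token: float = 0.002) -> float:
    """Sync billing entry point that adapts to the async client."""
    # asyncio.run starts an event loop, drives the coroutine to completion,
    # and closes the loop. It raises RuntimeError if a loop is already
    # running in this thread.
    tokens = asyncio.run(fetch_usage(customer_id))
    return tokens * price_per_token
```

The adapter only works from sync code. Call bill_customer from inside a coroutine and asyncio.run refuses with RuntimeError, which is the color problem showing up again one layer higher.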

Debugging async timing bugs stays difficult

Now what is the problem? A request sometimes hangs. Sometimes not. A child task fails, but only under load. A stream leaks, but only when the client disconnects mid-token. These are timing bugs. They are harder than direct exceptions.

works alone
   └── fails under concurrency
            ├── race
            ├── missing await
            ├── leaked task
            └── cancellation timing bug

Tracing helps. Structured logs help. Task ownership helps. Still, async debugging remains cognitively heavy. The order ticket moves across many awaits. The state is spread across time. That is fundamentally harder to inspect.
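
One way to see "works alone, fails under concurrency" is a lost update across an await. The sketch below is illustrative only: the shared dict, the function names, and the sleep(0) yield point are assumptions chosen to make the race reproducible.

```python
import asyncio


async def unsafe_increment(state: dict) -> None:
    current = state["count"]
    await asyncio.sleep(0)  # yield point: other tasks may run here
    state["count"] = current + 1  # may overwrite another task's update


async def safe_increment(state: dict, lock: asyncio.Lock) -> None:
    # The lock keeps the read, the await, and the write together.
    async with lock:
        current = state["count"]
        await asyncio.sleep(0)
        state["count"] = current + 1


async def run_increments(n: int, safe: bool) -> int:
    state = {"count": 0}
    lock = asyncio.Lock()
    if safe:
        jobs = [safe_increment(state, lock) for _ in range(n)]
    else:
        jobs = [unsafe_increment(state) for _ in range(n)]
    await asyncio.gather(*jobs)
    return state["count"]
```

With n=1 both versions return 1, so a single-request test passes. With n=10 the unsafe version still returns 1, because every task reads the old value before any task writes. The safe version returns 10.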

Cancellation semantics are useful but subtle

We taught cancellation as a clean design tool. That is true. But honest engineers admit something else. Cancellation can interrupt code in awkward places. Libraries differ in how well they cooperate. Cleanup paths are easy to miss.

Suppose one task holds a lock. Cancellation arrives. Did the lock release? Did the upstream stream close? Did the partial output leave your database half-written? This is why finally matters so much. But even with discipline, subtle cancellation bugs remain common.
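
The questions above can be made concrete with a small sketch. It assumes an asyncio.Lock guards the write and uses a list as a stand-in for the database and stream state. The function name write_report and the log messages are hypothetical.

```python
import asyncio


async def write_report(lock: asyncio.Lock, log: list) -> None:
    async with lock:  # released on exit, even when cancelled
        log.append("lock acquired")
        try:
            await asyncio.sleep(10)  # stands in for a slow upstream stream
            log.append("write committed")
        except asyncio.CancelledError:
            log.append("rolled back partial write")
            raise  # re-raise so the task really ends as cancelled
        finally:
            log.append("stream closed")  # runs on success and on cancel


async def main():
    lock = asyncio.Lock()
    log: list = []
    task = asyncio.create_task(write_report(lock, log))
    await asyncio.sleep(0)  # let the task start and take the lock
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        log.append("task cancelled")
    return log, lock.locked()
```

Here the cleanup is visible because it was written down. The subtle bugs come from the awaits where nobody wrote the except or finally, or where the handler swallows CancelledError instead of re-raising it.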

The cancel bell is necessary. It is not simple.

Async does not remove CPU and GIL limits

Another honest point. Async improves I/O concurrency. It does not erase CPU realities. The default CPython build still has the GIL, and an event loop runs its coroutines on one thread anyway. Heavy tokenization, OCR, PDF parsing, and vector math may need processes, compiled code, or external services.

Some teams over-async everything, then wonder why CPU saturation remains. The answer is boring. Wrong tool. The line cook is still one cook on one stove for CPU-bound work. The prep shelf and worker processes still matter.
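
A small sketch of the one-cook limit, assuming a pure-Python prime counter as a stand-in for heavy preprocessing. The names count_primes, heartbeat, and cpu_job are made up for illustration.

```python
import asyncio


def count_primes(limit: int) -> int:
    """Deliberately naive CPU-bound work with no await points."""
    found = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            found += 1
    return found


async def main() -> list:
    events = []

    async def heartbeat() -> None:
        for i in range(3):
            events.append(f"tick {i}")
            await asyncio.sleep(0)  # hand control back to the event loop

    async def cpu_job() -> None:
        events.append("cpu start")
        count_primes(20_000)  # the loop cannot switch tasks during this call
        events.append("cpu done")

    await asyncio.gather(heartbeat(), cpu_job())
    return events
```

The heartbeat gets one tick in, then waits until the CPU job finishes, because the job never reaches an await. Marking a function async does not add yield points to the computation inside it. Moving the work to worker processes or compiled code is the usual way out.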

Ecosystem quality is uneven

Async Python today is much better than before. Still, library support remains mixed. One dependency is async-native. Another is sync-only. Another says async, but hides blocking work inside. This creates sharp edges.

dependency stack
├── async HTTP client         good
├── async DB driver           good
├── sync PDF parser           awkward
└── unknown vendor SDK        risky

So what to do? Audit dependencies. Benchmark under realistic load. Use threadpools as bridges, not excuses. That is still the state of the world.
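
A minimal sketch of a threadpool bridge, assuming Python 3.9 or later for asyncio.to_thread. The function parse_pdf_sync is a hypothetical sync-only parser, and time.sleep stands in for its blocking work.

```python
import asyncio
import time


def parse_pdf_sync(path: str) -> str:
    """Hypothetical sync-only parser; the sleep stands in for blocking work."""
    time.sleep(0.05)
    return f"text of {path}"


async def parse_pdf(path: str) -> str:
    # Bridge: run the blocking call in the default thread pool so the
    # event loop stays free to serve other tasks.
    return await asyncio.to_thread(parse_pdf_sync, path)


async def parse_many(paths: list) -> list:
    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(parse_pdf(p) for p in paths))
```

The bridge keeps the loop responsive for blocking I/O and for extensions that release the GIL. Pure-Python CPU work inside the thread still contends for the GIL, so the bridge is not a substitute for worker processes.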

Observability for async systems is improving, not solved

Correlation ids help. Distributed tracing helps. Metrics help. Still, following one request across child tasks, streams, retries, and cancellations is not always pleasant.

A stack trace shows one moment. Async failures often live in relationships between moments. That makes tooling harder. It is better than before. It is not effortless. Strong teams invest in traces because intuition alone stops working.
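
One building block for correlation ids is contextvars from the standard library. The sketch below assumes a list as the log sink and uses hypothetical names (request_id, call_model, handle_request). It is a starting point, not a tracing system.

```python
import asyncio
import contextvars

# Hypothetical per-request correlation id; "-" means no request is active.
request_id = contextvars.ContextVar("request_id", default="-")


def log(events: list, message: str) -> None:
    events.append(f"[{request_id.get()}] {message}")


async def call_model(events: list, name: str) -> None:
    await asyncio.sleep(0)  # stands in for an upstream call
    log(events, f"{name} finished")


async def handle_request(events: list, rid: str) -> None:
    request_id.set(rid)
    log(events, "request started")
    # A task copies the current context when it is created, so each
    # child sees the id of the request that spawned it.
    await asyncio.gather(
        asyncio.create_task(call_model(events, "retrieval")),
        asyncio.create_task(call_model(events, "generation")),
    )


async def main() -> list:
    events: list = []
    await asyncio.gather(
        handle_request(events, "req-1"),
        handle_request(events, "req-2"),
    )
    return events
```

Every log line carries the id of its own request even though the two requests interleave. The id does not cross thread pools, queues, or network hops on its own, which is part of why this area still takes deliberate work.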

Senior honesty in interviews

If an interviewer asks, "What is still hard about async Python?" you should not act like everything is solved. Say this clearly. Async is excellent for I/O-heavy services. But it increases conceptual load, complicates debugging, does not solve CPU-bound work, and depends heavily on library quality and cancellation discipline.

That is the mature answer. Balanced. Useful. Grounded. No drama.

The kitchen lane remains a powerful picture. But real kitchens still face blind corners, misheard bells, and equipment that does not behave uniformly. See. That is honest engineering.


Where this lives in the wild

  • OpenAI-scale API operations — reliability engineer: cancellation, tracing, and cost leaks become difficult at very high stream concurrency.
  • Anthropic platform team — backend engineer: debugging intermittent async hangs requires traces, not just stack traces and hope.
  • Perplexity ingestion stack — infrastructure engineer: mixed async and sync libraries make real pipelines messier than whiteboard designs.
  • Enterprise AI platform — staff engineer: CPU-heavy parsing still needs workers and separate services despite an async serving layer.
  • Realtime collaboration product — principal engineer: socket lifecycles, backpressure, and reconnect logic keep async code operationally nontrivial.

Pause and recall

  • What is the colored function problem in practical terms?
  • Why are timing bugs in async systems harder than ordinary direct exceptions?
  • Why does async not solve CPU-bound workloads by itself?
  • In the analogy, what kinds of real-kitchen mess remain even with a good kitchen lane?

Interview Q&A

Q: Why is the colored function problem still relevant in modern Python codebases?
A: Because async boundaries propagate through call graphs, forcing architectural choices and adapters wherever sync and async worlds meet.
Common wrong answer to avoid: "It disappeared once FastAPI became popular."

Q: Why are async timing bugs often harder to debug than synchronous exceptions?
A: Their failure depends on ordering, cancellation, and scheduler interaction across time, so the bug may vanish when observed in a simpler environment.
Common wrong answer to avoid: "Because stack traces do not exist in async Python."

Q: Why does async not make CPU-heavy AI preprocessing magically scalable?
A: Async primarily improves waiting behavior, while CPU-bound work still contends for cores, the GIL, and process-level resources.
Common wrong answer to avoid: "Because await spreads CPU work across cores automatically."

Q: Why should a senior engineer stay cautious about async ecosystem claims?
A: Library quality, hidden blocking calls, cancellation behavior, and observability support vary widely, so design assumptions must be verified under load.
Common wrong answer to avoid: "If a library advertises async support, production behavior is guaranteed to be safe."


Apply now (5 min)

Exercise. Write down one async pain point you have seen, or imagine one from this module. Classify it as architecture, debugging, cancellation, CPU, or ecosystem mismatch. Then write one mitigation.

Sketch from memory. Draw one clean kitchen lane and three messy arrows: a blocking library, a lost child task, and a CPU-heavy job. Mark where the model breaks.


Bridge. Async APIs handle the serving lane. Next we move to storage for retrieval workloads, where embeddings and nearest-neighbor search live. → ../../03_vector_retrieval_infrastructure/00-eli5.md