09. WebSockets, SSE, and Long Polling: keeping the mailbox open for live updates

~15 min read. Some products need fresh updates continuously, not one reply per request.

Builds on the ELI5 in 00-eli5.md. The envelope (the packet carrying one message) now stretches into a live channel where updates keep arriving without starting over each time.

1) Why plain request-response feels clumsy for realtime work

Normal HTTP request-response is short-lived. Client asks. Server answers. Connection may stay reusable, but the application exchange finishes. That works for page loads and form submits. It feels clumsy for chat, dashboards, and alerts.

Why? Because the server often knows the update first. Polling asks repeatedly even when nothing changed. That wastes requests and increases delay. Think of old style as sending a fresh envelope every minute. Realtime tools instead keep the mailbox open.

One quick picture.

    ┌──────────┐    ask again     ┌────────────┐
    │ Browser  │ ──────────────→  │ Server     │  polling
    └──────────┘ ←──────────────  │ maybe none │
                                  └────────────┘

    ┌──────────┐   open channel   ┌────────────┐
    │ Browser  │ ═══════════════→ │ Server     │  persistent
    └──────────┘ ←══════════════  │ updates    │
                                  └────────────┘

So what are the main choices?

- WebSocket
- Server-Sent Events
- long polling

Each solves freshness differently.
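To make the wasted motion concrete, here is a minimal sketch that counts how many fixed-interval polls come back empty. The function name `count_polls` and the sample event times are illustrative assumptions, not measurements from a real system.

```python
from typing import Iterable, Tuple


def count_polls(poll_interval_s: int, duration_s: int,
                event_times_s: Iterable[int]) -> Tuple[int, int]:
    """Return (total_polls, empty_polls) for a client polling at a fixed interval.

    An event is picked up by the first poll that happens at or after it.
    """
    events = sorted(event_times_s)
    next_event = 0          # index of the first event not yet delivered
    total = 0
    empty = 0
    t = poll_interval_s     # first poll happens one interval after start
    while t <= duration_s:
        total += 1
        delivered = 0
        while next_event < len(events) and events[next_event] <= t:
            next_event += 1
            delivered += 1
        if delivered == 0:
            empty += 1      # this request carried no new information
        t += poll_interval_s
    return total, empty


# Hypothetical hour of traffic: one poll per minute, three updates in total.
total, empty = count_polls(60, 3600, [100, 1900, 1950])
print(total, empty)  # 60 57
```

With one poll per minute and only three updates in the hour, 57 of the 60 requests return nothing new, and each update still waits for the next poll before the client sees it.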

2) WebSocket keeps one full-duplex path alive

WebSocket begins with an HTTP upgrade handshake. Client sends headers asking to switch protocol. Server agrees with 101 Switching Protocols. After that, both sides can send frames any time. This is full duplex.

Diagram first.

    ┌──────────┐   GET + Upgrade   ┌────────────┐
    │ Client   │ ───────────────→  │ Server     │
    └──────────┘ ←───────────────  │ 101 reply  │
         │                         └────────────┘
         ╞══════════ WebSocket frames both ways ══════════╡

Now the good part. Chat messages flow fast. Presence updates flow fast. Games and collaborative editors benefit too. Connection reuse is excellent because one connection stays open.

But cost moves elsewhere. Open connections consume memory, file descriptors, and heartbeat traffic.

Worked example. Suppose one server can hold 60,000 WebSocket connections safely. Your app expects 240,000 concurrent users at peak.

Minimum servers for socket capacity = 240,000 ÷ 60,000 = 4.

Now add N+1 safety. If one server fails, 240,000 ÷ 3 = 80,000 connections per remaining server. That exceeds safe limit. So 4 is not enough.

Try 5 servers. If one fails, 240,000 ÷ 4 = 60,000 exactly. Still tight.

Sensible design may choose 6. Then one failure leaves 240,000 ÷ 5 = 48,000 each. Much better.

That is connection math, not only CPU math. WebSocket is wonderful for two-way low-latency interaction. It is not free.
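The upgrade handshake at the start of this section can be sketched from the server side. Per RFC 6455, the client sends a `Sec-WebSocket-Key` header, and the server proves it understood the request by returning a `Sec-WebSocket-Accept` value derived from that key. The helper names `accept_key` and `build_101_response` below are illustrative, not part of any library, and a real server would also validate the request headers first.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455; the server appends it to the client's key.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Derive Sec-WebSocket-Accept: base64(SHA-1(client key + GUID))."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_101_response(sec_websocket_key: str) -> str:
    """Build the raw HTTP reply that switches the connection to WebSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_key(sec_websocket_key)}\r\n"
        "\r\n"
    )


# Sample key taken from the RFC 6455 handshake example.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

After this reply the connection stops carrying HTTP messages and both sides exchange WebSocket frames.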

3) SSE is simpler when server talks one-way

Server-Sent Events means one persistent HTTP response stream. Server pushes text events to client. Client does not send arbitrary messages back on same channel. So SSE is one-way. That makes it simpler than WebSocket. Especially for browser dashboards and notifications.

Picture it.

    ┌──────────┐    GET /events     ┌──────────────┐
    │ Browser  │ ────────────────→  │ Server       │
    └──────────┘ ←════════════════  │ event stream │
                                    └──────────────┘

Typical uses:

- stock or metrics dashboards
- job progress updates
- notification feeds
- admin consoles

Worked example. A metrics page needs 1 update each second. 10,000 viewers are connected. Each update payload is 220 bytes.

Outbound bytes each second = 10,000 × 220 = 2,200,000 bytes. That is about 2.2 MB per second.

No repeated request headers are needed each second. That keeps overhead lower than frequent polling. SSE also reconnects fairly gracefully in browsers. But it is text-based and browser-oriented. Binary duplex interaction is not its strength.

Think of SSE as many small envelopes, all sliding through one open slot from server to client.
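A minimal sketch of what those "small envelopes" look like on the wire, plus the fan-out arithmetic from the worked example. The `text/event-stream` format uses `id:`, `event:`, and `data:` fields and ends each event with a blank line. The function names and the sample `metric` event are assumptions for illustration.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one event in text/event-stream format.

    Multi-line data becomes several `data:` lines; a blank line ends the event.
    """
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"


def fanout_bytes_per_second(viewers: int, payload_bytes: int,
                            updates_per_second: int = 1) -> int:
    """Outbound payload bytes per second when every viewer gets every update."""
    return viewers * payload_bytes * updates_per_second


print(repr(format_sse("cpu=42", event="metric", event_id="7")))
# 'id: 7\nevent: metric\ndata: cpu=42\n\n'
print(fanout_bytes_per_second(10_000, 220))  # 2200000
```

The `id:` field is what lets a browser resume after a reconnect: it sends the last seen id back in a `Last-Event-ID` request header.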

4) Long polling works almost everywhere, but wastes more motion

Long polling keeps one request open until data appears or timeout occurs. Server responds once. Client immediately opens another request. So it simulates push using repeated requests. This is older but still useful when infrastructure constraints exist.

One picture helps.

    Client ──request 1──→ Server
    Client ←─response 1── Server   when event arrives or timeout hits
    Client ──request 2──→ Server
    Client ←─response 2── Server

Now the cost math. Suppose 20,000 clients use long polling. Average poll cycle is 25 seconds.

Requests per second ≈ 20,000 ÷ 25 = 800.

If each request and response header pair costs 900 bytes, header traffic alone is 800 × 900 = 720,000 bytes per second. That is before actual event payload.

Compare with SSE. SSE opens once and keeps streaming. Long polling reopens repeatedly. That means more request handling, more TLS work whenever connections are not reused, more logs, more LB churn, and more chance of races, such as an event arriving in the gap between one response and the next request.

But it works with plain HTTP infrastructure. So teams still use it where upgrade paths are limited.
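The hold-until-event-or-timeout behaviour and the cost math can both be sketched in a few lines. This is an in-process illustration only: `handle_long_poll` stands in for a server handler, a `queue.Queue` stands in for the event source, and answering a timeout with status 204 is one common convention assumed here, not something the protocol requires.

```python
import queue


def handle_long_poll(events: "queue.Queue[str]", timeout_s: float) -> dict:
    """Hold the request until an event arrives or the timeout expires."""
    try:
        event = events.get(timeout=timeout_s)   # blocks while nothing is queued
    except queue.Empty:
        return {"status": 204, "event": None}   # timeout: client should re-poll
    return {"status": 200, "event": event}


def requests_per_second(clients: int, cycle_s: float) -> float:
    """Each client completes one request per poll cycle."""
    return clients / cycle_s


def header_bytes_per_second(clients: int, cycle_s: float,
                            header_pair_bytes: int) -> float:
    """Header overhead alone, before any event payload."""
    return requests_per_second(clients, cycle_s) * header_pair_bytes


pending = queue.Queue()
pending.put("order-shipped")                 # hypothetical event name
print(handle_long_poll(pending, 0.05))       # event is returned immediately
print(handle_long_poll(pending, 0.05))       # nothing queued, so this times out
print(requests_per_second(20_000, 25))       # 800.0
print(header_bytes_per_second(20_000, 25, 900))  # 720000.0
```

In a real client the second call would be issued right after the first response, which is the reconnect loop shown in the picture above.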

5) Choosing between them with product examples

Use WebSocket when both sides talk frequently. Chat is the classic example. Multiplayer coordination is another. Collaborative editing also fits.

Use SSE when server mainly pushes updates outward. Live dashboards fit well. Notification toasts fit well.

Use long polling when compatibility matters more than elegance. Legacy enterprise proxies sometimes push teams there.

One comparison table.

    ┌──────────────┬────────────────┬────────────────┬──────────────────┐
    │ Mechanism    │ Direction      │ Connection use │ Best fit         │
    ├──────────────┼────────────────┼────────────────┼──────────────────┤
    │ WebSocket    │ two-way        │ one long-lived │ chat, games      │
    │ SSE          │ server to user │ one long-lived │ dashboards       │
    │ Long polling │ pseudo push    │ many requests  │ broad fallback   │
    └──────────────┴────────────────┴────────────────┴──────────────────┘

One more worked latency example. New event appears at server at time zero.

- With long polling on a 20-second timeout, a request is usually already open and waiting, so delivery delay can be small. The reconnect churn remains.
- With polling every 5 seconds, average freshness lag is about 2.5 seconds, half the interval.
- With SSE or WebSocket, lag can be near network latency, say 80 ms.

That is why realtime products feel truly live on persistent channels.
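The 2.5-second figure follows from events landing at a uniformly random point inside a poll interval, so on average they wait half of it. A small sketch can check that numerically. The function names are illustrative, and the model assumes polls fire at exact multiples of the interval with no network delay.

```python
import math


def polling_lag_s(event_time_s: float, interval_s: float) -> float:
    """Delay until the next poll (at a multiple of interval_s) sees the event."""
    next_poll = math.ceil(event_time_s / interval_s) * interval_s
    return next_poll - event_time_s


def average_polling_lag_s(interval_s: float, samples: int = 1000) -> float:
    """Average lag over events spread evenly across one poll interval."""
    total = 0.0
    for k in range(samples):
        # Midpoint sampling keeps the events evenly spaced inside the interval.
        event_time = (k + 0.5) * interval_s / samples
        total += polling_lag_s(event_time, interval_s)
    return total / samples


print(round(average_polling_lag_s(5), 3))  # 2.5
```

Shrinking the interval lowers the lag but raises the request rate in the same proportion, which is the trade-off persistent channels avoid.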

6) Persistent channels change load-balancer and server design

Once connections stay open, your load balancer must care about connection count, not only request rate. Health checks and draining become more delicate. One server restart can drop thousands of live users. Sticky routing may also matter. Presence state or room membership may live in one process.

So the network story is bigger than protocol syntax. It is connection lifecycle management.

Keep this summary.

- request-response optimizes simplicity
- WebSocket optimizes two-way low latency
- SSE optimizes one-way streaming simplicity
- long polling optimizes compatibility when better channels are unavailable

Choose by traffic pattern. Not by fashion.
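Sizing by connection count, as this section asks, can reuse the N+1 arithmetic from section 2. The sketch below is one way to write it down. The function names and the optional `headroom_pct` parameter are assumptions added for illustration, and 20% headroom is simply the value that reproduces the six-server answer from the worked example.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division without floating point."""
    return -(-a // b)


def servers_needed(peak_connections: int, safe_per_server: int,
                   tolerated_failures: int = 1, headroom_pct: int = 0) -> int:
    """Servers required so survivors stay within the limit after failures.

    headroom_pct reserves part of each server's safe limit as spare capacity.
    """
    usable = safe_per_server * (100 - headroom_pct) // 100
    return ceil_div(peak_connections, usable) + tolerated_failures


def connections_per_survivor(peak_connections: int, servers: int,
                             failed: int = 1) -> int:
    """Load on each remaining server once `failed` servers are gone."""
    return ceil_div(peak_connections, servers - failed)


print(servers_needed(240_000, 60_000))                   # 5
print(servers_needed(240_000, 60_000, headroom_pct=20))  # 6
print(connections_per_survivor(240_000, 6))              # 48000
```

The same check is worth running before a planned drain, since taking a server out for a restart is a failure from the point of view of the connections it holds.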

Where this lives in the wild

Slack — realtime platform engineer uses persistent channels so typing indicators, messages, and presence updates feel immediate.
Grafana — dashboard engineer benefits from SSE-like streaming for live charts where server mainly pushes new points outward.
Zerodha — market data engineer cares deeply about connection counts and low-latency streaming for price updates.
GitHub — notifications engineer may use simpler push-friendly channels for job progress and event feeds in browsers.
Uber — dispatch systems engineer chooses the right live transport for driver location, rider status, and operational dashboards.

Pause and recall

Why does polling waste work when nothing changes?
What exactly happens during the WebSocket upgrade handshake?
When is SSE a better fit than WebSocket?
Why does persistent connection count matter as much as request rate?

Interview Q&A

Q: Why choose WebSocket for chat and not simple polling?
A: Chat needs low-latency two-way communication with many small messages. WebSocket keeps one duplex connection open so each message does not pay repeated request setup cost.
Common wrong answer to avoid: "Because WebSocket is always faster than HTTP." The real reason is persistent two-way interaction, not a magical speed guarantee.

Q: Why choose SSE for a dashboard instead of WebSocket?
A: If updates mainly flow from server to browser, SSE is simpler and fits the one-way stream cleanly. You keep persistent delivery without full duplex complexity.
Common wrong answer to avoid: "Because SSE uses less bandwidth in every case." Simplicity and directionality are the main reasons.

Q: Why is long polling still used sometimes?
A: It works over plain HTTP infrastructure and survives environments where upgrades or persistent streaming support are constrained. Compatibility can outweigh elegance.
Common wrong answer to avoid: "Because companies do not know WebSocket." Often the constraint is network environment or platform compatibility.

Q: Why does realtime design force new capacity math?
A: Open connections consume memory, descriptors, heartbeat traffic, and balancer state even when message volume is low. Capacity must be sized for concurrency, not only request bursts.
Common wrong answer to avoid: "Only CPU matters once the app is optimized." Connection footprint can become the first bottleneck.

Apply now (5 min)

Exercise: Assume 30,000 users need one-way notifications. Decide between SSE and WebSocket first. Then estimate outbound traffic if each event is 180 bytes and arrives once per second. Next, imagine long polling with a 30-second cycle for same users. Compute approximate requests per second. Finally, write one sentence on why a load balancer must drain carefully for persistent channels.

Sketch from memory: Draw one polling loop, one SSE stream, and one WebSocket duplex channel. Under each, write direction, connection count style, and best product example.

Bridge. We now understand live connections on public networks. Next we move inside cloud boundaries and learn how private subnets, NAT, and VPCs shape who can talk to whom. → 10-vpc-and-private-networking.md