
05. HTTP Versions — three generations of internet envelopes

~14 min read. Same web idea, different wire formats, very different latency.

Built on the ELI5 in 00-eli5.md. The envelope — the format carrying web messages — changed across HTTP generations for speed.


1) HTTP/1.1 is simple, text-based, and often connection-hungry

HTTP/1.1 requests are human-readable text. That makes debugging easy, but framing less efficient. A simplified interview model treats it as one request at a time per connection. In practice HTTP/1.1 keeps connections alive by default, and browsers open several parallel connections per host. Even then, limits per host created waiting. One slow response could delay later useful work. That is head-of-line blocking at the application level. A sketch shows the pain.

┌────────────┐       conn 1        ┌────────────┐
│  Browser   │ ──────────────────▶ │   Server   │
│            │ ◀────────────────── │   HTML     │
│            │       conn 2        │   CSS      │
│            │ ──────────────────▶ │   JS       │
│            │ ◀────────────────── │   Image    │
└────────────┘                     └────────────┘

Suppose a page needs HTML, CSS, JS, and three images. The browser opens up to six TCP connections to the same host. Handshake and TLS may repeat on cold connections. If each new connection costs 60 ms of setup, those connections burn noticeable latency before payload transfer. Text headers also repeat heavily. Cookies may travel on every request again and again.

Browsers once used domain sharding to increase parallelism. Static assets moved to several hostnames like img1 and img2. That worked around per-host connection limits. But it also increased DNS, TCP, and TLS overhead. HTTP/2 reduced the need for that trick. So some old performance advice became outdated. Keep-alive helped too, but only within each connection. Idle connection reuse was better than reconnecting every time.
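
To make the "human-readable text" and "headers repeat" points concrete, here is a minimal sketch that builds raw HTTP/1.1 request text and counts how many cookie bytes travel across six requests. The host name, paths, header set, and cookie size are made-up illustration values, not taken from any real site.

```python
def build_request(host: str, path: str, cookie: str) -> bytes:
    """Return a raw HTTP/1.1 GET request as it would appear on the wire."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: */*",
        f"Cookie: {cookie}",
        "",  # the two empty items yield the blank line
        "",  # that ends the header section
    ]
    return "\r\n".join(lines).encode("ascii")

# Hypothetical values for illustration only.
HOST = "shop.example"
PATHS = ["/", "/app.css", "/app.js", "/a.png", "/b.png", "/c.png"]
COOKIE = "session=" + "x" * 292  # a 300-byte cookie string

raw_requests = [build_request(HOST, p, COOKIE) for p in PATHS]
total_bytes = sum(len(r) for r in raw_requests)
cookie_bytes = len(COOKIE) * len(PATHS)

print(raw_requests[0].decode("ascii"))  # plain text, readable as-is
print(f"total request bytes: {total_bytes}")
print(f"repeated cookie bytes: {cookie_bytes}")  # 1800
```

Every request carries the full cookie again, so most of the request bytes here are the same metadata sent six times.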

2) HTTP/2 keeps one connection busy with many streams

HTTP/2 changes framing, not basic web semantics. Methods, paths, headers, and status codes still exist. But the wire format becomes binary, not textual. Multiple streams share one TCP connection concurrently. So one connection can carry HTML, CSS, JS, and images together. Headers are compressed, reducing repeated metadata. In HTTP/2 this compression scheme is called HPACK. A stream diagram helps.

┌────────────┐     one TCP/TLS connection      ┌────────────┐
│  Browser   │════════════════════════════════▶│   Server   │
│  stream 1  │  HTML                           │            │
│  stream 3  │  CSS                            │            │
│  stream 5  │  JS                             │            │
│  stream 7  │  image.jpg                      │            │
└────────────┘◀════════════════════════════════└────────────┘

Multiplexing removes much connection setup overhead. But HTTP/2 still rides on TCP. If one TCP packet is lost, all later bytes wait. So transport-level head-of-line blocking remains below HTTP/2.

Server push tried to send useful resources before explicit requests. Example: send CSS immediately after the HTML response begins. That sounded smart, but many deployments mispredicted. Browsers often already had the asset cached. Extra pushed bytes sometimes wasted bandwidth. So server push matters historically, but it is rarely used in practice, and Chrome has since disabled it by default.

Priority signals were another important HTTP/2 idea. Browsers could ask for CSS before offscreen images. Good prioritization improved visual progress. Bad prioritization still wasted the shared connection.

Header compression helped authenticated apps especially. A 900-byte cookie header repeating twenty times is 18,000 bytes of redundant metadata. Compression shrinks that repeated metadata materially.
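
The binary framing and stream sharing described above can be sketched as follows. Each HTTP/2 frame starts with a 9-byte header (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream id). This is a simplified illustration, not a working client: it skips the connection preface, SETTINGS, HEADERS frames with HPACK, and flow control, and the payloads are placeholder bytes.

```python
import struct

FRAME_DATA = 0x0  # DATA frame type

def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Encode one frame: 9-byte binary header followed by the payload."""
    length = len(payload)
    if length >= 2**24 or not 0 <= stream_id < 2**31:
        raise ValueError("length or stream id out of range")
    header = struct.pack(">I", length)[1:]  # keep the low 3 bytes: 24-bit length
    header += struct.pack(">BBI", frame_type, flags, stream_id)  # reserved bit is 0
    return header + payload

def unpack_frames(buf: bytes):
    """Yield (frame_type, flags, stream_id, payload) for each frame in buf."""
    offset = 0
    while offset < len(buf):
        length = int.from_bytes(buf[offset:offset + 3], "big")
        frame_type, flags, stream_id = struct.unpack(">BBI", buf[offset + 3:offset + 9])
        stream_id &= 0x7FFFFFFF  # ignore the reserved bit
        yield frame_type, flags, stream_id, buf[offset + 9:offset + 9 + length]
        offset += 9 + length

# Frames from two streams interleaved on one connection (placeholder payloads).
wire = b"".join([
    pack_frame(FRAME_DATA, 0, 1, b"<html>"),
    pack_frame(FRAME_DATA, 0, 3, b"body{"),
    pack_frame(FRAME_DATA, 0, 1, b"</html>"),
    pack_frame(FRAME_DATA, 0, 3, b"}"),
])

# The receiver reassembles each stream by its id.
streams = {}
for _type, _flags, sid, payload in unpack_frames(wire):
    streams[sid] = streams.get(sid, b"") + payload

print(streams)  # {1: b'<html></html>', 3: b'body{}'}
```

Because every frame names its stream, the two responses can share one byte pipe without waiting for each other at the HTTP layer.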

3) HTTP/3 moves web traffic onto QUIC over UDP

HTTP/3 keeps HTTP semantics again. But it rides on QUIC, not TCP. QUIC runs over UDP and implements reliability in user space. This allows stream-level independence above the packet layer. If one stream loses data, others can continue more smoothly. Handshake behavior also improves. QUIC can combine transport and crypto setup tightly. Connection migration is another useful feature. A phone switching from Wi-Fi to mobile data may keep the session. That is harder with classic TCP connections. A comparison sketch helps.

┌────────────┐      UDP packets carrying QUIC       ┌────────────┐
│  Browser   │═════════════════════════════════════▶│   Server   │
│  stream 1  │  HTML                                │            │
│  stream 3  │  CSS                                 │            │
│  stream 5  │  API                                 │            │
└────────────┘◀═════════════════════════════════════└────────────┘

Because QUIC encrypts most transport metadata, middleboxes see less and interfere less. That can improve behavior on messy networks. But operational tooling may need updating. Packet captures and load balancers must understand new patterns.

QUIC also acknowledges packets more flexibly than classic TCP. Loss recovery happens with transport knowledge in user space. That lets implementations improve without kernel release cycles. Large platforms like that operational agility. Mobile apps like faster recovery after brief network blips.
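
The difference between one ordered byte stream and independent streams can be shown with a toy model. This is only a sketch under stated assumptions: one chunk per packet, a single lost packet whose retransmission arrives late, and made-up arrival times. It is not how a real TCP or QUIC stack is implemented.

```python
def delivery_times(packets, arrival, per_stream_ordering):
    """Return {stream_id: time in ms when its last chunk reaches the application}.

    packets: stream id of each packet, in send order (one chunk per packet).
    arrival: arrival time of each packet; a lost packet uses the arrival
             time of its retransmission.
    per_stream_ordering: False models one ordered byte stream (TCP-like),
             True models independently ordered streams (QUIC-like).
    """
    done = {}
    for i, stream in enumerate(packets):
        if per_stream_ordering:
            # Only earlier packets of the same stream can hold this chunk back.
            blockers = [arrival[j] for j in range(i + 1) if packets[j] == stream]
        else:
            # Every earlier packet on the connection can hold this chunk back.
            blockers = arrival[: i + 1]
        done[stream] = max(done.get(stream, 0), max(blockers))
    return done

# Three streams interleaved; the second packet (stream 3) is lost and its
# retransmission arrives at 110 ms. All numbers are hypothetical.
packets = [1, 3, 5, 1, 3, 5]
arrival = [10, 110, 12, 13, 14, 15]

tcp_like = delivery_times(packets, arrival, per_stream_ordering=False)
quic_like = delivery_times(packets, arrival, per_stream_ordering=True)

print(tcp_like)   # {1: 110, 3: 110, 5: 110}
print(quic_like)  # {1: 13, 3: 110, 5: 15}
```

In the TCP-like case every stream finishes only when the retransmission lands. In the QUIC-like case only stream 3, the one that actually lost data, waits.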

4) Compare the generations with concrete numbers

Assume one page needs one HTML file and five assets. Assume RTT is 40 ms and server work is 30 ms.

HTTP/1.1 simplified model: Six connections. Each cold connection setup plus TLS: about 80 ms. Parallelism helps somewhat, but repeated setup and blocking hurt. Page visibly completes around 320-380 ms in this toy example.

HTTP/2 model: One connection setup plus TLS: about 80 ms once. Six streams share it. Header compression saves bytes. Toy completion might drop near 220-260 ms.

HTTP/3 model: One QUIC plus TLS-style setup: often around 60 ms. Loss on one stream hurts less across siblings. Toy completion might land near 180-230 ms on mobile networks.

These are teaching numbers, not universal promises. Still, the trend is consistent. Newer versions reduce connection cost and blocking. They shine most when RTT is high or loss exists.

A worked storefront example makes the difference tangible. Home page needs HTML, app shell, CSS, fonts, and two APIs. On hotel Wi-Fi, one packet drops during CSS delivery. HTTP/1.1 may stall one connection and force others to wait for reuse. HTTP/2 keeps one connection, but TCP loss still freezes sibling streams. HTTP/3 can limit the pain more cleanly across streams. The user does not care about protocol beauty. The user notices whether the page feels sticky or smooth.
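
The arithmetic behind these toy numbers can be sketched with a deliberately crude floor estimate. The model is an assumption of this sketch, not the exact one behind the ranges above: the HTML is fetched first, the five assets are then fetched in one parallel wave, and transfer time, loss, and queuing are all ignored. The only inputs are the setup, RTT, and server-work figures already stated.

```python
from dataclasses import dataclass

@dataclass
class Protocol:
    name: str
    setup_ms: int               # cold connection setup, including crypto
    new_conns_for_assets: bool  # True if the asset wave needs extra cold connections

def floor_estimate(p: Protocol, rtt_ms: int = 40, server_ms: int = 30) -> int:
    """Lower bound in ms: fetch HTML, then fetch all assets in one parallel wave."""
    fetch = rtt_ms + server_ms                  # one request/response exchange
    html_done = p.setup_ms + fetch              # first connection is always cold
    extra_setup = p.setup_ms if p.new_conns_for_assets else 0
    return html_done + extra_setup + fetch      # asset wave starts after the HTML

protocols = [
    Protocol("HTTP/1.1", setup_ms=80, new_conns_for_assets=True),
    Protocol("HTTP/2", setup_ms=80, new_conns_for_assets=False),
    Protocol("HTTP/3", setup_ms=60, new_conns_for_assets=False),
]

estimates = {p.name: floor_estimate(p) for p in protocols}
print(estimates)  # {'HTTP/1.1': 300, 'HTTP/2': 220, 'HTTP/3': 200}
```

The floors sit at or below the ranges quoted above, which is expected because real pages also pay for bytes on the wire, loss, and contention. The ordering is the useful part: removing the second round of cold setups is the main saving for HTTP/2, and a cheaper first setup is the extra saving for HTTP/3.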

5) Choosing and explaining versions in real systems

HTTP/1.1 remains everywhere for compatibility and simple services. HTTP/2 is a strong default for browsers today. HTTP/3 is increasingly valuable on mobile and lossy links. Do not pitch versions like fashion labels. Explain them using connection count, framing, multiplexing, and loss behavior.

Also separate browser-to-edge from edge-to-origin paths. A company may use HTTP/3 at the edge, and still use HTTP/1.1 or HTTP/2 internally.

The sealed envelope story continues too. HTTP/1.1 and HTTP/2 usually run under HTTPS now, and HTTP/3 is always encrypted because QUIC builds in TLS 1.3. So envelope generation and encryption generation are related decisions. The post office still forwards packets blindly through every version.

Also remember adoption is asymmetric. Browsers may speak HTTP/3 to the CDN edge. The CDN may downgrade to HTTP/2 toward internal services. Legacy bots may still use HTTP/1.1 only. That mixed reality is completely normal.

System design answers should mention client mix and observability. If your logs, proxies, or firewalls cannot explain the traffic, a theoretical performance win may not survive production. Protocol rollout is a product decision, not only an infra decision. Measure before and after by geography and device class. That is how teams justify migration effort.
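
Measuring by geography and device class can be as simple as grouping page-load samples before comparing protocols. The sketch below is one possible shape for that comparison. The record fields (`protocol`, `device`, `region`, `load_ms`) and every value are hypothetical placeholders, not real measurements.

```python
from collections import defaultdict
from statistics import median

# Placeholder records for illustration only; real data would come from your
# own logs or real-user monitoring. Field names are assumptions.
samples = [
    {"protocol": "h2", "device": "mobile", "region": "IN", "load_ms": 410},
    {"protocol": "h2", "device": "mobile", "region": "IN", "load_ms": 390},
    {"protocol": "h2", "device": "mobile", "region": "IN", "load_ms": 430},
    {"protocol": "h3", "device": "mobile", "region": "IN", "load_ms": 350},
    {"protocol": "h3", "device": "mobile", "region": "IN", "load_ms": 330},
    {"protocol": "h2", "device": "desktop", "region": "DE", "load_ms": 210},
    {"protocol": "h3", "device": "desktop", "region": "DE", "load_ms": 205},
]

def median_by(records, *keys):
    """Group records by the given keys and return the median load time per group."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[k] for k in keys)].append(record["load_ms"])
    return {group: median(values) for group, values in groups.items()}

report = median_by(samples, "region", "device", "protocol")
for group, value in sorted(report.items()):
    print(group, value)
```

Splitting by segment matters because a protocol change may help one group, such as mobile users on lossy links, while barely moving another.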


Where this lives in the wild

  1. YouTube web performance engineer prefers HTTP/2 or HTTP/3 for many assets. Multiplexing reduces delays on media-heavy pages.
  2. Cloudflare edge engineer enables HTTP/3 for mobile users on lossy networks. QUIC often recovers more gracefully than TCP-based delivery.
  3. GitHub frontend engineer benefits from HTTP/2 header compression and fewer connections. Repeated authenticated headers become cheaper on busy dashboards.
  4. Uber mobile platform engineer values QUIC connection migration during network changes. Riders move between Wi-Fi and cellular constantly.
  5. Shopify storefront engineer still keeps HTTP/1.1 compatibility for diverse clients. Real systems rarely switch every consumer at once.

Pause and recall

  1. Why does HTTP/2 still suffer when the underlying TCP connection loses packets?
  2. What problem does multiplexing solve compared with HTTP/1.1?
  3. Why can HTTP/3 behave better on changing mobile networks?
  4. Why did server push not become the universal win people expected?

Interview Q&A

Q1. What is the biggest practical difference between HTTP/1.1 and HTTP/2?
HTTP/2 multiplexes many streams over one connection with binary framing. HTTP/1.1 commonly needed several connections and repeated overhead.
Common wrong answer to avoid: "HTTP/2 is just HTTP/1.1 with gzip enabled."

Q2. Why does HTTP/3 use UDP underneath?
QUIC wants transport control in user space with independent streams. That reduces some TCP-level blocking and improves mobility features.
Common wrong answer to avoid: "UDP is always faster because it has zero loss."

Q3. Does HTTP/2 remove all head-of-line blocking?
It removes much application-layer blocking between requests. But TCP packet loss can still stall all streams underneath.
Common wrong answer to avoid: "Multiplexing means packet loss never matters again."

Q4. When might HTTP/1.1 still be acceptable?
Simple APIs, older clients, and compatibility-heavy environments may keep it. Performance needs and client mix decide the answer.
Common wrong answer to avoid: "HTTP/1.1 is obsolete and should never be used."


Apply now (5 min)

Pick one page you use often. Estimate how many separate resources it fetches initially. Now compare startup cost under six HTTP/1.1 connections versus one HTTP/2 connection. Then imagine a mobile link with 2 percent packet loss. Which version likely feels steadier?

Sketch from memory

Draw HTTP/1.1 as several separate pipes. Draw HTTP/2 as many numbered streams inside one pipe. Draw HTTP/3 as QUIC streams over UDP. Write one performance win beside each newer version.


Bridge. We now know how requests travel and which wire format carries them. Next we compare API styles that sit above HTTP: REST, gRPC, and GraphQL. → 06-rest-grpc-graphql-protocols.md