Long Polling

Long polling — the smarter naive fix

Long polling eliminates the empty responses problem by making the server wait before responding. No message? Don't respond yet — hold the connection open until something arrives. It sounds clever. The numbers show where it falls apart.

How it works¶

Instead of the server immediately responding with "nothing", it holds the connection open and waits. The moment a message arrives, it responds. The client immediately opens a new connection and waits again.

t=0s:   Client → "GET /messages" → Server ... (server holds, waiting)
t=0s:   Server is holding 20M connections open, all waiting
t=4s:   Alice sends Bob a message
t=4s:   Server → "1 new message" → Bob's client  (connection closes)
t=4s:   Bob's client immediately opens a new connection → Server ... (waiting again)

No more empty responses. The server only responds when there's something to say. Latency drops — the message arrives the moment it's available, not on the next poll cycle.

The math — connections¶

Assumptions (80/20 rule applied twice):

MAU                   → 500M
DAU                   → 20% of MAU  = 100M
Concurrent online     → 20% of DAU  = 20M

Every online user holds one open connection waiting for messages:

Persistent connections held → 20M
Capacity per server (async) → 100k concurrent connections
Connection servers needed   → 20M / 100k = 200 servers

200 servers just to hold open connections doing nothing. Expensive, but manageable — this is the same cost you'd pay with WebSockets. So far long polling isn't worse.

Where it breaks — the reconnection tax¶

Here's the real problem. Every time a message arrives, the connection closes and the client must reconnect. That reconnection pays the full handshake cost:

TCP handshake  → ~30ms
TLS handshake  → ~60ms
HTTP request   → ~10ms
Total          → ~100ms per reconnection

Write QPS is 10k messages/sec. Each message delivery closes one connection and triggers one reconnection:

Reconnections/sec       → 10k/sec
Cost per reconnection   → ~100ms of latency added

Your 200ms end-to-end SLO now looks like this:

Network transit (send)     → ~10ms
Server processing          → ~10ms
Reconnection overhead      → ~100ms
Network transit (receive)  → ~10ms
Total                      → ~130ms

You're burning 100ms — half your entire latency budget — on reconnection overhead that produces zero value. And this is the average case. Under load, TLS handshakes slow down. At peak 20k messages/sec, you have 20k reconnections happening simultaneously, all competing for server resources.

The send path problem¶

Long polling only solves receiving. To send a message, the user still fires a separate HTTP POST — a brand new connection with its own handshake:

User sends a message:
  → New TCP connection    → ~30ms
  → New TLS handshake     → ~60ms
  → HTTP POST             → ~10ms
  Total                   → ~100ms just to initiate the send

So every user maintains two connection paths:

Receive path → 1 persistent long-poll connection (waiting)
Send path    → new HTTP connection per message sent (~100ms overhead each)

Two separate code paths. Two connection pools. And still paying 100ms per send.

Verdict¶

Long polling is rejected. It fixes the empty response problem of short polling, but introduces a reconnection tax of ~100ms on every message delivery. Combined with a separate send path that costs another ~100ms per message, you've burned your entire latency budget before the message even reaches its destination.

When long polling is acceptable

Long polling is fine for low-frequency updates — think GitHub showing "this PR was updated" or a dashboard refreshing every 30 seconds. When messages are rare and latency doesn't matter much, the reconnection cost is paid infrequently. It's only a disaster when messages are frequent and latency is tight.