DB Shard Primary Down
A DB shard primary goes down
With 8 shards, each holding a primary and 2 secondaries, losing one primary is not a full system failure. But the impact is precise and worth understanding: reads, writes, and read-your-own-writes (RYOW) are each affected differently.
What happens immediately
Shard 3 primary dies. The system detects it via health checks. Leader election begins — one of the two secondaries gets promoted to primary.
This process takes 30-60 seconds in a typical Postgres setup with Patroni or a similar HA manager. During that window, shard 3 has no writable node.
Reads: mostly unaffected
This is the counterintuitive part. Most people assume losing a primary kills reads too. It doesn't.
Shard 3 had 1 primary + 2 secondaries. The primary died. The 2 secondaries are still running.
Redirect request → Redis cache → HIT → return 301 ✓ (never touched DB)
Redirect request → Redis cache → MISS → shard 3 needed
→ Primary dead, but secondaries alive
→ Route to secondary → served ✓
The vast majority of redirects never hit the DB at all — Redis absorbs 80%+. For the remainder that miss cache and need shard 3, the secondaries are still available and serving reads. No impact on redirects.
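The read path above can be sketched in a few lines. This is a minimal illustration, not the system's actual code: the `Shard` class, `resolve_redirect`, and the in-memory `cache` dict (standing in for Redis) are hypothetical names, and replication lag between replicas is ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class NoReadableReplica(Exception):
    """Raised when a shard has no live secondary to read from."""


@dataclass
class Shard:
    # Hypothetical model of one shard: liveness flags plus the stored rows.
    primary_up: bool = True
    secondaries_up: List[bool] = field(default_factory=lambda: [True, True])
    rows: Dict[str, str] = field(default_factory=dict)  # short code -> long URL


def resolve_redirect(code: str, cache: Dict[str, str], shard: Shard) -> Optional[str]:
    # 1. Cache hit: the database is never touched.
    if code in cache:
        return cache[code]
    # 2. Cache miss: any live secondary can serve the read,
    #    whether or not the primary is up.
    if not any(shard.secondaries_up):
        raise NoReadableReplica("no live secondary on this shard")
    url = shard.rows.get(code)
    if url is not None:
        cache[code] = url  # populate the cache for later redirects
    return url


# Shard 3 has lost its primary, but both secondaries are still up.
shard3 = Shard(primary_up=False, rows={"abc123": "https://example.com/long"})
cache: Dict[str, str] = {}
first = resolve_redirect("abc123", cache, shard3)   # miss, served by a secondary
second = resolve_redirect("abc123", cache, shard3)  # hit, served by the cache
```

Note that `primary_up` is never consulted on this path, which is why redirects survive the failover window.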
Writes: fail for 30-60 seconds
Creation requests use consistent hashing to route to a specific shard. ~1/8 of all creations route to shard 3.
Creation request → consistent hashing → shard 3
→ Must write to primary (writes always go to primary)
→ Primary is dead
→ New primary not yet elected (30-60s window)
→ Write fails → user gets 500 error
During the 30-60 second failover window, ~1/8 of all URL creation requests fail. The other 7 shards are unaffected because their primaries are healthy.
Once the new primary is elected, writes to shard 3 resume normally.
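A rough sketch of the write path, assuming a simple hash ring with virtual nodes. The hash function (MD5), the vnode count, and names such as `shard_for`, `create_short_url`, and `PrimaryUnavailable` are illustrative choices, not details from the system described here.

```python
import bisect
import hashlib
from typing import Dict

NUM_SHARDS = 8
VNODES = 64  # virtual nodes per shard; an assumed value for illustration


def _hash(key: str) -> int:
    # First 8 bytes of MD5 as an integer position on the ring.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


# The ring: (position, shard_id) pairs sorted by position.
RING = sorted(
    (_hash(f"shard-{s}-vnode-{v}"), s)
    for s in range(NUM_SHARDS)
    for v in range(VNODES)
)
RING_POSITIONS = [pos for pos, _ in RING]


def shard_for(code: str) -> int:
    # Walk clockwise to the first vnode past the key, wrapping at the end.
    i = bisect.bisect_right(RING_POSITIONS, _hash(code)) % len(RING)
    return RING[i][1]


class PrimaryUnavailable(Exception):
    """The owning shard has no primary, so the write cannot be accepted."""


def create_short_url(code: str, long_url: str,
                     primary_up: Dict[int, bool],
                     storage: Dict[int, Dict[str, str]]) -> int:
    shard = shard_for(code)
    # Writes always go to the primary; there is no fallback to a secondary.
    if not primary_up[shard]:
        raise PrimaryUnavailable(f"shard {shard} has no primary")
    storage[shard][code] = long_url
    return shard
```

With `primary_up[3] = False`, only codes that hash to shard 3 raise `PrimaryUnavailable`; every other creation succeeds. How the app server surfaces that exception (the 500 mentioned above, or a retry) is a separate design choice.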
RYOW: fails for a narrow window of users
Read-Your-Own-Writes uses a ryow_until cookie. When the cookie is valid, the app server routes the read to the shard primary — not a secondary — to guarantee the user sees their freshly created URL.
This becomes a problem when the primary dies:
T=0s User creates short URL → written to shard 3 primary
App server sets cookie: ryow_until = now + 30s
T=10s Shard 3 primary crashes
T=15s User clicks their new short URL
App server reads ryow_until cookie → still valid
Consistent hashing → shard 3 → go to primary
Primary is dead → request fails → 500 error
The user created their URL, got the short code, and now it doesn't work. From their perspective the system is broken.
But the impact is narrow:
Users affected = created a URL on shard 3
within the 30-second ryow_until window
before the crash
After 30 seconds, the cookie expires. Those same requests route to secondaries and work fine. Only the unlucky users in that specific 30-second window on shard 3 experience the failure.
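The routing decision behind this timeline is small enough to sketch. The function names and the string return values are hypothetical; only the `ryow_until` cookie and its 30-second window come from the text above.

```python
from typing import Optional

RYOW_WINDOW_SECONDS = 30


def ryow_cookie(created_at: float) -> float:
    # Value stored in the ryow_until cookie when a URL is created.
    return created_at + RYOW_WINDOW_SECONDS


def choose_read_target(ryow_until: Optional[float], now: float,
                       primary_up: bool) -> str:
    # A valid cookie pins the read to the primary.
    if ryow_until is not None and now < ryow_until:
        # No fallback in this design: a dead primary means a failed read.
        return "primary" if primary_up else "error"
    # No cookie, or cookie expired: secondaries serve the read.
    return "secondary"


cookie = ryow_cookie(created_at=0)                               # T=0s: create
at_5s = choose_read_target(cookie, now=5, primary_up=True)       # before the crash
at_15s = choose_read_target(cookie, now=15, primary_up=False)    # primary died at T=10s
at_31s = choose_read_target(cookie, now=31, primary_up=False)    # cookie has expired
```

Here `at_5s` is `"primary"`, `at_15s` is `"error"`, and `at_31s` is `"secondary"`, matching the timeline: the same user's request starts working again once the cookie lapses, even though the primary is still down.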
Complete impact summary
Redirects (Redis hit) → unaffected ✓
Redirects (cache miss) → secondaries serve them ✓
New creations → ~1/8 fail during 30-60s failover window
RYOW reads → fail for users who created in last 30s on shard 3
1/8 of the system is partially degraded for 30-60 seconds. 7/8 of the system is completely unaffected. This is the value of sharding: failures are contained to a single shard.
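To put numbers on the summary, the arithmetic can be written out. The creation rate of 100 per second below is a made-up figure for illustration, and the calculation assumes creations spread evenly across the 8 shards.

```python
def failed_creations(creations_per_s: float, failover_s: float,
                     num_shards: int = 8) -> float:
    # Creations that hash to the shard without a primary during failover.
    return creations_per_s * failover_s / num_shards


def ryow_exposed_users(creations_per_s: float, ryow_window_s: float = 30,
                       num_shards: int = 8) -> float:
    # Upper bound: users holding a valid ryow_until cookie for the failed
    # shard at the moment of the crash. Only those who actually read
    # within the window see an error.
    return creations_per_s * ryow_window_s / num_shards


RATE = 100  # hypothetical creations per second across all shards

worst_case = failed_creations(RATE, failover_s=60)  # 750.0
best_case = failed_creations(RATE, failover_s=30)   # 375.0
exposed = ryow_exposed_users(RATE)                  # 375.0
```

At that assumed rate, between 375 and 750 creations fail depending on where the failover lands in the 30-60 second range, and at most 375 users hold a cookie that points at the dead primary.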
After failover completes
T=60s New primary elected on shard 3
T=60s etcd updated: shard-3/primary → new IP
T=60s App servers read new topology from etcd
T=60s Writes to shard 3 resume
T=60s RYOW routes to new primary correctly
T=60s Full system restored
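One way app servers could pick up the new primary is to cache the topology and refresh it on a short TTL. The sketch below assumes polling (a watch on the key would also work) and uses a plain dict as a stand-in for etcd rather than a real client; the `TopologyCache` class and the IP addresses are hypothetical. Only the `shard-3/primary` key shape comes from the timeline above.

```python
from typing import Dict, Optional


class TopologyCache:
    """Caches shard -> primary address, refreshing after ttl_s seconds."""

    def __init__(self, store: Dict[str, str], ttl_s: float = 1.0) -> None:
        self._store = store            # stand-in for the etcd keyspace
        self._ttl_s = ttl_s
        self._snapshot: Dict[str, str] = {}
        self._fetched_at: Optional[float] = None

    def primary_for(self, shard: int, now: float) -> str:
        stale = self._fetched_at is None or now - self._fetched_at >= self._ttl_s
        if stale:
            # Re-read the whole topology in one go.
            self._snapshot = dict(self._store)
            self._fetched_at = now
        return self._snapshot[f"shard-{shard}/primary"]


store = {"shard-3/primary": "10.0.0.13"}
topo = TopologyCache(store, ttl_s=1.0)

before = topo.primary_for(3, now=59.5)       # old primary address
store["shard-3/primary"] = "10.0.0.23"       # HA manager records the new primary
still_cached = topo.primary_for(3, now=60.0) # TTL not yet elapsed
after = topo.primary_for(3, now=60.6)        # refreshed, new address
```

With polling, the TTL adds a small delay on top of the failover window before writes and RYOW reads reach the new primary, which is one reason to keep it short or to use a watch instead.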
The only lasting concern: the new primary was a secondary that may have been slightly behind on replication. With asynchronous replication, any writes the old primary accepted but had not yet replicated to the promoted secondary when it crashed are lost; this is the replication lag window. With synchronous replication no acknowledged write is lost, and with small async lag the loss is a very small number of writes, potentially zero.
Common misconception
Losing a primary does not kill reads for that shard. Secondaries are still alive and still serve read traffic. The real damage is to writes (no primary to write to) and RYOW (cookie forces read to primary which is dead).
Interview framing
"Shard primary dies → 30-60s failover window. Reads unaffected — secondaries still serve cache misses. Writes fail for ~1/8 of creations during the window — those hashing to that shard. RYOW fails for users who created a URL on that shard within the last 30 seconds — their cookie says go to primary, primary is dead. After failover, etcd updates topology, app servers pick up new primary IP, full recovery."