
Estimation

The goal of estimation

Estimation is not about getting exact numbers. It is about understanding the scale of the system so every design decision that follows is justified. A single machine or a thousand? Cache or no cache? One DB or sharded? Estimation answers all of that.


Assumptions — always state these out loud

Before touching a single number, state your assumptions explicitly. The interviewer needs to follow your reasoning.

MAU          → 100M users
DAU          → 30% of MAU are daily active = 30M DAU
URL creators → for simplicity, assume every DAU creates URLs = 30M creators/day
URLs/user    → 3 URLs per creator per day

Write QPS

URLs created per day = 30M creators × 3 URLs = 90M/day
Seconds in a day     = 86,400 ≈ 10^5 (round up for easier math)

Write QPS = 90M / 100,000 = 900 writes/second ≈ 1k writes/second
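The write-QPS arithmetic above is easy to sanity-check in a few lines of Python (the figures are the assumptions from this section, not measurements):

```python
# Write QPS = URLs created per day / seconds per day.
urls_per_day = 90_000_000   # 30M creators x 3 URLs each
seconds_per_day = 100_000   # 86,400 rounded up for easier math

write_qps = urls_per_day / seconds_per_day
print(write_qps)  # 900.0 -> round to ~1k writes/second
```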

Read QPS

URL shorteners are extremely read-heavy. A single viral link can be clicked millions of times. A ratio of 100x reads to writes is a reasonable assumption.

Read QPS = 1k × 100 = 100k reads/second (average)

Average vs peak

100k/sec is the average. URL shorteners have massive traffic spikes — a celebrity tweets a link and 10x traffic hits in seconds. Peak QPS can be 1M+/sec. This is why caching becomes critical — you cannot hit the database on every redirect at peak load.
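The read-side numbers follow mechanically from the two assumed multipliers (100:1 read/write ratio, 10x spike). A minimal sketch:

```python
# Average and peak read QPS, derived from assumed multipliers.
write_qps = 1_000          # rounded up from 900
read_write_ratio = 100     # assumption: 100 reads per write
spike_multiplier = 10      # assumption: viral traffic is ~10x average

avg_read_qps = write_qps * read_write_ratio      # 100,000 reads/sec
peak_read_qps = avg_read_qps * spike_multiplier  # 1,000,000 reads/sec
print(avg_read_qps, peak_read_qps)  # 100000 1000000
```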


Storage

Each URL entry stores:

Short URL code  →  ~50 bytes
Long URL        →  ~250 bytes  (average URL length)
ID + metadata   →  ~200 bytes  (timestamps, user info, expiry)

Total per entry →  ~500 bytes

Writes per day  = 90M entries/day
Writes per year = 90M × 365 ≈ 30B entries/year
With growth buffer → ~50B entries/year (headroom over the ~30B actual)

Storage per year = 50B × 500 bytes = 25,000 GB = 25TB/year
Storage for 10 years = 250TB
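The storage arithmetic is worth verifying, since (as the mistake below notes) unit slips are common here. A quick check using the buffered entry count:

```python
# Storage = entries per year (with growth buffer) x bytes per entry.
entries_per_year = 50 * 10**9   # ~30B actual writes, buffered to 50B
bytes_per_entry = 500           # short code + long URL + metadata

tb = 10**12                     # 1 TB in bytes
storage_per_year_tb = entries_per_year * bytes_per_entry / tb  # 25.0
storage_10_years_tb = 10 * storage_per_year_tb                 # 250.0
print(storage_per_year_tb, storage_10_years_tb)  # 25.0 250.0
```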

Common mistake

Do not confuse the number of records with the storage size. 50 billion records × 500 bytes = 25TB — not 50GB. Always multiply record count by record size.

250TB over 10 years cannot fit on a single machine. This tells you upfront that the database will need to be sharded. You don't design sharding now — but you flag it so the interviewer knows you see it coming.


Bandwidth

On every redirect, the system sends the long URL back over the network.

Read QPS        = 100k requests/second
Payload per req = ~300 bytes (long URL + headers)

Bandwidth = 100k × 300 bytes = 30 MB/s = 240 Mbps

240 Mbps is well within the range of modern infrastructure. Bandwidth is not a bottleneck for this system.
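The bytes-to-Mbps conversion above (the usual place these estimates go wrong) can be checked as:

```python
# Bandwidth = read QPS x payload size, converted bytes/sec -> Mbps.
read_qps = 100_000
payload_bytes = 300   # assumption: long URL + headers

bytes_per_sec = read_qps * payload_bytes  # 30,000,000 B/s = 30 MB/s
mbps = bytes_per_sec * 8 / 10**6          # 8 bits per byte
print(mbps)  # 240.0
```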


Summary

Metric               Value
------               -----
Write QPS            ~1k/sec
Read QPS             ~100k/sec (avg), ~1M/sec (peak)
Storage              ~25TB/year
Storage (10 years)   ~250TB
Bandwidth            ~240 Mbps

Key implications:

- Read-heavy → caching is essential
- 250TB over 10 years → DB sharding required
- Viral spikes → design must handle 10x peak load