UUID Snowflake Base62

Building on the previous

Snowflake + base64 gives 11 characters and has URL-safety issues. The fix for URL safety is base62 — drop the two problematic characters. But does switching to base62 get us closer to 6-7 characters?


What is base62?

Base62 is base64 with + and / removed:

a-z  → 26 characters
A-Z  → 26 characters
0-9  → 10 characters
-----------
total → 62 characters   (no + or /)

Every character in base62 is safe to use directly in a URL, with no percent-encoding needed. This is why base62 is such a common choice for URL shorteners.
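To make the alphabet concrete, here is a minimal sketch of base62 encoding and decoding for non-negative integers. The function names and the character ordering (a-z, then A-Z, then 0-9, mirroring the list above) are assumptions for illustration; any fixed ordering of the 62 characters works as long as encoder and decoder agree.

```python
import string

# Assumed ordering: a-z, A-Z, 0-9 (matches the listing above, not a standard).
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits
BASE = len(ALPHABET)  # 62


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def base62_decode(s: str) -> int:
    """Decode a base62 string back to the integer it represents."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

With this ordering, `base62_encode(62)` gives `"ba"`, and the largest 64-bit value, `2**64 - 1`, encodes to an 11-character string.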


The math — does it get shorter?

Step 1 — how many bits per base62 character?

Base62 has 62 possible values per character. 62 doesn't land exactly on a power of 2, so we use logarithms:

We need: 2^x = 62
x = log2(62) = log10(62) / log10(2) = 1.792 / 0.301 ≈ 5.95 bits per character

Compare to base64:

base64 → 2^6 = 64  → exactly 6 bits per character
base62 → 2^5.95 ≈ 62 → ≈ 5.95 bits per character  (slightly less)

The difference is tiny — base62 packs just barely less information per character than base64.
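The logarithm in Step 1 can be checked in a couple of lines. This is only a sanity check of the arithmetic; the variable names are illustrative.

```python
import math

# Bits of information carried by one character of each alphabet.
bits_base62 = math.log2(62)
bits_base64 = math.log2(64)

print(f"base62: {bits_base62:.3f} bits per character")  # 5.954
print(f"base64: {bits_base64:.3f} bits per character")  # 6.000
```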

Step 2 — how many characters to encode a Snowflake ID (64 bits)?

Total bits         = 64
Bits per character = 5.95  (base62)
Characters needed  = 64 / 5.95 ≈ 10.76 → round up to 11 characters

Step 3 — how many characters to encode a UUID (128 bits)?

Total bits         = 128
Bits per character = 5.95  (base62)
Characters needed  = 128 / 5.95 = 21.5 → round up to 22 characters

Switching from base64 to base62 does not meaningfully change the length. Both give 11 characters for a Snowflake ID and 22 characters for a UUID.

Snowflake + base64  → 11 chars (URL-unsafe)
Snowflake + base62  → 11 chars (URL-safe) ✓
UUID + base64       → 22 chars (URL-unsafe)
UUID + base62       → 22 chars (URL-safe)

Base62 solves the URL-safety problem. The length stays the same.
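The lengths in Steps 2 and 3 can also be confirmed with exact integer arithmetic, which avoids rounding in the bits-per-character figure. The helper name `chars_needed` is hypothetical; it simply finds the smallest length whose capacity covers every value of the given bit width.

```python
def chars_needed(bits: int, base: int) -> int:
    """Smallest k such that base**k >= 2**bits."""
    k = 0
    capacity = 1
    while capacity < 2 ** bits:
        capacity *= base
        k += 1
    return k


for label, bits in (("Snowflake", 64), ("UUID", 128)):
    for base in (64, 62):
        print(f"{label} + base{base}: {chars_needed(bits, base)} chars")
```

This prints 11 characters for both Snowflake rows and 22 for both UUID rows, matching the table above.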


Where we stand

We have a uniqueness-guaranteed, URL-safe short code — but it's 11 characters instead of the 6-7 we want.

The reason: Snowflake IDs are 64 bits, but we only need 36 bits to cover our entire 10-year URL space.

2^36 ≈ 68.7 billion  >  50 billion (our 10-year estimate)  ✓
36 bits / 5.95 bits per char ≈ 6.05 → about 6 characters  (62^6 ≈ 56.8 billion, still > 50 billion)  ✓
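A quick check of these capacities, as a sketch (the constant name `TARGET_URLS` is illustrative). One detail worth noticing: 62^6 is slightly smaller than 2^36, so 6 characters cover the 50 billion estimate, but an arbitrary 36-bit value can spill into a 7th character.

```python
TARGET_URLS = 50_000_000_000  # 10-year estimate from the text

capacity_36_bits = 2 ** 36   # 68,719,476,736
capacity_6_chars = 62 ** 6   # 56,800,235,584
capacity_7_chars = 62 ** 7   # 3,521,614,606,208

print(capacity_36_bits > TARGET_URLS)       # True: 36 bits cover the estimate
print(capacity_6_chars > TARGET_URLS)       # True: 6 chars cover the estimate
print(capacity_6_chars < capacity_36_bits)  # True: 6 chars fall short of a full 36 bits
```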

If we could use just 36 bits, we'd land at 6-7 characters in base62, right in our target range. The natural next thought: take a Snowflake ID and use only the lower 36 bits. Trim the rest.

But there's a problem with that.


Base62 is the right encoding — but we still have a length problem

Base62 solves URL safety. For our 10-year scale, 36 bits (about 6 base62 chars) is theoretically enough. But we can't just take the lower 36 bits of a Snowflake ID, and the reason why is the most important concept in this entire section.


Next: Why can't we truncate the Snowflake ID to 36 bits and call it done?