UUID Base64 Trim
Building on the previous approach: random generation degrades as the DB fills up — too many collision retries at high fill rates. The root cause is that randomness has no coordination — two servers can independently generate the same code. What if we used a UUID, a standard designed to provide global uniqueness without any coordination?
What is a UUID?¶
UUID (Universally Unique Identifier) is a 128-bit number generated, depending on the version, from a combination of timestamp, machine identifier, and randomness. The standard is designed so that any machine anywhere can generate a UUID and the chance of it colliding with a UUID from any other machine is so small it is treated as zero in practice.
Example UUID → 550e8400-e29b-41d4-a716-446655440000
No DB check needed. No coordination between servers. Just generate and use.
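Generating one is a single call; a minimal sketch using Python's standard `uuid` module (version 4, the purely random variant):

```python
import uuid

# Generate a random (version 4) UUID -- no DB lookup, no coordination.
code_id = uuid.uuid4()

print(code_id)                  # e.g. 550e8400-e29b-41d4-a716-446655440000
print(code_id.version)          # 4
print(len(code_id.bytes) * 8)   # 128 bits
```

Every call returns a fresh 128-bit value; two servers running this independently will, for all practical purposes, never produce the same one.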
The length problem — first, why not base16?¶
A UUID is 128 bits. The simplest encoding is base16 (hex) — the same format UUIDs are usually displayed in.
Step 1 — how many bits per character in base16?
Base16 has 16 possible characters per position (0-9, a-f). To represent 16 different values, you need exactly enough bits so that 2^bits = 16:
2^1 = 2 → not enough
2^2 = 4 → not enough
2^3 = 8 → not enough
2^4 = 16 ✓
So 1 base16 character = 4 bits
Step 2 — how many characters to encode a 128-bit UUID?
UUID size = 128 bits
Bits per character = 4 (base16)
Characters needed = 128 / 4 = 32 characters
That's the raw hex UUID you already know — 550e8400-e29b-41d4-a716-446655440000 — 32 characters plus dashes. Completely unusable as a short code.
The rule: higher base = more bits packed per character = fewer characters needed. Base16 is too low. We need to go higher.
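The arithmetic above generalizes to any power-of-two base; a quick sketch that derives both the bits-per-character and the resulting code length:

```python
import math

UUID_BITS = 128

for base in (16, 64):
    bits_per_char = int(math.log2(base))           # 2^bits = base
    chars_needed = math.ceil(UUID_BITS / bits_per_char)
    print(f"base{base}: {bits_per_char} bits/char -> {chars_needed} chars")
# base16: 4 bits/char -> 32 chars
# base64: 6 bits/char -> 22 chars
```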
Moving to base64¶
Step 1 — how many bits per character in base64?
Base64 has 64 possible characters per position. Same logic:
2^1 = 2 → not enough
2^2 = 4 → not enough
2^3 = 8 → not enough
2^4 = 16 → not enough
2^5 = 32 → not enough
2^6 = 64 ✓
So 1 base64 character = 6 bits
Step 2 — how many characters to encode a 128-bit UUID?
UUID size = 128 bits
Bits per character = 6 (base64)
Characters needed = 128 / 6 ≈ 21.3 → round up to 22 characters
So the full base64 encoding of a UUID is 22 characters long. That is not a short URL — that's longer than most actual paths.
bit.ly/550e8400e29b41d4a716446655440000 ← 32 chars hex, horrible UX
bit.ly/VQ6EAOKbQdSnFkRmVUQAAA ← 22 chars base64, still too long
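Both lengths are easy to verify; a small sketch using Python's `base64` module (URL-safe alphabet, with the trailing `==` padding stripped since a short code must stay URL-clean):

```python
import base64
import uuid

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

hex_code = u.hex                                                     # base16
b64_code = base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode()   # base64

print(hex_code, len(hex_code))   # 550e8400e29b41d4a716446655440000 32
print(b64_code, len(b64_code))   # VQ6EAOKbQdSnFkRmVUQAAA 22
```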
The temptation — trim to 6 characters¶
You already have the uniqueness from the UUID. Can you just take the first 6 characters of the base64 encoding?
UUID base64 → VQ6EAOKbQdSnFkRmVUQAAA
Trimmed → VQ6EAO
Short URL → bit.ly/VQ6EAO
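The tempting shortcut, sketched as a deliberately flawed helper (the `short_code` name is ours, not a standard API):

```python
import base64
import uuid

def short_code(u: uuid.UUID, length: int = 6) -> str:
    """Naive approach: base64-encode the UUID, keep only the first `length` chars."""
    full = base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode()
    return full[:length]

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")
print(short_code(u))   # VQ6EAO
```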
No. This is exactly the same mistake as trimming a hash.
The UUID's uniqueness comes from all 128 bits working together. The first 6 characters of the base64 encoding represent only 36 bits (6 chars × 6 bits). Two different UUIDs can easily share the same first 36 bits:
UUID 1 → VQ6EAOKbQdSnFkRmVUQAAA → trimmed → VQ6EAO
UUID 2 → VQ6EAO1x9pTrWzHqMnLBxw → trimmed → VQ6EAO ← collision ✗
You have thrown away the bits that made them different. Trimming a UUID breaks uniqueness just as surely as trimming a hash.
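How quickly do 36 bits run out? The standard birthday approximation puts a number on it:

```python
import math

def collision_probability(n: int, bits: int) -> float:
    """Birthday approximation: P(collision) ~= 1 - exp(-n^2 / (2 * 2^bits))."""
    return 1 - math.exp(-n * n / 2 ** (bits + 1))

# 6 base64 chars keep only 36 of the UUID's 128 bits.
for n in (1_000, 100_000, 1_000_000):
    print(f"{n:>9,} codes -> {collision_probability(n, 36):.2%} chance of a clash")
```

Around 100,000 codes the clash probability is already about 7%, and at a million codes a collision is near-certain — the same birthday-problem curve that sank the hash-and-trim approach.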
The fundamental tension¶
UUID gives you uniqueness across distributed systems — but at 22 characters, it's too long. Trimming gets you to 6 characters — but destroys the uniqueness guarantee.
Full UUID encoding → unique ✓ / too long ✗
Trimmed UUID → short ✓ / not unique ✗
You cannot have both by trimming. You need a different kind of ID — one that is both short enough and uniquely structured.
Why this fails
Trimming a UUID discards bits and breaks the uniqueness guarantee. The same collision problem from approach 2 (hashing + trim) reappears here. Trimming any large unique identifier down to a short code always causes this — you cannot trim your way to uniqueness.
Next: UUID is 128 bits. That's more than we need. From the estimation, 36 bits is enough to cover 50 billion URLs. What if we used a 64-bit ID instead — one designed specifically for distributed systems? That's Snowflake.