Choosing v4 UUIDs vs ULIDs vs nanoid
Three identifier formats, three different bets on what matters. Pick the wrong one and the cost shows up months later as either ugly URLs or a slow database.
There are three identifier formats that show up in modern code: UUIDs (specifically v4), ULIDs, and nanoids. They look similar enough that most teams pick one based on whichever library was already imported. They are meaningfully different, and using the wrong one usually shows up later as either a bizarre database performance regression or a URL that is much uglier than it needed to be.
v4 UUID: the unconditional default
A UUID v4 is a 128-bit identifier where 122 of the bits are random and 6 are reserved to encode the version (4) and variant. It is rendered as 36 hexadecimal characters with four dashes in fixed positions:
f47ac10b-58cc-4372-a567-0e02b2c3d479
^^^^^^^^ ^^^^ ^^^^ ^^^^ ^^^^^^^^^^^^
   8      4    4    4        12
              |    |
              v    +-- variant bits (10xx in first nibble)
        version (always 4 for v4)
The collision math is reassuring. With 122 random bits, you would need to generate roughly 2.71 × 10^18 UUIDs before the probability of a single collision crossed 50% (the birthday-paradox threshold). Even at a billion UUIDs per second, that is about 86 years before the math gets nervous. For practical purposes, a v4 UUID is unique enough.
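The collision figures above can be reproduced with the standard birthday-bound approximation. This is a minimal sketch; the function name idsForCollisionProbability is made up for illustration, and the formula is an approximation, not an exact count.

```typescript
// Birthday-bound approximation: n ≈ sqrt(2 * N * ln(1 / (1 - p))), where N = 2^randomBits.
// Returns roughly how many random IDs must be generated before the chance of
// at least one collision reaches p.
function idsForCollisionProbability(randomBits: number, p: number = 0.5): number {
  const space = 2 ** randomBits;
  return Math.sqrt(2 * space * Math.log(1 / (1 - p)));
}

const threshold = idsForCollisionProbability(122); // UUID v4 has 122 random bits
const SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60;
const yearsAtOneBillionPerSecond = threshold / 1e9 / SECONDS_PER_YEAR;

console.log(threshold.toExponential(2)); // "2.71e+18"
console.log(yearsAtOneBillionPerSecond.toFixed(0)); // "86"
```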
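Strengths: universal support, opaque to anyone looking at it (no information leakage about creation time or sequence), trivially generated client-side via crypto.randomUUID(), and accepted as a native column type in PostgreSQL (uuid) and SQL Server (uniqueidentifier). MySQL 8 has no dedicated UUID type, but it ships UUID_TO_BIN() and BIN_TO_UUID() for compact BINARY(16) storage.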
Weaknesses: the dashes are ugly in URLs, the format is verbose at 36 characters, and — most importantly — the random distribution scatters inserts across the whole index space, which has real cost on write-heavy workloads.
ULID: sortable by time, still random enough
A ULID is also 128 bits, but the layout is different. The first 48 bits are the Unix timestamp in milliseconds; the remaining 80 bits are random.
01ARZ3NDEKTSV4RRFFQ69G5FAV
^^^^^^^^^^++++++++++++++++
^ timestamp: first 10 chars (48 bits)
+ randomness: last 16 chars (80 bits)
encoded with Crockford Base32, 26 chars total
Two things follow from that layout. First, ULIDs sort lexicographically into chronological order: the string "01ARZ..." sorts before "01BAB..." because the time-encoded prefix runs left-to-right. Second, the 80 random bits are still enough to make collisions effectively impossible within the same millisecond. The spec also defines a monotonic mode that increments the random component when two IDs land in the same millisecond, which guarantees strict ordering even at very high generation rates.
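To make the layout concrete, here is a minimal sketch of a ULID-style generator. The helper names (encodeTime, encodeRandom, ulid) are illustrative and not taken from any particular library, it uses Node's webcrypto export (browsers expose the same method on the global crypto object), and it deliberately leaves out the spec's monotonic mode.

```typescript
import { webcrypto } from "node:crypto";

// Crockford Base32 alphabet: no I, L, O, or U.
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Encode a millisecond timestamp as 10 Base32 characters, most significant first.
function encodeTime(ms: number): string {
  if (!Number.isInteger(ms) || ms < 0 || ms >= 2 ** 48) {
    throw new RangeError("timestamp must be an integer that fits in 48 bits");
  }
  let out = "";
  let rest = ms;
  for (let i = 0; i < 10; i++) {
    out = CROCKFORD.charAt(rest % 32) + out;
    rest = Math.floor(rest / 32);
  }
  return out;
}

// 16 characters x 5 bits = 80 random bits. Each byte is reduced modulo 32,
// which stays uniform because 256 is a multiple of 32.
function encodeRandom(): string {
  const bytes = webcrypto.getRandomValues(new Uint8Array(16));
  return Array.from(bytes, (b) => CROCKFORD.charAt(b % 32)).join("");
}

function ulid(now: number = Date.now()): string {
  return encodeTime(now) + encodeRandom();
}

const earlier = ulid(1469922850259);
const later = ulid(1469922850260);
console.log(earlier.slice(0, 10)); // "01ARZ3NDEK"
console.log(earlier < later); // true: string order follows time order
```

Two IDs minted in the same millisecond share a prefix and fall back to the random suffix for ordering, which is the gap the monotonic mode is meant to close.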
The Crockford Base32 encoding intentionally drops the characters I, L, O, and U to avoid the most common transcription mistakes when humans have to read or type one. That is a small detail that comes up in practice when anyone has to dictate an ID over a phone call.
Strengths: lexicographic time ordering comes for free, the encoded form is shorter than a UUID (26 vs 36 chars), an optional monotonic mode keeps IDs strictly ordered within a millisecond, and inserts are dramatically friendlier to B-tree indexes (more on this below).
Weaknesses: the timestamp is observable to anyone who decodes the ID. That is sometimes a feature (audit logs, ordering) and sometimes a leak (an invite code that reveals when it was minted). Also, ecosystem support is thinner than UUID: no native database column type, no native browser API.
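How observable the timestamp is can be seen in a few lines. The sketch below is an illustration only (decodeUlidTime is a made-up helper name): it reads the first 10 characters back into a Date, using the example ULID from earlier in this section.

```typescript
const CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Recover the creation time from the 10-character timestamp prefix of a ULID.
function decodeUlidTime(id: string): Date {
  if (id.length !== 26) {
    throw new RangeError("a ULID is exactly 26 characters");
  }
  let ms = 0;
  for (const ch of id.slice(0, 10).toUpperCase().split("")) {
    const value = CROCKFORD_ALPHABET.indexOf(ch);
    if (value === -1) {
      throw new RangeError("invalid Crockford Base32 character: " + ch);
    }
    ms = ms * 32 + value; // stays below 2^50, safe for a JS number
  }
  return new Date(ms);
}

const minted = decodeUlidTime("01ARZ3NDEKTSV4RRFFQ69G5FAV");
console.log(minted.getTime()); // 1469922850259
console.log(minted.toISOString()); // "2016-07-30T23:54:10.259Z"
```

Anyone holding the ID can run the same computation, which is why a ULID is a poor fit for invite codes or anything else whose mint time should stay private.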
nanoid: built for URLs and short codes
nanoid is a different bet. It is not a fixed-length, fixed-alphabet spec — it is a generator parameterized by length and alphabet. The defaults give you 21 characters from a 64-character URL-safe set (A-Z, a-z, 0-9, -, _):
import { nanoid, customAlphabet } from "nanoid";
nanoid(); // 21 chars, ~126 bits of entropy
nanoid(10); // 10 chars, ~60 bits
nanoid(8); // 8 chars, ~48 bits (short codes only)
// Custom alphabet: a 6-digit numeric code
const numId = customAlphabet("0123456789", 6);
numId(); // e.g. "739204"
At 21 chars × 64-symbol alphabet, you get roughly 126 bits of entropy, very close to UUID v4's 122. So the default is a UUID-grade ID with shorter, prettier output. The customizable flavor is what makes nanoid the right pick for invite codes, share slugs, and short URLs where you intentionally trade entropy for readability.
Strengths: compact, URL-safe, alphabet-flexible, explicit entropy budget. Tiny runtime footprint.
Weaknesses: no IETF spec, ecosystem support is weaker than UUID's, and the customizability is enough rope to hang yourself with: a 6-character ID from the default alphabet has only 36 bits of entropy and will collide much sooner than people expect at scale.
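The entropy budget is simple arithmetic, so it is worth checking before shortening an ID. The sketch below uses two illustrative helper names, entropyBits and collisionProbability, and relies on the usual birthday approximation, so the probability is an estimate, not an exact value.

```typescript
// Entropy of a random ID: length x log2(alphabet size).
function entropyBits(length: number, alphabetSize: number): number {
  return length * Math.log2(alphabetSize);
}

// Birthday approximation: p ≈ 1 - exp(-n^2 / (2 * 2^bits)).
// expm1 keeps precision when the probability is tiny.
function collisionProbability(idCount: number, bits: number): number {
  return -Math.expm1(-(idCount * idCount) / (2 * 2 ** bits));
}

console.log(entropyBits(21, 64)); // 126 (default nanoid)
console.log(entropyBits(6, 64)); // 36 (6 chars, default alphabet)
console.log(entropyBits(6, 10).toFixed(1)); // "19.9" (6-digit numeric code)

// Chance of at least one collision among 100,000 six-character IDs:
console.log(collisionProbability(100000, 36).toFixed(2)); // "0.07"
```

A roughly 7% chance of a collision at only 100,000 IDs is the "sooner than people expect" part: short IDs need a uniqueness check at insert time, not just faith in the generator.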
The B-tree performance angle
This is the part that is consistently undersold in posts comparing these formats. The choice of identifier has a real, measurable effect on database write throughput once your tables get large.
Most databases store rows in a B-tree (or a B+tree) clustered or secondary-indexed by the primary key. When you insert a new row, the engine finds the leaf page where the new key belongs and writes it there. If the keys are sequential — as they would be with an auto-increment integer — every insert lands on the same hot page near the end of the index. That page is already in cache, the disk write is amortized, and throughput stays high.
With random keys, every insert lands on a different page scattered across the entire index. Each page has to be loaded from disk (cold cache miss), modified, and written back. Page splits become frequent, fragmentation grows, and the working set of pages the database needs to keep hot grows linearly with the index size. On a sufficiently large table on rotational disk, the difference between sequential and random key inserts can be one or two orders of magnitude.
Random UUID v4s are the worst case. ULIDs, because they lead with the timestamp, produce almost-sequential inserts within any short window, which behaves much more like an auto-increment for the database's purposes. nanoid behaves like UUID v4 here: whatever the alphabet, its output is random, so a default 21-character nanoid scatters inserts just as widely as a UUID v4 from an index-locality standpoint.
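A rough way to see the locality difference without a database is to count how many inserts would land at the right-hand edge of a sorted index. This is a toy model, not a benchmark: timePrefixedKey is a hypothetical stand-in for a ULID (48-bit timestamp plus 80 random bits, hex-encoded for brevity), and appendRatio only measures key order, not page I/O.

```typescript
import { randomBytes, randomUUID } from "node:crypto";

// ULID-like key: 12 hex chars of millisecond timestamp + 20 hex chars of randomness.
function timePrefixedKey(ms: number): string {
  return ms.toString(16).padStart(12, "0") + randomBytes(10).toString("hex");
}

// Fraction of keys that sort after every key inserted before them,
// i.e. inserts that would append to the last leaf page of a B-tree.
function appendRatio(keys: string[]): number {
  let max = "";
  let appends = 0;
  for (const key of keys) {
    if (key > max) {
      appends++;
      max = key;
    }
  }
  return keys.length === 0 ? 0 : appends / keys.length;
}

const count = 10000;
const start = Date.now();
// One key per millisecond, so the time prefix strictly increases.
const timeOrdered = Array.from({ length: count }, (_, i) => timePrefixedKey(start + i));
const randomKeys = Array.from({ length: count }, () => randomUUID());

console.log(appendRatio(timeOrdered)); // 1: every insert lands at the end
console.log(appendRatio(randomKeys)); // a tiny fraction: nearly every insert lands mid-index
```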
When this actually matters
The decision tree
A short version that handles most practical cases:
- An opaque database primary key, no other constraints: UUID v4. The ecosystem support is worth the slight performance tax.
- A database primary key on a table that will get very large or write-heavy: ULID. Time-prefixed keys cluster inserts on hot pages.
- A database primary key that needs lexicographic creation-time ordering without a separate timestamp column: ULID. That is what it was designed for.
- An ID that goes in a URL, share link, or invite code, and you control the length: nanoid. You get to pick the entropy budget.
- A short human-typeable code (offline activation, gift card, six-digit OTP): nanoid with a custom restricted alphabet. Collision math is on you, but it is the right tool.
- An ID that must be opaque to the user (does not leak creation time or sequence): UUID v4 or nanoid. Not ULID — the timestamp is recoverable from the prefix.
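For teams that want the list above as a reviewable rule, one possible encoding is sketched below. The Requirements fields, the chooseIdFormat name, and especially the order in which the checks run are assumptions made for illustration; the list itself does not rank its cases.

```typescript
type IdFormat = "uuid-v4" | "ulid" | "nanoid" | "nanoid-custom-alphabet";

// Hypothetical requirement flags, one per case in the list above.
interface Requirements {
  humanTyped?: boolean; // activation codes, gift cards, OTPs
  appearsInUrl?: boolean; // share links, invite codes, slugs
  mustHideCreationTime?: boolean; // ID must not leak when it was minted
  largeOrWriteHeavyTable?: boolean; // index locality matters
  needsTimeOrdering?: boolean; // sort by creation time without an extra column
}

function chooseIdFormat(req: Requirements): IdFormat {
  if (req.humanTyped) return "nanoid-custom-alphabet";
  if (req.appearsInUrl) return "nanoid";
  // Opacity rules out ULID before any performance argument is considered.
  if (req.mustHideCreationTime) return "uuid-v4";
  if (req.largeOrWriteHeavyTable || req.needsTimeOrdering) return "ulid";
  return "uuid-v4"; // opaque primary key, no other constraints
}

console.log(chooseIdFormat({})); // "uuid-v4"
console.log(chooseIdFormat({ largeOrWriteHeavyTable: true })); // "ulid"
console.log(chooseIdFormat({ appearsInUrl: true })); // "nanoid"
```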
What Persimmon's UUID generator does
The generator on this site outputs v4 UUIDs using the cryptographically-secure source the browser exposes — crypto.randomUUID() when available, and a crypto.getRandomValues()-based fallback otherwise. The output is the canonical 36-character dashed representation, which is what 95% of people who land on the page actually need.
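For reference, a fallback of that general shape can be sketched as follows. This is not the site's actual code: the function names are made up, and the example imports Node's webcrypto so it runs outside a browser (in a browser the same two methods live on the global crypto object).

```typescript
import { webcrypto } from "node:crypto";

// Turn 16 random bytes into a canonical v4 UUID string by forcing the
// version nibble to 4 and the variant bits to 10xx.
function uuidV4FromBytes(bytes: Uint8Array): string {
  if (bytes.length !== 16) {
    throw new RangeError("expected exactly 16 bytes");
  }
  const hex = Array.from(bytes, (byte, i) => {
    let value = byte;
    if (i === 6) value = (byte & 0x0f) | 0x40; // version 4
    if (i === 8) value = (byte & 0x3f) | 0x80; // variant 10xx
    return value.toString(16).padStart(2, "0");
  }).join("");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

// Prefer the built-in generator; fall back to getRandomValues when it is missing.
function uuidV4(): string {
  if (typeof webcrypto.randomUUID === "function") {
    return webcrypto.randomUUID();
  }
  return uuidV4FromBytes(webcrypto.getRandomValues(new Uint8Array(16)));
}

console.log(uuidV4FromBytes(new Uint8Array(16))); // "00000000-0000-4000-8000-000000000000"
console.log(uuidV4()); // a random v4 UUID
```

Both paths draw from the same cryptographically secure source, so the fallback changes only who sets the version and variant bits, not the quality of the randomness.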
ULID and nanoid generators are good candidates for the next round of tools to add. If either would help you and is not on the site yet, the contact page is the fastest way to make it land in the roadmap.