[COMPARE] Base64 vs Base64URL vs Base32
Three encodings, one RFC. Here's what actually differs — alphabet, overhead, readability, case-sensitivity — and how to pick the right one for URLs, filenames, emails, QR codes, and human-spoken codes.
// THE QUICK COMPARISON TABLE
                   Base64            Base64URL         Base32
                   (RFC 4648 §4)     (RFC 4648 §5)     (RFC 4648 §6)
────────────────────────────────────────────────────────────────────────
Alphabet size      64                64                32
Bits per char      6                 6                 5
Characters used    A-Z a-z 0-9 + /   A-Z a-z 0-9 - _   A-Z 2-7 (+ = for pad)
Padding            = (required)      = (optional)      = (optional)
Case-sensitive     Yes               Yes               No (A = a)
Size overhead      ~33%              ~33%              ~60%
URL-safe           No                Yes               Yes
Filename-safe      No                Yes               Yes
DNS-safe           No                No (case, _)      Yes
Human-readable     Medium            Medium            High
Human-spoken       Painful           Painful           Doable
QR-code friendly   OK                OK                Excellent
                   (byte mode)       (byte mode)       (alphanumeric mode, unpadded)
// WHAT THEY HAVE IN COMMON
All three are defined in RFC 4648 — a single 2006 document that unified the family of "base-N binary-to-text encodings." They share the same underlying algorithm: take binary input, re-group the bits into chunks of log₂(alphabet size), map each chunk to a character, pad to align on the output word boundary.
• Base64: 6-bit chunks → 4 characters per 3 bytes (4/3 ratio, ~33% larger)
• Base32: 5-bit chunks → 8 characters per 5 bytes (8/5 ratio, ~60% larger)
• Base16 (hex): 4-bit chunks → 2 characters per 1 byte (2/1 ratio, 100% larger)
All three are reversible, deterministic, and preserve every byte of input. They are encodings, not compression and not encryption. Anyone with the string can decode it back.
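The shared regrouping step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how production libraries are written: `regroup_encode` is a hypothetical helper name, it handles only alphabets whose size is a power of two, and it omits `=` padding. The standard library's `base64` module appears only as a cross-check.

```python
import base64

B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
B32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
B16 = "0123456789ABCDEF"


def regroup_encode(data: bytes, alphabet: str) -> str:
    """Re-group input bits into log2(len(alphabet))-bit chunks (no padding)."""
    bits_per_char = len(alphabet).bit_length() - 1  # 64 -> 6, 32 -> 5, 16 -> 4
    if 1 << bits_per_char != len(alphabet):
        raise ValueError("alphabet size must be a power of two")
    mask = len(alphabet) - 1
    out = []
    acc = 0       # bit accumulator
    acc_bits = 0  # number of valid bits currently held in acc
    for byte in data:
        acc = (acc << 8) | byte
        acc_bits += 8
        while acc_bits >= bits_per_char:
            acc_bits -= bits_per_char
            out.append(alphabet[(acc >> acc_bits) & mask])
        acc &= (1 << acc_bits) - 1  # keep only the leftover bits
    if acc_bits:
        # Left-align the leftover bits and zero-fill the final chunk.
        out.append(alphabet[(acc << (bits_per_char - acc_bits)) & mask])
    return "".join(out)


sample = b"Hello"
for name, alphabet in (("Base64", B64), ("Base32", B32), ("Base16", B16)):
    print(name, regroup_encode(sample, alphabet))

# Cross-check against the standard library (which adds "=" padding).
print(base64.b64encode(sample).decode(), base64.b32encode(sample).decode())
```

The only thing that changes between the three encodings is the chunk width and the lookup table; padding is a separate step layered on top.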
// WHEN TO PICK BASE64 (STANDARD, §4)
Standard Base64 is the default for any text channel that is not a URL. MIME email bodies, PEM-wrapped certificates, HTTP Basic Auth, S/MIME, XML-DSIG, data URIs, and most file-format binary-in-text containers use it. The characters + / = are all safe inside quoted attribute values, inside email headers (when line-wrapped), and inside XML/JSON strings.
Use standard Base64 when the downstream consumer is a text protocol that accepts arbitrary printable ASCII — and especially when RFC-level specs explicitly call it out (MIME: RFC 2045; PEM: RFC 7468; HTTP Basic Auth: RFC 7617; data URI: RFC 2397).
- > Email attachments (MIME base64 transfer encoding)
- > PEM certificates, keys, CRLs (-----BEGIN ... -----)
- > HTTP Basic Authentication header value
- > Data URIs: data:image/png;base64,...
- > S/MIME and XML-ENC payloads
- > Classic SOAP / XML-DSIG signatures
- > Any binary field inside a JSON document that is not then put in a URL
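Two of the use cases above are short enough to sketch with Python's standard `base64` module. The helper names `basic_auth_header` and `data_uri` are illustrative, not part of any library, and the credentials and payload are placeholders.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value (RFC 7617)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def data_uri(payload: bytes, media_type: str) -> str:
    """Wrap raw bytes in a data: URI (RFC 2397) using standard Base64."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


print(basic_auth_header("user", "pass"))
print(data_uri(b"hi", "text/plain"))
```

Both outputs may contain + / and =, which is fine in a header or an HTML attribute but not in a URL path or query string.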
// WHEN TO PICK BASE64URL (§5)
The moment your Base64 output touches a URL, a cookie value, a filename, a DNS label, or anywhere + / = would need percent-encoding, switch to Base64URL. It uses the same 64-character alphabet with two swaps (+ → -, / → _) and conventionally drops padding.
- > JWT tokens — header, payload, signature are all base64url (RFC 7515)
- > OAuth 2.0 PKCE code_challenge (RFC 7636)
- > OpenID Connect state and nonce parameters
- > Magic links, invite tokens, password-reset tokens
- > Cookie values that may be copied into URLs
- > Filenames derived from hashes — avoid / in filenames
- > DNS TXT record values (for DNS labels themselves, which are case-insensitive and disallow _ in hostnames, prefer Base32)
- > Content-addressable storage keys
- > Short URL identifiers generated from random bytes
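The two character swaps and the padding convention can be sketched as follows. This assumes the unpadded form used by JWT-style tokens; `b64url_encode` and `b64url_decode` are hypothetical helper names built on the standard library.

```python
import base64


def b64url_encode(data: bytes) -> str:
    """Base64URL without padding, as used in JWT-style tokens."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)


raw = b"\xfb\xff"
print(base64.b64encode(raw).decode())  # standard alphabet, padded
print(b64url_encode(raw))              # URL-safe alphabet, unpadded
```

Restoring padding on decode is needed because Python's decoder rejects input whose length is not a multiple of four.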
// WHEN TO PICK BASE32 (§6)
Base32 uses only 32 characters: uppercase A–Z and digits 2–7. That is ~60% size overhead — much worse than Base64 — but you gain three very specific properties that Base64 cannot match.
Case-insensitivity. The alphabet is uppercase-only. Humans reading or typing the string can ignore case; JBSWY3DP and jbswy3dp decode identically.
Disambiguated characters. The digits 0 and 1 are excluded because they are easily confused with the letters O and I (or l) in common fonts; 8 and 9 are left out as well, since 26 letters plus the six digits 2–7 already give 32 symbols. This makes human transcription from a printed page or a phone screen much more reliable.
Alphanumeric-mode QR code compatibility. QR codes have a special "alphanumeric" mode that encodes 5.5 bits per character using a subset of ASCII. The Base32 alphabet fits entirely inside that subset (the = padding character does not, so omit padding), so a QR-encoded Base32 string is noticeably smaller than a QR-encoded Base64 string: roughly 1.1 QR bits per payload bit, versus about 1.33 for Base64 in byte mode.
- > TOTP / HOTP secret seeds — Google Authenticator, 1Password, Authy all use Base32
- > Tor .onion v3 addresses — 56-character Base32 encodings of ed25519 keys
- > BitTorrent info-hashes shared as magnet links
- > Geohash-like human-sharable location codes
- > License keys and product serial numbers
- > Printed recovery codes (2FA backup codes, wallet mnemonics)
- > DTMF-like transcription over voice channels
- > Systems that need case-insensitive storage (DNS labels)
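Case-insensitivity is usually implemented as a normalisation pass before decoding. The sketch below assumes the input is a hand-typed secret that may contain spaces, dashes, lowercase letters, and missing padding; `normalize_base32` and `decode_base32_secret` are illustrative names.

```python
import base64


def normalize_base32(text: str) -> str:
    """Uppercase, drop separators, and restore "=" padding to a multiple of 8."""
    cleaned = text.replace(" ", "").replace("-", "").upper().rstrip("=")
    return cleaned + "=" * (-len(cleaned) % 8)


def decode_base32_secret(text: str) -> bytes:
    """Decode a hand-typed RFC 4648 Base32 string."""
    return base64.b32decode(normalize_base32(text))


print(decode_base32_secret("JBSWY3DP"))
print(decode_base32_secret("jbsw y3dp"))  # same bytes, typed sloppily
```

Python's `base64.b32decode` also accepts `casefold=True`, which handles case alone; the explicit pass here additionally covers separators and padding.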
// SIZE OVERHEAD — A CONCRETE EXAMPLE
// Input: a 32-byte SHA-256 hash
// Raw: 0x89abcdef… (32 bytes, binary — can't put in text)
// Hex (Base16): 40b2e2… (64 chars, 100% overhead)
// Base32: 5ENM4H2TQWMZ3O4OQBJAFY5Q… (56 chars with padding, 75% overhead)
// Base64: ia+N7/eZtRsPj5TqFoqUlD… (44 chars, 37.5% overhead)
// Base64URL: ia-N7_eZtRsPj5TqFoqUlD… (43 chars, 34% overhead, no padding)
// Input: a 16-byte UUID
// Hex: e7a6c1d0-4b7d-4c6c-8e2f-9f1a3e4b5c6d (36 chars incl. dashes)
// Base32: 46TMDUCLPVGGZDRPT4… (26 chars, no padding)
// Base64: 56bB0Et9TGyOL58aPktcbQ== (24 chars)
// Base64URL: 56bB0Et9TGyOL58aPktcbQ (22 chars, no padding)
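The character counts above follow from simple arithmetic, which can be checked with a short script. `encoded_length` is a hypothetical helper; the all-zero `digest` is only a stand-in for a real SHA-256 value, since length is all that matters here. `math.lcm` requires Python 3.9 or later.

```python
import base64
import math
import uuid


def encoded_length(n_bytes: int, bits_per_char: int, padded: bool = True) -> int:
    """Characters needed to encode n_bytes at bits_per_char bits per character."""
    if not padded:
        return math.ceil(n_bytes * 8 / bits_per_char)
    group_bits = math.lcm(8, bits_per_char)      # 24 for Base64, 40 for Base32
    group_bytes = group_bits // 8                # 3 or 5 input bytes per group
    group_chars = group_bits // bits_per_char    # 4 or 8 output chars per group
    return math.ceil(n_bytes / group_bytes) * group_chars


digest = bytes(32)  # stand-in for a SHA-256 digest
u = uuid.UUID("e7a6c1d0-4b7d-4c6c-8e2f-9f1a3e4b5c6d").bytes

for label, bits in (("Hex", 4), ("Base32", 5), ("Base64", 6)):
    print(label, encoded_length(32, bits), encoded_length(32, bits, padded=False))

print(base64.b32encode(u).decode().rstrip("="))
print(base64.b64encode(u).decode())
print(base64.urlsafe_b64encode(u).decode().rstrip("="))
```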
// HUMAN-READABILITY BENCHMARK
This is where Base32 shines. Try reading these aloud:
• a+b/c1D2e3F/+g= — Base64. "lowercase-a, plus, lowercase-b, forward-slash, lowercase-c, one, uppercase-D, two, lowercase-e, three, uppercase-F, forward-slash, plus, lowercase-g, equals." Prone to transcription errors on case alone.
• JBSWY3DPEHPK3PXP — Base32. "J-B-S-W-Y, three, D-P-E-H-P-K, three, P-X-P." Case doesn't matter. No 0/O or 1/l ambiguity. You can read this over the phone with high confidence.
This is exactly why TOTP uses Base32: someone has to type the seed into their authenticator app from a QR code fallback screen. Base64 case-sensitivity would generate endless support tickets.
// QR CODE SIZE COMPARISON
QR codes have a special "alphanumeric mode" that packs 5.5 bits per character using a 45-character subset: 0–9, A–Z (uppercase only), space, and $ % * + - . / :. Anything outside this subset forces the QR code into "byte mode" which uses 8 bits per character — significantly larger QR codes.
Unpadded Base32 is entirely inside the alphanumeric subset. Base64 is not: + and / are in the set, but lowercase letters and = are not, and Base64URL's _ is also outside it, so any of them forces byte mode. This means a given payload often fits into a smaller QR code when encoded as Base32, which can be the difference between a readable 21×21 QR and a crowded 33×33 one.
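The mode selection can be approximated with a character-set check. This is a rough sketch: `qr_mode` and `qr_payload_bits` are illustrative names, and the bit count covers only the data segment (11 bits per character pair in alphanumeric mode, 8 bits per byte in byte mode), ignoring the mode indicator, length field, and error correction.

```python
import base64

# The 45 characters allowed in QR alphanumeric mode.
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def qr_mode(text: str) -> str:
    """Return the densest QR mode (of these two) that can hold the text."""
    return "alphanumeric" if set(text) <= QR_ALPHANUMERIC else "byte"


def qr_payload_bits(text: str) -> int:
    """Approximate data bits for the text, excluding headers and error correction."""
    if qr_mode(text) == "alphanumeric":
        return 11 * (len(text) // 2) + 6 * (len(text) % 2)
    return 8 * len(text.encode("utf-8"))


payload = bytes(range(32))
b32 = base64.b32encode(payload).decode().rstrip("=")  # padding would force byte mode
b64 = base64.b64encode(payload).decode()

print(qr_mode(b32), qr_payload_bits(b32))
print(qr_mode(b64), qr_payload_bits(b64))
```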
// DECODING GOTCHAS
- > Case-sensitivity: Base64 is case-sensitive, so mis-cased input is not rejected; it silently decodes to different bytes (SGVsbG8= ≠ sgvSbg8=). Base32 decoders typically normalise to uppercase before lookup, so they tolerate any case.
- > Padding: standard Base64 expects = padding; JWS/JWT base64url omits it (RFC 7515); Base32 padding is frequently dropped in practice, for example in TOTP secrets. Always check what your decoder expects.
- > Whitespace: MIME Base64 is line-wrapped at 76 columns with CR-LF. Many decoders tolerate whitespace, some don't. Strip it before passing to atob() or its equivalent.
- > Alphabet collisions: feeding Base64 into a Base32 decoder (or vice versa) looks like it might work at first, because A, B, C are valid in both, and then fails or yields garbage once it hits lowercase letters, +, /, or the digits 0, 1, 8, 9.
- > Crockford Base32 is a non-RFC variant used by ULIDs and some blockchain systems. It uses a different alphabet (excluding I, L, O, U) and supports an optional check symbol. Don't confuse it with RFC 4648 Base32.
- > Base32 Extended Hex (RFC 4648 §7) — an alternative Base32 alphabet that preserves lexicographic sort order. Used in DNSSEC NSEC3 records. Easy to mix up with standard Base32.
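A few of these gotchas are easy to reproduce with Python's standard library. The snippet is a sketch of the failure modes rather than a recommended decoding pipeline; `is_valid_base32` is a hypothetical helper.

```python
import base64
import binascii

# 1. Mis-cased Base64 is still valid Base64; it just decodes to other bytes.
good = base64.b64decode("SGVsbG8=")
bad_case = base64.b64decode("sgvSbg8=")
print(good, bad_case)

# 2. Strip MIME-style line wrapping before handing text to a strict decoder.
wrapped = "SGVs\r\nbG8="
unwrapped = "".join(wrapped.split())
print(base64.b64decode(unwrapped, validate=True))

# 3. A Base32 string can also be a perfectly valid Base64 string.
collision = base64.b64decode("JBSWY3DP", validate=True)
print(collision)  # six unrelated bytes, not b"Hello"


# 4. The reverse direction is caught: lowercase and 8 are not Base32 symbols.
def is_valid_base32(text: str) -> bool:
    try:
        base64.b32decode(text)
    except binascii.Error:
        return False
    return True


print(is_valid_base32("JBSWY3DP"), is_valid_base32("SGVsbG8="))
```

Case 3 is the reason to carry the encoding name alongside the value instead of sniffing it from the characters.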
// A DECISION FLOWCHART
Does the output go into a URL, cookie, filename, DNS, JWT, or OAuth flow?
│
├─ Yes → Does a human need to type or speak the string?
│ │
│ ├─ Yes → Base32 (uppercase, no ambiguous chars)
│ │ e.g., TOTP seeds, recovery codes
│ │
│ └─ No → Base64URL (more compact, URL-safe)
│ e.g., JWT, short tokens, hash-named files
│
└─ No → Does the output go into a QR code that must stay tiny?
│
├─ Yes → Base32 (fits in QR alphanumeric mode)
│
└─ No → Standard Base64
e.g., email MIME, PEM, HTTP Basic, data URI
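The flowchart translates directly into a small function. The name `pick_encoding` and its boolean parameters are made up for illustration; real decisions may involve constraints the chart does not model.

```python
def pick_encoding(url_like: bool, human_typed: bool = False, tiny_qr: bool = False) -> str:
    """Mirror the decision flowchart: returns 'base32', 'base64url', or 'base64'."""
    if url_like:
        # URL, cookie, filename, DNS, JWT, or OAuth flow.
        return "base32" if human_typed else "base64url"
    # Everything else: only a size-constrained QR code argues for Base32.
    return "base32" if tiny_qr else "base64"


print(pick_encoding(url_like=True))                    # e.g. JWT, short tokens
print(pick_encoding(url_like=True, human_typed=True))  # e.g. TOTP seeds
print(pick_encoding(url_like=False))                   # e.g. MIME, PEM
```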
// WHY NOT BASE16 (HEX)?
Hexadecimal (RFC 4648 §8 calls it Base16) is the universal fallback. Every shell tool, debugger, and protocol understands it. It's case-insensitive, trivially human-readable, and trivial to implement. But it doubles the size of your payload (100% overhead), which is why it is only used for small fixed-length identifiers: SHA-256 hashes (64 hex chars = 32 bytes), UUIDs (32 hex chars = 16 bytes), MAC addresses, memory addresses in debuggers.
For anything larger than a few dozen bytes, the size cost of hex is genuinely painful over the wire. Base64 output is about 33% smaller than hex (4/3 vs 2 characters per byte), and Base32 output is 20% smaller (8/5 vs 2). That is why Base64 won the email and web-embed use cases, while hex stayed in debugging and hash-display scenarios.
// WHAT ABOUT BASE58 / BASE62 / BASE85?
Non-RFC base-N encodings exist for specific niches:
• Base58 — Bitcoin addresses, Flickr photo IDs. Excludes the four ambiguous characters (0, O, I, l) and +/ for easier human transcription. ~37% overhead.
• Base62 — URL shorteners, Twitter snowflake IDs. Uses only alphanumerics (no special chars), safe in URLs without escaping. ~34% overhead.
• Base85 / Ascii85 / Z85 — PostScript, PDF, older ZeroMQ frames. ~25% overhead (denser than Base64) but with tricky character choices that can cause XML/JSON escaping hell.
These are interesting, but they are not RFC 4648 standardised. If you are shipping a public protocol, Base64 or Base64URL is almost always the safer default because every language's standard library already supports them. If you pick Base58, you'll be shipping a dependency to every consumer.
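The overhead figures quoted for each alphabet follow from the number of bits one character can carry. The sketch below computes the idealised figure, 8 / log2(N) - 1; `theoretical_overhead` is an illustrative name. Real implementations can differ slightly: Ascii85 packs exactly 5 characters per 4 bytes, and Base58/Base62 are usually computed by big-integer division, so their output length varies with the input value.

```python
import math


def theoretical_overhead(alphabet_size: int) -> float:
    """Idealised size overhead of a base-N text encoding over raw bytes."""
    return 8 / math.log2(alphabet_size) - 1


for n in (16, 32, 58, 62, 64, 85):
    print(f"Base{n}: ~{theoretical_overhead(n):.0%}")
```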