Base64 vs Base64URL vs Base32

Three encodings, one RFC. Here's what actually differs — alphabet, overhead, readability, case-sensitivity — and how to pick the right one for URLs, filenames, emails, QR codes, and human-spoken codes.

April 2026 | comparison

// THE QUICK COMPARISON TABLE

                 Base64          Base64URL        Base32
                 (RFC 4648 §4)   (RFC 4648 §5)    (RFC 4648 §6)
────────────────────────────────────────────────────────────────
Alphabet size    64              64               32
Bits per char    6               6                5
Characters used  A-Z a-z 0-9     A-Z a-z 0-9      A-Z 2-7
                 + /             - _
Padding          = (required)    = (optional)     = (optional)
Case-sensitive   Yes             Yes              No (A = a)
Size overhead    ~33%            ~33%             ~60%
URL-safe         No              Yes              Yes
Filename-safe    No              Yes              Yes
DNS-safe         No              No (case, _)     Yes
Human-readable   Medium          Medium           High
Human-spoken     Painful         Painful          Doable
QR-code friendly OK              OK               Excellent
                 (byte mode)     (byte mode)      (alphanumeric mode)

// WHAT THEY HAVE IN COMMON

All three are defined in RFC 4648 — a single 2006 document that unified the family of "base-N binary-to-text encodings." They share the same underlying algorithm: take binary input, re-group the bits into chunks of log₂(alphabet size), map each chunk to a character, pad to align on the output word boundary.

• Base64 and Base64URL: 6-bit chunks → 4 characters per 3 bytes (4/3 ratio, ~33% larger)
• Base32: 5-bit chunks → 8 characters per 5 bytes (8/5 ratio, ~60% larger)
• Base16 (hex, RFC 4648 §8, shown for comparison): 4-bit chunks → 2 characters per 1 byte (2/1 ratio, 100% larger)

All three are reversible, deterministic, and preserve every byte of input. They are encodings, not compression and not encryption. Anyone with the string can decode it back.
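
The regrouping step can be sketched in a few lines of Python. This is a minimal illustration rather than how production encoders are written: regroup_encode and its parameters are hypothetical names chosen for this example, and the output is checked against the standard library's base64 module.

```python
import base64

RFC4648_BASE32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
RFC4648_BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def regroup_encode(data: bytes, alphabet: str, bits: int, block_chars: int) -> str:
    """Encode `data` by regrouping its bits into `bits`-wide chunks (hypothetical helper)."""
    # Concatenate every input byte into one long string of '0'/'1' characters.
    bitstring = "".join(f"{byte:08b}" for byte in data)
    # Zero-fill the tail so the final chunk is full width.
    if len(bitstring) % bits:
        bitstring += "0" * (bits - len(bitstring) % bits)
    # Map each chunk to one alphabet character.
    chars = [alphabet[int(bitstring[i:i + bits], 2)]
             for i in range(0, len(bitstring), bits)]
    # Pad with '=' up to a whole output block (8 chars for Base32, 4 for Base64).
    while len(chars) % block_chars:
        chars.append("=")
    return "".join(chars)


sample = b"Hello!"
print(regroup_encode(sample, RFC4648_BASE32, bits=5, block_chars=8))  # JBSWY3DPEE======
print(regroup_encode(sample, RFC4648_BASE64, bits=6, block_chars=4))  # SGVsbG8h
```

Only the alphabet, the chunk width and the block size change between the two calls, which is the sense in which the family shares one algorithm.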

// WHEN TO PICK BASE64 (STANDARD, §4)

Standard Base64 is the default for any text channel that is not a URL. MIME email bodies, PEM-wrapped certificates, HTTP Basic Auth, S/MIME, XML-DSIG, data URIs, and most file-format binary-in-text containers use it. The characters + / = are all safe inside quoted attribute values, inside email headers (when line-wrapped), and inside XML/JSON strings.

Use standard Base64 when the downstream consumer is a text protocol that accepts arbitrary printable ASCII — and especially when RFC-level specs explicitly call it out (MIME: RFC 2045; PEM: RFC 7468; HTTP Basic Auth: RFC 7617; data URI: RFC 2397).

  • > Email attachments (MIME base64 transfer encoding)
  • > PEM certificates, keys, CRLs (-----BEGIN ... -----)
  • > HTTP Basic Authentication header value
  • > Data URIs: data:image/png;base64,...
  • > S/MIME and XML-ENC payloads
  • > Classic SOAP / XML-DSIG signatures
  • > Any binary field inside a JSON document that is not then put in a URL
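
Two of these uses are short enough to sketch with Python's standard base64 module. The helper names basic_auth_header and data_uri are made up for this example; only base64.b64encode is a real library call.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value (RFC 7617 style)."""
    credentials = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def data_uri(payload: bytes, media_type: str) -> str:
    """Wrap raw bytes in a base64 data URI (RFC 2397 style)."""
    return f"data:{media_type};base64," + base64.b64encode(payload).decode("ascii")


print(basic_auth_header("Aladdin", "open sesame"))  # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(data_uri(b"Hello", "text/plain"))             # data:text/plain;base64,SGVsbG8=
```

Both outputs keep their = padding, which is fine here because neither string is being placed in a URL path or query parameter.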

// WHEN TO PICK BASE64URL (§5)

The moment your Base64 output touches a URL, a cookie value, a filename, a DNS label, or anywhere + / = would need percent-encoding, switch to Base64URL. It uses the same 64-character alphabet with two swaps (+ → -, / → _) and conventionally drops padding.

  • > JWTs: header, payload, and signature are each base64url-encoded without padding (RFC 7515)
  • > OAuth 2.0 PKCE code_challenge (RFC 7636)
  • > OpenID Connect state and nonce parameters
  • > Magic links, invite tokens, password-reset tokens
  • > Cookie values that may be copied into URLs
  • > Filenames derived from hashes — avoid / in filenames
  • > DNS TXT record values, but not hostname labels: DNS names are case-insensitive and hostnames disallow _
  • > Content-addressable storage keys
  • > Short URL identifiers generated from random bytes
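
A minimal sketch of unpadded Base64URL in Python, assuming the consumer expects the JWT-style convention of no = padding. The names b64url_encode and b64url_decode are hypothetical; the underlying urlsafe_b64encode and urlsafe_b64decode calls come from the standard library.

```python
import base64


def b64url_encode(data: bytes) -> str:
    """Encode with the URL-safe alphabet and strip trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode unpadded (or padded) Base64URL by restoring the missing '='."""
    # The encoded length must be a multiple of 4; -len % 4 gives the shortfall.
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


token = b64url_encode(b"\xfb\xff")
print(token)                 # -_8   (standard Base64 would give +/8=)
print(b64url_decode(token))  # b'\xfb\xff'
```

Restoring the padding on decode is needed because Python's decoder rejects input whose length is not a multiple of four.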

// WHEN TO PICK BASE32 (§6)

Base32 uses only 32 characters: uppercase A–Z and digits 2–7. That is ~60% size overhead — much worse than Base64 — but you gain three very specific properties that Base64 cannot match.

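Case-insensitivity. The alphabet is uppercase-only, and decoders conventionally fold lowercase input to uppercase, so humans reading or typing the string can ignore case: JBSWY3DP and jbswy3dp decode identically (some libraries need an explicit case-folding option for this).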

Disambiguated characters. The digits 0 and 1 are excluded because they are easily confused with O and with I or l. The digits 8 and 9 are simply not needed: 26 letters plus the six digits 2–7 already give 32 symbols. This makes human transcription from a printed page or a phone screen much more reliable.

Alphanumeric-mode QR code compatibility. QR codes have a special "alphanumeric" mode that encodes 5.5 bits per character using a subset of ASCII. The Base32 alphabet fits entirely inside that subset, but the = pad character does not, so strip padding first. An unpadded Base32 string then costs roughly 17% fewer QR data bits than the same payload in Base64, which has to use 8-bit byte mode.

  • > TOTP / HOTP secret seeds — Google Authenticator, 1Password, Authy all use Base32
  • > Tor .onion v3 addresses: 56-character Base32 encodings of an ed25519 public key plus a checksum and version byte
  • > BitTorrent info-hashes shared as magnet links
  • > Geohash-like human-sharable location codes
  • > License keys and product serial numbers
  • > Printed recovery codes (2FA backup codes)
  • > DTMF-like transcription over voice channels
  • > Systems that need case-insensitive storage (DNS labels)
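
As a sketch of the TOTP case, the following Python generates a Base32 secret and decodes one that a person typed by hand. The function names and the 20-byte default are assumptions for illustration; real authenticator apps may normalise input differently.

```python
import base64
import secrets


def new_totp_secret(num_bytes: int = 20) -> str:
    """Generate a random secret and return it as unpadded Base32."""
    raw = secrets.token_bytes(num_bytes)
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def decode_typed_secret(typed: str) -> bytes:
    """Decode a hand-typed Base32 secret, ignoring case, spaces and dashes."""
    cleaned = typed.replace(" ", "").replace("-", "").upper()
    # Restore '=' padding so the length is a multiple of 8 characters.
    cleaned += "=" * (-len(cleaned) % 8)
    return base64.b32decode(cleaned)


print(new_totp_secret())                           # 32 random Base32 characters
print(decode_typed_secret("jbsw y3dp ehpk 3pxp"))  # b'Hello!\xde\xad\xbe\xef'
```

Upper-casing the input before decoding is what makes the secret case-insensitive in practice; Python's b32decode is strict about case unless asked otherwise.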

// SIZE OVERHEAD — A CONCRETE EXAMPLE

// Input: a 32-byte SHA-256 hash
// Raw: 0x89af8def…                             (32 bytes, binary; can't put in text)
// Hex (Base16): 89af8def…                      (64 chars, 100% overhead)
// Base32:       RGXY337X…                      (56 chars with padding, 75% overhead; 52 without)
// Base64:       ia+N7/eZtRsPj5TqFoqUlD…        (44 chars, 37.5% overhead)
// Base64URL:    ia-N7_eZtRsPj5TqFoqUlD…        (43 chars, 34% overhead, no padding)

// Input: a 16-byte UUID
// Hex:       e7a6c1d0-4b7d-4c6c-8e2f-9f1a3e4b5c6d (36 chars incl. dashes)
// Base32:    46TMDUCLPVGGZDRPT4…                  (26 chars, no padding)
// Base64:    56bB0Et9TGyOL58aPktcbQ==             (24 chars)
// Base64URL: 56bB0Et9TGyOL58aPktcbQ               (22 chars, no padding)
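
These lengths are easy to reproduce. The sketch below uses Python's standard base64 and uuid modules; encoded_lengths is a hypothetical helper name, and the UUID is the one from the example above.

```python
import base64
import uuid


def encoded_lengths(data: bytes) -> dict[str, int]:
    """Return the character count of `data` under each encoding."""
    return {
        "hex": len(base64.b16encode(data)),
        "base32": len(base64.b32encode(data)),
        "base32_nopad": len(base64.b32encode(data).rstrip(b"=")),
        "base64": len(base64.b64encode(data)),
        "base64url_nopad": len(base64.urlsafe_b64encode(data).rstrip(b"=")),
    }


hash_sized = bytes(32)  # any 32-byte value gives the same lengths
uuid_bytes = uuid.UUID("e7a6c1d0-4b7d-4c6c-8e2f-9f1a3e4b5c6d").bytes

print(encoded_lengths(hash_sized))
print(encoded_lengths(uuid_bytes))
print(base64.b64encode(uuid_bytes).decode("ascii"))
print(base64.b32encode(uuid_bytes).decode("ascii").rstrip("="))
```

Output length depends only on input length, so a buffer of zero bytes stands in for a real hash here.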

// HUMAN-READABILITY BENCHMARK

This is where Base32 shines. Try reading these aloud:

a+b/c1D2e3F/+g== (Base64): "lowercase-a, plus, lowercase-b, forward-slash, lowercase-c, one, uppercase-D, two, lowercase-e, three, uppercase-F, forward-slash, plus, lowercase-g, equals, equals." Prone to transcription errors on case alone.

JBSWY3DPEHPK3PXP (Base32): "J-B-S-W-Y, three, D-P-E-H-P-K, three, P-X-P." Case doesn't matter. No 0/O or 1/l ambiguity. You can read this over the phone with high confidence.

This is exactly why TOTP uses Base32: someone has to type the seed into their authenticator app from a QR code fallback screen. Base64 case-sensitivity would generate endless support tickets.

// QR CODE SIZE COMPARISON

QR codes have a special "alphanumeric mode" that packs 5.5 bits per character using a 45-character subset: 0–9, A–Z (uppercase only), space, and $ % * + - . / :. Anything outside this subset forces the QR code into "byte mode" which uses 8 bits per character — significantly larger QR codes.

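The Base32 alphabet is entirely inside the alphanumeric subset; only the = pad character falls outside it, so omit padding. Base64 is not: + and / are in the alphanumeric set, but lowercase letters and = are not (nor is Base64URL's _), and in practice any such character pushes the string into byte mode. Per 5 bytes of payload, unpadded Base32 costs 8 × 5.5 = 44 QR data bits, while Base64 in byte mode costs about 53 bits (4 characters × 8 bits per 3 bytes). That saving of roughly 17% can be enough to drop the code to the next smaller QR version.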
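
The mode selection can be approximated in Python as follows. This is a simplified sketch: qr_data_bits is a hypothetical helper that counts only payload bits (11 bits per character pair, 6 for a trailing single character, 8 per character in byte mode) and ignores the mode indicator, length field, error correction and mixed-mode segments that a real QR encoder handles.

```python
import base64

# The 45 characters allowed in QR alphanumeric mode.
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def qr_data_bits(text: str) -> tuple[str, int]:
    """Return the (mode, payload bit count) a single-segment QR code would need."""
    if all(ch in QR_ALPHANUMERIC for ch in text):
        pairs, leftover = divmod(len(text), 2)
        return "alphanumeric", pairs * 11 + leftover * 6
    return "byte", len(text) * 8


payload = bytes(range(20))
as_base32 = base64.b32encode(payload).decode("ascii").rstrip("=")
as_base64 = base64.b64encode(payload).decode("ascii")

print(qr_data_bits(as_base32))           # ('alphanumeric', 176)
print(qr_data_bits(as_base64))           # ('byte', 224)
print(qr_data_bits("JBSWY3DPEE======"))  # ('byte', 128): '=' is not in the set
```

The last line shows why padding matters: a padded Base32 string is no longer pure alphanumeric-mode text.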

// DECODING GOTCHAS

  • > Case-sensitivity: Base64 decoders do not reject mis-cased input; they decode it to different bytes (SGVsbG8= ≠ sgvSbg8=). Base32 decoders typically normalise to uppercase before lookup, so they tolerate any case.
  • > Padding: RFC 4648 requires = padding by default for all three encodings, unless the referring specification says otherwise. JWS/JWT mandates unpadded base64url, and Base32 secrets are often written without padding. Always check what your decoder expects.
  • > Whitespace: MIME Base64 is line-wrapped at 76 columns with CR-LF. Many decoders tolerate whitespace, some don't. Strip it before passing the string to a strict decoder.
  • > Alphabet collisions: every Base32 character is also a valid Base64 character, so Base32 text fed to a Base64 decoder can silently decode to garbage. The reverse fails only once the Base32 decoder hits a character outside its alphabet (0, 1, 8, 9, + or /, or lowercase in a strict decoder).
  • > Crockford Base32 is a non-RFC variant used by ULIDs, among others. It uses a different alphabet (0–9 plus A–Z excluding I, L, O, U) and supports an optional check symbol. Don't confuse it with RFC 4648 Base32.
  • > Base32 Extended Hex (RFC 4648 §7): an alternative Base32 alphabet that preserves lexicographic sort order. Used in DNSSEC NSEC3 records. Easy to mix up with standard Base32.
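
Several of these gotchas can be reproduced with Python's standard library. The sketch below is illustrative only: strict_b64decode and is_valid_base32 are hypothetical helper names, and other languages' decoders may behave differently.

```python
import base64
import binascii


def strict_b64decode(text: str) -> bytes:
    """Decode standard Base64, tolerating only whitespace such as MIME line wraps."""
    compact = "".join(text.split())
    # validate=True makes b64decode raise on characters outside the alphabet
    # instead of silently discarding them, which is the default behaviour.
    return base64.b64decode(compact, validate=True)


def is_valid_base32(text: str) -> bool:
    """Report whether Python's strict, uppercase-only Base32 decoder accepts the text."""
    try:
        base64.b32decode(text)
        return True
    except binascii.Error:
        return False


print(strict_b64decode("SGVs\r\nbG8="))  # b'Hello'
print(strict_b64decode("sgvSbg8="))      # different bytes, and no error raised
print(strict_b64decode("JBSWY3DP"))      # Base32 text "successfully" decodes to garbage
print(is_valid_base32("SGVsbG8="))       # False: lowercase and 8 are not Base32
```

The two silent cases are the dangerous ones: nothing fails, the bytes are just wrong.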

// A DECISION FLOWCHART

Does the output go into a URL, cookie, filename, DNS, JWT, or OAuth flow?
│
├─ Yes  →  Does a human need to type or speak the string?
│         │
│         ├─ Yes  →  Base32 (uppercase, no ambiguous chars)
│         │         e.g., TOTP seeds, recovery codes
│         │
│         └─ No   →  Base64URL (more compact, URL-safe)
│                   e.g., JWT, short tokens, hash-named files
│
└─ No  →  Does the output go into a QR code that must stay tiny?
          │
          ├─ Yes  →  Base32 (fits in QR alphanumeric mode)
          │
          └─ No   →  Standard Base64
                    e.g., email MIME, PEM, HTTP Basic, data URI
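
The same flowchart, written as a small Python function for readers who prefer code. The function and parameter names are invented for this sketch, and it encodes only the branches shown above.

```python
def pick_encoding(url_like: bool, human_typed: bool = False, tiny_qr: bool = False) -> str:
    """Mirror the flowchart: url_like covers URLs, cookies, filenames, DNS, JWT, OAuth."""
    if url_like:
        # Humans typing or speaking the value outweigh compactness.
        return "Base32" if human_typed else "Base64URL"
    # Outside URL-like contexts, only a size-constrained QR code argues for Base32.
    return "Base32" if tiny_qr else "Base64"


print(pick_encoding(url_like=True, human_typed=True))  # Base32
print(pick_encoding(url_like=True))                    # Base64URL
print(pick_encoding(url_like=False, tiny_qr=True))     # Base32
print(pick_encoding(url_like=False))                   # Base64
```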

// WHY NOT BASE16 (HEX)?

Hexadecimal (RFC 4648 §8 calls it Base16) is the universal fallback. Every shell tool, debugger, and protocol understands it. It's case-insensitive, trivially human-readable, and trivial to implement. But it doubles the size of your payload (100% overhead), which is why it is only used for small fixed-length identifiers: SHA-256 hashes (64 hex chars = 32 bytes), UUIDs (32 hex chars = 16 bytes), MAC addresses, memory addresses in debuggers.

For anything larger than a few dozen bytes, the size cost of hex is genuinely painful over the wire. Base64 output is about 33% smaller than hex (4/3 vs 2 characters per byte), and Base32 is 20% smaller (8/5 vs 2). That is why Base64 won the email and web-embed use cases, while hex stayed in debugging and hash-display scenarios.

// WHAT ABOUT BASE58 / BASE62 / BASE85?

Non-RFC base-N encodings exist for specific niches:
Base58: Bitcoin addresses, Flickr photo IDs. Excludes the four ambiguous characters (0, O, I, l) and +/ for easier human transcription. ~37% overhead.
Base62: URL shorteners and other short IDs. Uses only alphanumerics (no special chars), safe in URLs without escaping. ~34% overhead.
Base85 / Ascii85 / Z85: PostScript, PDF, ZeroMQ key encoding (Z85). ~25% overhead (denser than Base64) but with tricky character choices that can cause XML/JSON escaping hell.

These are interesting, but they are not RFC 4648 standardised. If you are shipping a public protocol, Base64 or Base64URL is almost always the safer default because the standard library of nearly every mainstream language already supports them. If you pick Base58, you'll be shipping a dependency to every consumer.
