Huffman Coding | Optimal Prefix-Free Compression

>> features

[OPTIMAL]

Optimal Codes

Generates shortest average code length for given frequencies.

[PREFIX-FREE]

Prefix-Free

No code is a prefix of another, enabling unique decoding.

[ADAPTIVE]

Frequency-Based

Common characters get shorter codes automatically.

>> technical info

How Huffman Coding Works

Huffman coding builds a binary tree from the bottom up. Starting with character frequencies, it repeatedly combines the two lowest-frequency nodes into a parent node until only the root remains. Left branches are '0', right branches are '1'. Characters with higher frequency get shorter codes, achieving optimal compression for the given frequency distribution.

Huffman Coding Example

Text: "AABBBCCCC"

Frequencies:
A: 2
B: 3
C: 4

Tree Building:
1. Combine A(2) + B(3) = AB(5)
2. Combine C(4) + AB(5) = Root(9)

Tree:
     Root(9)
    /      \
   C(4)    AB(5)
          /     \
        A(2)    B(3)

Codes:
C: 0
A: 10
B: 11

Encoded: 10 10 11 11 11 0 0 0 0
= 1010111111 0000

Why Use Huffman Coding

Optimal compression ratio
Simple implementation
Foundation of ZIP/JPEG
No data loss
Proven efficiency

>> frequently asked questions

What is Huffman coding?

Huffman coding is an optimal prefix-free compression algorithm developed by David A. Huffman in 1952. It assigns variable-length codes to characters based on their frequency, with more frequent characters getting shorter codes, achieving the minimum average code length possible.

Why is it called prefix-free?

A code is prefix-free when no codeword is a prefix of another. For example, if 'A' is '01', then no other character can have a code starting with '01'. This property ensures unambiguous decoding - you can always tell where one code ends and another begins without separators.

How efficient is Huffman coding?

Huffman coding achieves the optimal compression for a given frequency distribution, typically reaching within 1 bit of the theoretical entropy limit. For English text, it usually achieves 60-70% compression. It's most effective when character frequencies vary significantly.

Where is Huffman coding used?

Huffman coding is used in JPEG image compression, ZIP file compression (combined with LZ77 in DEFLATE), MP3 audio, and many other formats. It's often combined with other algorithms - for example, JPEG uses Huffman after DCT and quantization, while ZIP uses it after LZ77.

[Base64.sh]

> huffman | optimal | codes <