> huffman | optimal | codes <
// Huffman Coding - Optimal prefix-free compression algorithm
>> features
Optimal Codes
Generates shortest average code length for given frequencies.
Prefix-Free
No code is a prefix of another, enabling unique decoding.
Frequency-Based
Common characters get shorter codes automatically.
>> technical info
How Huffman Coding Works
Huffman coding builds a binary tree from the bottom up. Starting with character frequencies, it repeatedly combines the two lowest-frequency nodes into a parent node until only the root remains. Left branches are '0', right branches are '1'. Characters with higher frequency get shorter codes, achieving optimal compression for the given frequency distribution.
Huffman Coding Example
Text: "AABBBCCCC"
Frequencies:
A: 2
B: 3
C: 4
Tree Building:
1. Combine A(2) + B(3) = AB(5)
2. Combine C(4) + AB(5) = Root(9)
Tree:
Root(9)
/ \
C(4) AB(5)
/ \
A(2) B(3)
Codes:
C: 0
A: 10
B: 11
Encoded: 10 10 11 11 11 0 0 0 0
= 1010111111 0000
Why Use Huffman Coding
- Optimal compression ratio
- Simple implementation
- Foundation of ZIP/JPEG
- No data loss
- Proven efficiency
>> frequently asked questions
What is Huffman coding?
Huffman coding is an optimal prefix-free compression algorithm developed by David A. Huffman in 1952. It assigns variable-length codes to characters based on their frequency, with more frequent characters getting shorter codes, achieving the minimum average code length possible.
Why is it called prefix-free?
A code is prefix-free when no codeword is a prefix of another. For example, if 'A' is '01', then no other character can have a code starting with '01'. This property ensures unambiguous decoding - you can always tell where one code ends and another begins without separators.
How efficient is Huffman coding?
Huffman coding achieves the optimal compression for a given frequency distribution, typically reaching within 1 bit of the theoretical entropy limit. For English text, it usually achieves 60-70% compression. It's most effective when character frequencies vary significantly.
Where is Huffman coding used?
Huffman coding is used in JPEG image compression, ZIP file compression (combined with LZ77 in DEFLATE), MP3 audio, and many other formats. It's often combined with other algorithms - for example, JPEG uses Huffman after DCT and quantization, while ZIP uses it after LZ77.