LZ77 Compression | Dictionary-Based Data Compression

>> features

[DICTIONARY]

Dictionary Coding

References repeated patterns using offset and length.

[SLIDING]

Sliding Window

Maintains a moving window for pattern matching.

[FOUNDATION]

Algorithm Base

Foundation for ZIP, GZIP, DEFLATE, and PNG.

>> technical info

How LZ77 Works

LZ77 maintains a sliding window of recently seen data. For each position, it searches the window for the longest matching substring. If found, it outputs a reference (offset, length, next character). If not found, it outputs a literal (0, 0, character). This replaces repeated patterns with short references, achieving compression.

LZ77 Example

Text: "ABCABCABC"
Window=4, Lookahead=4

Position 0: 'A' - no match
  → (0,0,A)

Position 1: 'B' - no match
  → (0,0,B)

Position 2: 'C' - no match
  → (0,0,C)

Position 3: 'ABC' - matches at offset 3
  → (3,3,A)

Position 7: 'BC' - matches at offset 3
  → (3,2,)

Output: (0,0,A)(0,0,B)(0,0,C)(3,3,A)(3,2,)

Why Use LZ77

Lossless compression
No prior knowledge needed
Adaptive to data
Simple implementation
Basis for modern algorithms

>> frequently asked questions

What is LZ77?

LZ77 is a universal lossless data compression algorithm published by Abraham Lempel and Jacob Ziv in 1977. It achieves compression by replacing repeated occurrences of data with references to a single copy existing earlier in the uncompressed data stream.

How does the sliding window work?

The sliding window consists of two parts: a search buffer (recently processed data) and a lookahead buffer (data to be compressed). The algorithm searches the search buffer for the longest match with the beginning of the lookahead buffer, then outputs a reference or literal.

LZ77 vs modern compression?

LZ77 is the foundation of many modern compression algorithms. DEFLATE (used in ZIP and GZIP) combines LZ77 with Huffman coding. LZSS improves LZ77 by eliminating redundancy. LZ78 and LZW build dictionaries differently but share the same principles.

What are optimal window sizes?

Typical window sizes range from 32KB to 64KB for general data. Larger windows find more matches but require more memory and time. DEFLATE uses 32KB. The lookahead buffer is typically 256-512 bytes. The best size depends on your data's repetition patterns.

[Base64.sh]

> lz77 | sliding | window <