encode | decode | compress

> rice | adaptive | compress <

// Rice Coding - Adaptive integer compression with tunable parameter

0 chars
0 chars

>> features

[ADAPTIVE]

Tunable Parameter

Adjust k to optimize for different data distributions.

[EFFICIENT]

Geometric Data

Optimal for data with geometric or exponential distribution.

[SIMPLE]

Fast Coding

Simple division and remainder operations for speed.

>> technical info

How Rice Coding Works

Rice coding divides each integer n by 2^k to get quotient q and remainder r. The quotient is encoded in unary (q ones followed by a zero), and the remainder is encoded in k binary bits. This creates a variable-length code that adapts to the data distribution through the k parameter.

Rice Coding Example (k=2)

k=2, M=2^2=4

0 → q=0, r=0 → 0|00 → 000
1 → q=0, r=1 → 0|01 → 001
2 → q=0, r=2 → 0|10 → 010
3 → q=0, r=3 → 0|11 → 011
4 → q=1, r=0 → 10|00 → 1000
5 → q=1, r=1 → 10|01 → 1001
6 → q=1, r=2 → 10|10 → 1010
7 → q=1, r=3 → 10|11 → 1011
8 → q=2, r=0 → 110|00 → 11000

Larger k: fewer unary bits, more binary bits
Smaller k: more unary bits, fewer binary bits

Why Use Rice Coding

  • Adaptive to data distribution
  • Simple to implement
  • Fast encode/decode
  • Good for sensor data
  • Efficient for small integers

>> frequently asked questions

What is Rice coding?

Rice coding is a variable-length entropy encoding method that's particularly efficient for geometric distributions. It's a special case of Golomb coding where the divisor M is restricted to powers of 2 (M = 2^k), making it faster to compute using bit shifts.

How to choose the k parameter?

The optimal k depends on your data distribution. For data with mean μ, the optimal k ≈ log₂(μ × ln(2)). Small k (0-2) works well for very small numbers, while larger k (4-8) is better for data with larger values. Use the analyze function to find the optimal k for your data.

Rice vs Golomb coding?

Rice coding is a subset of Golomb coding where M = 2^k. This restriction makes Rice coding faster (using bit shifts instead of division) but potentially less optimal. Golomb can choose any M value for better compression, while Rice trades some compression efficiency for speed.

Where is Rice coding used?

Rice coding is widely used in lossless audio compression (FLAC, ALAC), image compression (JPEG-LS), and data from sensors with geometric distributions. It's particularly effective when data values are small non-negative integers with exponentially decreasing probability.

Rice coding in FLAC audio — how does it actually compress?

<strong>FLAC</strong> (Free Lossless Audio Codec) is the canonical Rice-coding success story. The pipeline:<br>1. Split audio into small blocks (typically 4096 samples).<br>2. Fit a <strong>linear predictor</strong> (LPC) to each block; subtract the prediction from the actual samples to get <strong>residuals</strong>.<br>3. Residuals have a roughly <strong>Laplace / geometric distribution</strong> — peaked at 0, tails falling off exponentially. Exactly what Rice coding is optimal for.<br>4. Choose an adaptive <code>k</code> parameter per partition (<code>k = round(log2(mean / log(2)))</code>).<br>5. Encode each residual with Rice(<code>k</code>): unary quotient + <code>k</code>-bit remainder.<br><br>For well-predicted audio, most residuals are small integers (±1 to ±64). Rice encodes these in 5-8 bits instead of the 16-24 bits the raw sample would need. Compression ratio: 50-60% of original, fully lossless. ALAC (Apple Lossless) uses a near-identical scheme.

How do I map signed integers into Rice coding?

Rice coding natively handles only non-negative integers. For signed data (like audio residuals which can be negative), use <strong>ZigZag encoding</strong> or the equivalent sign/magnitude interleave first:<br><code>zigzag(n) = (n &lt;&lt; 1) ^ (n &gt;&gt; 31)</code> (for 32-bit)<br>This maps:<br>• 0 → 0<br>• -1 → 1<br>• 1 → 2<br>• -2 → 3<br>• 2 → 4<br>• -3 → 5<br>• ...<br>Small-magnitude negatives and positives both map to small non-negative integers, preserving the geometric distribution. After Rice decoding, reverse with <code>(z &gt;&gt; 1) ^ -(z &amp; 1)</code>. This is exactly how <strong>Protocol Buffers</strong> encode signed varints, and how FLAC and JPEG-LS handle negative residuals.

Rice vs Elias gamma/delta vs varint — which integer code wins?

All are prefix-free codes for non-negative integers, but they optimize for different distributions:<br>• <strong>Rice</strong>: <code>k</code>-tunable. Optimal for geometric distributions with mean = <code>2^k</code>. <em>Best when you can tune k to match the data.</em><br>• <strong>Elias gamma</strong>: <code>⌊log2(n)⌋+1</code> unary + <code>⌊log2(n)⌋</code> binary bits. <em>Best when distribution is heavy-tailed (many small + few huge values).</em><br>• <strong>Elias delta</strong>: gamma-encoded length + binary body. <em>Best for very heavy tails.</em><br>• <strong>Varint (LEB128)</strong>: 7 data bits per byte + continuation bit. <em>Best for bytewise alignment; used in Protobuf, MIDI, DWARF.</em><br>• <strong>Exp-Golomb</strong>: Generalized Elias gamma with parameter <code>k</code>. Used in H.264/H.265 for syntax elements.<br><br>Rule of thumb: tune <strong>Rice</strong> when you know your data's mean; use <strong>Elias gamma</strong> when you don't; use <strong>varint</strong> when you need bytewise alignment over bitwise packing.