// Comma Code - Self-delimiting binary codes with automatic boundaries
No separators needed between consecutive codes.
Unary length followed by data bits.
Taboo variant avoids specific bit patterns.
Comma code encodes a positive integer n with bit length L in two steps: 1) write L-1 in unary (L-1 ones followed by a zero), 2) append the binary representation of n without its leading 1. The taboo variant modifies the encoding to avoid patterns such as '11', which is useful on certain communication channels.
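The two steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name comma_encode and the use of a string of '0'/'1' characters are assumptions made for readability. It applies the rule literally for n >= 1, so 1 comes out as the single bit '0'; zero is rejected here because the rule as stated gives it no codeword.

```python
def comma_encode(n: int) -> str:
    """Encode a positive integer as a unary length prefix plus data bits."""
    if n < 1:
        # Assumption: zero and negatives are out of scope for this sketch.
        raise ValueError("this sketch only handles n >= 1")
    binary = bin(n)[2:]                 # e.g. 5 -> '101'
    length = len(binary)                # L, the bit length of n
    prefix = "1" * (length - 1) + "0"   # L-1 ones, then the terminating zero
    return prefix + binary[1:]          # data bits without the leading 1


for n in range(1, 6):
    print(n, comma_encode(n))
```

Because every codeword carries its own length, codewords for several integers can simply be joined, for example "".join(comma_encode(n) for n in (2, 3)).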
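Basic Comma Code:
  1 → 0 (0 ones + 0 + empty)
  2 → 100 (1 one + 0 + '0')
  3 → 101 (1 one + 0 + '1')
  4 → 11000 (2 ones + 0 + '00')
  5 → 11001 (2 ones + 0 + '01')
Zero has no codeword of its own under this rule: the single bit '0' already encodes 1, so giving 0 the same codeword (or giving 1 the codeword '01') would make the code ambiguous. An encoder that must handle zero can encode n + 1 instead.
Concatenated: 1,2,3 → 0 100 101 → 0100101
Self-delimiting: the stream can be decoded without separators.
Taboo variant: uses a different encoding that prevents consecutive 1s, so the pattern '11' is avoided.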
Comma code is a self-delimiting binary code that encodes an integer as its bit length minus one in unary, followed by its data bits. It's called 'comma' because codes can be concatenated without explicit separators, like items in a list.
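The unary length prefix tells the decoder exactly how many data bits follow. When you see k ones followed by a zero, you know to read exactly k more bits; putting the implicit leading 1 back in front of those bits gives the value. The next code starts immediately after, which is why multiple codes can be concatenated without separators.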
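The decoding rule can be sketched as a loop over a bit string. The name comma_decode_stream and the string-of-characters input are illustrative assumptions, and the sketch assumes every value in the stream is at least 1 (a lone '0' decodes to 1).

```python
from typing import List


def comma_decode_stream(bits: str) -> List[int]:
    """Decode a concatenation of comma codes into a list of integers."""
    values = []
    i = 0
    while i < len(bits):
        # Count the ones in the unary prefix.
        k = 0
        while i < len(bits) and bits[i] == "1":
            k += 1
            i += 1
        if i >= len(bits):
            raise ValueError("truncated prefix: missing terminating zero")
        i += 1  # skip the terminating zero
        if i + k > len(bits):
            raise ValueError("truncated data bits")
        data = bits[i:i + k]  # exactly k data bits follow the prefix
        i += k
        values.append(int("1" + data, 2))  # restore the implicit leading 1
    return values


print(comma_decode_stream("100101"))      # codes for 2 and 3 back to back
print(comma_decode_stream("1100011001"))  # codes for 4 and 5 back to back
```

The decoder never needs to look ahead or backtrack: the prefix fixes the codeword length before any data bit is read.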
Taboo comma code modifies the encoding to avoid certain bit patterns (like '11'). This is useful in channels where specific patterns cause problems or have special meanings, such as synchronization markers.
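To see why a variant is needed, it helps to check where the taboo pattern shows up in the basic code. The snippet below is only a diagnostic sketch, not the taboo encoding itself (which is not specified here); the helper name contains_taboo is an assumption. It uses the basic codewords for 2 through 5 and also shows that the pattern can appear across a codeword boundary even when neither codeword contains it.

```python
def contains_taboo(bits: str, taboo: str = "11") -> bool:
    """Return True if the taboo pattern occurs anywhere in the bit string."""
    return taboo in bits


# Basic comma codewords for 2..5, taken from the examples in the text.
codewords = {2: "100", 3: "101", 4: "11000", 5: "11001"}
for n, codeword in codewords.items():
    print(n, codeword, contains_taboo(codeword))

# Neither '101' nor '100' contains '11', but their concatenation does:
# the last bit of the first codeword meets the first bit of the second.
print(contains_taboo("101" + "100"))
```

Any scheme that avoids a pattern on the channel therefore has to consider the joined stream, not just individual codewords.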
Comma codes are used in data compression, network protocols, and storage systems where self-delimiting properties are valuable. They're particularly useful when you need to store multiple variable-length integers without length fields.