// NYSIIS - High-accuracy phonetic encoding for name matching and deduplication
More accurate than Soundex for general name matching.
Generates consistent phonetic codes of up to 6 characters.
Used by NY State criminal justice systems.
NYSIIS (New York State Identification and Intelligence System) is a phonetic encoding algorithm developed in 1970 by Robert L. Taft. It improves upon Soundex by using more sophisticated rules for handling name variations. The algorithm applies a series of transformation rules to convert names into alphabetic phonetic codes, with special handling for common prefixes, suffixes, and letter combinations found in surnames. The original specification limits codes to six characters; implementations that skip this final truncation can return longer codes.
Name transformations:
Johnson → JANSAN
Jonsen → JANSAN
Jensen → JANSAN
Williams → WALAN
Wiliams → WALAN
Willems → WALAN
Special cases:
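MacDonald → MCDANALD (MCDANA when truncated to 6 characters)
Knudsen → NADSAN
Schmidt → SNAD
Phillips → FALAP
(Repeated letters collapse in the final code, so the doubled or tripled letters produced by the prefix rules appear only once; the DT ending in Schmidt becomes D, and the trailing S in Phillips is dropped.)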
Key rules:
- MAC → MCC
- KN → NN
- PH → FF
- SCH → SSS
- Vowels → A
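The rules above can be illustrated with a minimal Python sketch. It assumes the commonly published NYSIIS rule set (prefix and suffix substitutions, vowel folding, collapsing of repeated letters). The function name `nysiis` and the optional `max_len` parameter are illustrative and not taken from any particular library, and real implementations differ in details such as whether codes are truncated.

```python
from typing import Optional

VOWELS = set("AEIOU")
PREFIXES = (("MAC", "MCC"), ("KN", "NN"), ("K", "C"),
            ("PH", "FF"), ("PF", "FF"), ("SCH", "SSS"))


def nysiis(name: str, max_len: Optional[int] = None) -> str:
    """Return a NYSIIS-style key (simplified sketch, not a reference implementation)."""
    # Keep ASCII letters only, uppercased.
    s = "".join(ch for ch in name.upper() if "A" <= ch <= "Z")
    if not s:
        return ""

    # Prefix rules: only the first matching prefix is rewritten.
    for old, new in PREFIXES:
        if s.startswith(old):
            s = new + s[len(old):]
            break

    # Suffix rules.
    if s.endswith(("EE", "IE")):
        s = s[:-2] + "Y"
    elif s.endswith(("DT", "RT", "RD", "NT", "ND")):
        s = s[:-2] + "D"

    key = s[0]  # the first letter is kept as-is
    i = 1
    while i < len(s):
        ch, step = s[i], 1
        nxt = s[i + 1] if i + 1 < len(s) else ""
        if s.startswith("EV", i):
            out, step = "AF", 2
        elif ch in VOWELS:
            out = "A"
        elif ch in "QZM":
            out = {"Q": "G", "Z": "S", "M": "N"}[ch]
        elif s.startswith("KN", i):
            out, step = "N", 2
        elif ch == "K":
            out = "C"
        elif s.startswith("SCH", i):
            out, step = "SSS", 3
        elif s.startswith("PH", i):
            out, step = "FF", 2
        elif ch == "H" and (s[i - 1] not in VOWELS or nxt not in VOWELS):
            out = key[-1]  # repeats the previous letter, so it is dropped below
        elif ch == "W" and s[i - 1] in VOWELS:
            out = key[-1]  # same: W after a vowel is absorbed
        else:
            out = ch
        # Append while collapsing consecutive duplicates.
        for c in out:
            if c != key[-1]:
                key += c
        i += step

    # Final clean-up of the key's ending.
    if len(key) > 1 and key.endswith("S"):
        key = key[:-1]
    if key.endswith("AY"):
        key = key[:-2] + "Y"
    if len(key) > 1 and key.endswith("A"):
        key = key[:-1]

    # Assumption: truncation is optional; pass max_len=6 for the classic limit.
    return key[:max_len] if max_len else key
```

With this sketch, `nysiis("Johnson")`, `nysiis("Jonsen")` and `nysiis("Jensen")` all return `JANSAN`, the three Williams spellings return `WALAN`, and `nysiis("MacDonald", max_len=6)` returns `MCDANA`.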
NYSIIS (New York State Identification and Intelligence System) is a phonetic encoding algorithm developed in 1970 for the New York State criminal justice system. It's designed to match surnames that sound similar but have different spellings.
NYSIIS is generally more accurate than Soundex for name matching. It uses more sophisticated rules, handles more edge cases, and produces an all-letter code of up to 6 characters, whereas Soundex produces a fixed 4-character code made of one letter followed by three digits. NYSIIS is commonly cited as improving accuracy by about 2.7% over Soundex.
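To make the difference in code format concrete, here is a minimal sketch of American Soundex, included only for comparison. The function name `soundex` is illustrative, and the rule set is the commonly described one, in which H and W do not separate letters that share a digit.

```python
# Consonant groups and their Soundex digits.
_GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in _GROUPS.items() for letter in letters}


def soundex(name: str) -> str:
    """Return a 4-character Soundex code (simplified sketch)."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    out = letters[0]                   # first letter is kept verbatim
    prev = CODES.get(letters[0], "")   # its digit still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not break a run of equal digits
        digit = CODES.get(ch, "")      # vowels and Y map to "" and reset the run
        if digit and digit != prev:
            out += digit
        prev = digit
    return (out + "000")[:4]           # pad with zeros, then cut to 4 characters
```

With this sketch, `soundex("Johnson")` and `soundex("Jensen")` both return `J525`, and `soundex("Williams")` returns `W452`.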
Modified NYSIIS is an improved version that handles additional edge cases and provides better accuracy for certain name patterns. It includes refined rules for vowel handling and improved processing of certain consonant clusters.
NYSIIS is widely used in criminal justice systems, healthcare record matching, genealogy research, and any application requiring accurate name deduplication. It's particularly effective for matching Anglo-Saxon and European surnames.
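For deduplication, one common pattern is to group records by phonetic key and then compare only the names within each group. The sketch below assumes the keys have already been computed (the name and code pairs reuse examples from this page); `group_by_key` is a hypothetical helper name, not a library function.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_key(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group names by their precomputed phonetic key."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, key in pairs:
        groups[key].append(name)
    return dict(groups)


# (name, NYSIIS code) pairs, assumed to come from an encoder run beforehand.
encoded = [
    ("Johnson", "JANSAN"), ("Jonsen", "JANSAN"), ("Jensen", "JANSAN"),
    ("Williams", "WALAN"), ("Wiliams", "WALAN"), ("Willems", "WALAN"),
]

# Keys shared by more than one name are candidate duplicates for closer review.
candidates = {k: v for k, v in group_by_key(encoded).items() if len(v) > 1}
```

A shared key only marks names as candidates; a second check (exact fields, edit distance, or manual review) is usually still needed before merging records.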