Lexical Tools

LVG Metaphone Algorithm

Introduction:

The Lexical Tools use the "Metaphone" phonetic code algorithm by Lawrence Philips ("Hanging on the Metaphone", Computer Language v7n12, December 1990, pp. 39-43). An input term is reduced to a code of 1 to 6 characters (the maximum length is configurable) using relatively simple phonetic rules for typical spoken English.

Basically, Metaphone reduces the alphabet to 16 consonant sounds:
B F H J K L M N P R S T W X Y 0 (zero)
where the 0 (zero) represents the 'th' sound.

Input:

  • inStr: an input string.
  • maxCodeLength: the maximum length of the output code; a longer code is truncated to this length.

Output: the phonetic code, as an uppercase string.

Algorithm:

I. Pre-Process

| Step description | Conditions (if) | Actions (then) |
| --- | --- | --- |
| Initial check | inStr is null; length of inStr == 0 | return "" |
| Drop non-alphabetic characters | character is not alphabetic | drop the character |
| Quick check | length of current string is 0 | return "" |
| Quick check | length of current string is 1 | return current string |
| Uppercase | (always) | uppercase current string |
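
The pre-processing steps above can be sketched in Python as follows. This is a minimal illustration, not the LVG implementation: the function name `pre_process`, the `(text, done)` return shape, and the restriction to ASCII letters are assumptions made for the example. It follows the listed order, so a single remaining letter is returned before the uppercase step runs.

```python
from typing import Optional, Tuple


def pre_process(in_str: Optional[str]) -> Tuple[str, bool]:
    """Return (text, done); done=True means text is already the final result."""
    # Initial check: a null or empty input yields an empty code.
    if not in_str:
        return "", True
    # Drop non-alphabetic characters (assumption: ASCII letters only).
    letters = "".join(ch for ch in in_str if ch.isascii() and ch.isalpha())
    # Quick checks: nothing left, or a single letter returned as it stands.
    if len(letters) == 0:
        return "", True
    if len(letters) == 1:
        return letters, True
    # Uppercase what remains so the later rules can compare against A-Z.
    return letters.upper(), False
```

For instance, `pre_process("Knight-42")` returns `("KNIGHT", False)`, while `pre_process("123")` returns `("", True)`.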

II. Initial Letter Exceptions

| Step description | Conditions (if) | Actions (then) |
| --- | --- | --- |
| Initial Exception-I | string begins with KN, GN, PN, AE, or WR | drop the first character |
| Initial Exception-II | string begins with X | change X to S |
| Initial Exception-III | string begins with WH | change WH to W |
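
A short Python sketch of the three initial-letter exceptions is given here for illustration. The function name `apply_initial_exceptions` is hypothetical, and the sketch assumes the input has already been stripped of non-letters and uppercased; since the listed prefixes cannot overlap, at most one exception applies.

```python
def apply_initial_exceptions(text: str) -> str:
    """Apply the initial-letter exceptions to an uppercased string."""
    # Exception-I: the first letter of KN, GN, PN, AE, WR is dropped.
    if text.startswith(("KN", "GN", "PN", "AE", "WR")):
        return text[1:]
    # Exception-II: a leading X is changed to S.
    if text.startswith("X"):
        return "S" + text[1:]
    # Exception-III: a leading WH is changed to W.
    if text.startswith("WH"):
        return "W" + text[2:]
    return text
```

Under these assumptions, `apply_initial_exceptions("KNIGHT")` returns `"NIGHT"` and `apply_initial_exceptions("WHITE")` returns `"WITE"`.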

III. Transformation by looping through the string

| Step description | Conditions (if) | Actions (then) |
| --- | --- | --- |
| Doubled letters rules | doubled letter and the letter is not C | drop the 2nd letter |
| Vowel letters rules | vowel is not the first letter | drop the letter |
| B Rules | at the end of the word and after M (MB) | map to B |
| C Rules-I | before IA (CIA); before H and not after S (CH, not SCH) | map to X |
| C Rules-II | before FrontV (CE, CI, CY) | map to S |
| C Rules-III | other than the above two C rules | map to K |
| D Rules-I | before G plus FrontV (DGE, DGI, DGY) | map to J |
| D Rules-II | other than the above D rule | map to T |
| G Rules-I | before N (GN); before NED (GNED); after D and before FrontV (DGE, DGI, DGY); before H (GH) and (at the end or before a consonant) | drop the letter |
| G Rules-II | before FrontV (GE, GI, GY) | map to J |
| G Rules-III | other than the above G rules | map to K |
| H Rules-I | after a Vowel; after a Varson; before a consonant (not a vowel) | drop the letter |
| H Rules-II | other than the above H rules | map to H |
| K Rules-I | after C (CK) | drop the letter |
| K Rules-II | other than the above K rule | map to K |
| P Rules-I | before H (PH) | map to F |
| P Rules-II | other than the above P rule | map to P |
| Q Rules | all cases | map to K |
| S Rules-I | before H (SH); before IA (SIA); before IO (SIO) | map to X |
| S Rules-II | other than the above S rules | map to S |
| T Rules-I | before IA (TIA); before IO (TIO) | map to X |
| T Rules-II | before H (TH) | map to 0 |
| T Rules-III | before CH (TCH) | drop the letter |
| T Rules-IV | other than the above T rules | map to T |
| V Rules | all cases | map to F |
| W Rules-I | before a consonant (not a vowel) | drop the letter |
| W Rules-II | before a vowel | map to W |
| X Rules | all cases | map to KS |
| Y Rules-I | before a consonant (not a vowel) | drop the letter |
| Y Rules-II | before a vowel | map to Y |
| Z Rules | all cases | map to S |
| Defaults | all other cases (F, J, L, M, N, R, ...) | no change (mapped to itself) |

Letter classes used in the conditions:

  • Vowels: A, E, I, O, U
  • FrontV: E, I, Y
  • Varson: C, G, P, S, T
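
The transformation rules above can be sketched as a single pass over the string. The Python below is an illustrative reading of the table, not the LVG source code, and it rests on several assumptions: the function name `transform` is hypothetical; the input is already pre-processed, uppercased, and past the initial-letter exceptions; multiple conditions listed for one rule are treated as alternatives; "before a consonant" is read as "not followed by a vowel", which includes the end of the word; and truncation to `max_code_length` (default 6) happens at the very end. The B row is followed as written, so B is always kept; Philips's original description treats the B of a final MB as silent.

```python
VOWELS = set("AEIOU")
FRONTV = set("EIY")
VARSON = set("CGPST")
FIXED = {"Q": "K", "V": "F", "X": "KS", "Z": "S"}  # the "all cases" rows


def transform(text: str, max_code_length: int = 6) -> str:
    """Map an uppercased, pre-processed string to its code, letter by letter."""
    out = []
    for i, ch in enumerate(text):
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1:i + 2]    # next letter, "" at the end of the word
        nxt2 = text[i + 2:i + 3]   # letter after next, "" if there is none
        pair = nxt + nxt2
        if ch == prev and ch != "C":       # doubled letters (except CC)
            continue
        if ch in VOWELS:                   # vowels survive only in first position
            sound = ch if i == 0 else ""
        elif ch == "C":
            if pair == "IA" or (nxt == "H" and prev != "S"):
                sound = "X"
            else:
                sound = "S" if nxt in FRONTV else "K"
        elif ch == "D":
            sound = "J" if nxt == "G" and nxt2 in FRONTV else "T"
        elif ch == "G":
            silent = (nxt == "N"
                      or (prev == "D" and nxt in FRONTV)
                      or (nxt == "H" and nxt2 not in VOWELS))
            sound = "" if silent else ("J" if nxt in FRONTV else "K")
        elif ch == "H":
            # Assumption: the three H Rules-I conditions are alternatives.
            silent = prev in VOWELS or prev in VARSON or nxt not in VOWELS
            sound = "" if silent else "H"
        elif ch == "K":
            sound = "" if prev == "C" else "K"
        elif ch == "P":
            sound = "F" if nxt == "H" else "P"
        elif ch == "S":
            sound = "X" if nxt == "H" or pair in ("IA", "IO") else "S"
        elif ch == "T":
            if pair in ("IA", "IO"):
                sound = "X"
            elif nxt == "H":
                sound = "0"
            else:
                sound = "" if pair == "CH" else "T"
        elif ch in "WY":
            sound = ch if nxt in VOWELS else ""
        else:
            # Q, V, X, Z use fixed mappings; B, F, J, L, M, N, R map to themselves.
            sound = FIXED.get(ch, ch)
        out.append(sound)
    return "".join(out)[:max_code_length]
```

Under these assumptions, `transform("THOMAS")` gives `0MS`, `transform("SCHOOL")` gives `SKL`, and `transform("NATION")` gives `NXN`.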

References