The SPECIALIST Lexicon

Dash-Space Spelling Variants

I. Introduction

Dash-space spVars are among the most common spelling variants (spVars). As a subset of spVars, they must meet the spVar criteria (same meaning, POS, syntax, and pronunciation) and also match one of the dash-space patterns:

  • Dash-space patterns:
    • [xxx-yyy] and [xxx yyy]
    • [xxx-yyy] and [xxxyyy]
    • [xxx yyy] and [xxxyyy]
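
The three patterns above can be illustrated with a small Python sketch. It is not part of the Lexicon tools: the function name `dash_space_pattern` and the returned labels are hypothetical, and it assumes that every dash (or every space) in a term is replaced at once, which the pattern notation does not spell out. It checks only the surface pattern, not the other spVar criteria.

```python
from typing import Optional


def dash_space_pattern(term_a: str, term_b: str) -> Optional[str]:
    """Return a label for the dash-space pattern linking two terms, or None.

    Assumption: all dashes (or all spaces) in a term are replaced together.
    """
    if term_a == term_b:
        return None
    # Try both orders so the caller does not have to sort the terms.
    for x, y in ((term_a, term_b), (term_b, term_a)):
        if "-" in x:
            if x.replace("-", " ") == y:
                return "dash/space"      # [xxx-yyy] and [xxx yyy]
            if x.replace("-", "") == y:
                return "dash/closed"     # [xxx-yyy] and [xxxyyy]
        if " " in x and x.replace(" ", "") == y:
            return "space/closed"        # [xxx yyy] and [xxxyyy]
    return None


print(dash_space_pattern("joint-ill", "joint ill"))
print(dash_space_pattern("anti-treponemal", "antitreponemal"))
print(dash_space_pattern("airbed", "air bed"))
```

A match here only makes a pair a candidate. Whether the two terms are true spVars still depends on meaning, POS, syntax, and pronunciation.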

II. Algorithm

  • Dash-space spVars are identified by the SpVar model. Some of them are missing from the Lexicon by mistake. A program was developed to retrieve terms that match these patterns but are not recorded as spVars in the Lexicon (false positives from SpVarNorm), in order to enhance the SpVarNorm algorithm.
  • Dash Pattern
    • Match the pattern of [xxx-yyy] to [xxx yyy] or [xxxyyy]
    • Exclude if the term after the last '-' is a preposition in Lexicon
    • Must have same POS
    • Exclude duplicates from the same EUIs (duplicated by inflections)
  • Space Pattern
    • Match the pattern of [xxx yyy] to [xxxyyy]
    • Exclude if the last word is a preposition in Lexicon
    • Must have same POS
    • Exclude duplicates from the same EUIs (duplicated by inflections)
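
The dash and space rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual program: the function `find_candidates`, the `(eui, term, pos)` record layout, and the `prepositions` set are hypothetical stand-ins for Lexicon data. The sketch also assumes that "duplicates from the same EUIs" means one EUI pair reached several times through inflected forms, and that two terms already sharing an EUI are skipped because they are already recorded as variants of one record.

```python
from collections import defaultdict


def find_candidates(records, prepositions):
    """Return candidate pairs as (pos, eui_1, term_1, eui_2, term_2) tuples.

    records: list of (eui, term, pos) tuples, inflected forms included.
    prepositions: set of words treated as prepositions (hypothetical input).
    """
    by_key = defaultdict(set)  # (term, pos) -> EUIs that contain the term
    for eui, term, pos in records:
        by_key[(term, pos)].add(eui)

    seen_eui_pairs = set()
    candidates = []
    for eui, term, pos in records:
        if "-" in term:
            # Dash pattern: skip if the part after the last '-' is a preposition.
            if term.rsplit("-", 1)[1] in prepositions:
                continue
            variants = (term.replace("-", " "), term.replace("-", ""))
        elif " " in term:
            # Space pattern: skip if the last word is a preposition.
            if term.rsplit(" ", 1)[1] in prepositions:
                continue
            variants = (term.replace(" ", ""),)
        else:
            continue
        for variant in variants:
            # Looking up (variant, pos) enforces the same-POS rule.
            for other_eui in sorted(by_key.get((variant, pos), ())):
                if other_eui == eui or (eui, other_eui) in seen_eui_pairs:
                    continue  # same record, or pair already found via an inflection
                seen_eui_pairs.add((eui, other_eui))
                candidates.append((pos, eui, term, other_eui, variant))
    return candidates


# Base forms and EUIs marked "source" appear in the examples on this page;
# the inflected forms and the X-prefixed EUIs are made up for illustration.
records = [
    ("E0220762", "air bed", "noun"),    # source
    ("E0220762", "air beds", "noun"),   # made-up inflection
    ("E0007871", "airbed", "noun"),     # source
    ("E0007871", "airbeds", "noun"),    # made-up inflection
    ("E0342133", "joint-ill", "noun"),  # source
    ("E0214676", "joint ill", "noun"),  # source
    ("X0000001", "check-up", "noun"),   # made-up record
    ("X0000002", "checkup", "noun"),    # made-up record
]
pairs = find_candidates(records, prepositions={"up", "in", "on", "off"})
for pair in pairs:
    print("|".join(pair))
```

With this sample data the air bed/airbed pair is reported once even though its plural forms also match, and check-up is dropped because "up" is in the preposition set.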

III. Studies on Lexicon.2015

  • Dash Pattern:
    • 233 pairs on Lexicon.2015 match the dash spVar pattern from SpVarNorm (false positives)
    • They are sent to a linguist, who tags each pair [Y|N] as a valid or invalid spVar, in the following format:
      POS|EUI-1|Term-1|EUI-2|Term-2|Tag
    • The linguist combines EUI-1 and EUI-2 if the tag is [Y]
    • Examples:
      	noun|E0356150|anti-treponemal|E0009764|antitreponemal|y
      	noun|E0316451|gastro-cote|E0309756|gastrocote|n
      	noun|E0342133|joint-ill|E0214676|joint ill|y
      	noun|E0228131|mule-foot|E0228130|mule foot|n
      	noun|E0588600|re-flex|E0052428|reflex|n
      	verb|E0053396|re-form|E0052452|reform|n
      	verb|E0484710|re-present|E0052856|represent|n
      	noun|E0065691|writing-paper|E0339084|writing paper|y
      	noun|E0438345|yo-yo|E0686155|yoyo|n
      	...
      	

  • Space Pattern:
    • 58 pairs on Lexicon.2015 match the space spVar pattern from SpVarNorm (false positives)
    • They are sent to a linguist, who tags each pair [Y|N] as a valid or invalid spVar, in the following format:
      POS|EUI-1|Term-1|EUI-2|Term-2|Tag
    • The linguist combines EUI-1 and EUI-2 if the tag is [Y] (40)
    • Examples:
      	noun|E0220762|air bed|E0007871|airbed|Y
      	noun|E0525866|art glass|E0525865|artglass|N
      	noun|E0347021|bush dog|E0228170|bushdog|Y
      	noun|E0565815|crab tree|E0565814|crabtree|N
      	noun|E0509111|green stone|E0509110|greenstone|N
      	noun|E0227977|winter green|E0070850|wintergreen|N
      	...
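
As an illustration of how the tagged records could be consumed, here is a minimal Python sketch that parses the pipe-delimited lines shown in the examples and lists the EUI pairs to combine. The names `TaggedPair` and `parse_tagged_pair` are hypothetical, and the sketch assumes the tag is case-insensitive, since the examples use both y/n and Y/N.

```python
from typing import NamedTuple


class TaggedPair(NamedTuple):
    pos: str
    eui_1: str
    term_1: str
    eui_2: str
    term_2: str
    valid: bool  # True when the linguist tagged the pair Y


def parse_tagged_pair(line: str) -> TaggedPair:
    """Parse one 'POS|EUI-1|Term-1|EUI-2|Term-2|Tag' line."""
    # Unpacking raises ValueError if the line does not have exactly six fields.
    pos, eui_1, term_1, eui_2, term_2, tag = line.strip().split("|")
    tag = tag.upper()  # assumption: 'y' and 'Y' mean the same thing
    if tag not in ("Y", "N"):
        raise ValueError(f"unexpected tag in line: {line!r}")
    return TaggedPair(pos, eui_1, term_1, eui_2, term_2, tag == "Y")


# Sample lines copied from the examples on this page.
lines = [
    "noun|E0220762|air bed|E0007871|airbed|Y",
    "noun|E0525866|art glass|E0525865|artglass|N",
    "noun|E0342133|joint-ill|E0214676|joint ill|y",
]
parsed = [parse_tagged_pair(line) for line in lines]

# EUI pairs that would be combined because the tag is Y.
to_merge = [(p.eui_1, p.eui_2) for p in parsed if p.valid]
print(to_merge)
```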