
The SPECIALIST Lexicon

Parenthetic Acronym Pattern

I. Introduction
The Parenthetic Acronym Pattern covers the patterns (ACRONYM) and (ACRONYMs). It is used as an exclusive filter to remove invalid MWEs from the n-gram set. It can also be used as an inclusive filter to identify MWE candidates.
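As an illustration, the pattern can be approximated with a regular expression. This is a minimal sketch, not the program's actual rule: the names PAR_ACR and matches_par_acr are hypothetical, and it assumes an acronym is a run of uppercase ASCII letters with an optional trailing plural "s". That reading is consistent with the examples on this page, where (BEN) and (ZPA) match while (2013) and (ZO-1) do not.

```python
import re

# Assumed reading of the pattern: "(" + uppercase letters + optional "s" + ")".
# The real 6.Acronyms program may apply additional rules.
PAR_ACR = re.compile(r"\(([A-Z]+)s?\)")


def matches_par_acr(ngram: str) -> bool:
    """Return True if the n-gram contains an (ACRONYM) or (ACRONYMs) pattern."""
    return PAR_ACR.search(ngram) is not None


if __name__ == "__main__":
    for ngram in [
        "Balkan endemic nephropathy (BEN)",
        "zone of polarizing activity (ZPA)",
        "& Systems Pharmacology (2013)",
        "zonula occludens-1 (ZO-1)",
    ]:
        print(ngram, "->", matches_par_acr(ngram))
```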

II. Processes:

  • directory: ${MULTIWORDS_DIR}/bin
  • program: 6.Acronyms
  • Run program: shell> 6.Acronyms ${YEAR}
  • Processes:

    Step 1: Get n-grams that match the (ACR) pattern
    Inputs:
    • nGram.2014
    Outputs:
    • ApplyFilters.rpt.1.parAcr.trap
    • ApplyFilters.rpt.1.parAcr.exp
    • ApplyFilters.rpt
    Notes - Examples:
    trap - matches the (ACR) pattern:
    • Balkan endemic nephropathy (BEN)
    • zone of polarizing activity (ZPA)
    exp - does not match the (ACR) pattern:
    • & Systems Pharmacology (2013)
    • zonula occludens-1 (ZO-1)

    Step 2: Get acronym|expansion from step 1
    Inputs:
    • ApplyFilters.rpt.1.parAcr.trap
    Outputs:
    • acronyms.txt
    Notes - Examples:
    Convert each n-gram from the format
    ".. acronym expansion (ACR) .." to
    "ACR|acronym expansion"

    Step 3: Get new acronym|expansion (not in Lexicon) from step 2
    Inputs:
    • acronyms.txt
    Outputs:
    • acronyms.txt.pass
    • acronyms.txt.trap
    Notes - Examples:
    trap - in the Lexicon:
    • SSS|Stanford Sleepiness Scale
    • WHO|World Health Organisation
    pass - new acronyms (candidates):
    • BLS|Bureau of Labor Statistics
    • MH|World Mental Health
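
Steps 2 and 3 can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the 6.Acronyms implementation: the function names are hypothetical, the regular expression is an assumed reading of the (ACR) pattern, everything before the parenthetical is taken as the expansion (the real program presumably trims the surrounding n-gram text more carefully), and the Lexicon is stood in for by an in-memory set of "ACR|expansion" strings.

```python
import re

# Assumed reading of the (ACR) pattern; see the hedging in the text above.
PAR_ACR = re.compile(r"\(([A-Z]+)s?\)")


def to_acronym_record(ngram):
    """Step 2 sketch: turn '.. expansion (ACR) ..' into 'ACR|expansion'.

    Simplification: all text preceding the parenthetical is used as the
    expansion. Returns None when the n-gram has no (ACR) pattern or no
    preceding text.
    """
    match = PAR_ACR.search(ngram)
    if match is None:
        return None
    expansion = ngram[: match.start()].strip()
    if not expansion:
        return None
    return f"{match.group(1)}|{expansion}"


def split_by_lexicon(records, lexicon):
    """Step 3 sketch: separate new records (pass) from known ones (trap)."""
    passed, trapped = [], []
    for record in records:
        if record in lexicon:
            trapped.append(record)  # already in the Lexicon
        else:
            passed.append(record)   # new acronym candidate
    return passed, trapped


if __name__ == "__main__":
    # Hypothetical inputs, built from the output examples in the steps above.
    ngrams = [
        "Stanford Sleepiness Scale (SSS)",
        "Bureau of Labor Statistics (BLS)",
    ]
    lexicon = {"SSS|Stanford Sleepiness Scale"}  # stand-in for the Lexicon
    records = [r for r in map(to_acronym_record, ngrams) if r is not None]
    passed, trapped = split_by_lexicon(records, lexicon)
    print("pass:", passed)
    print("trap:", trapped)
```

Keeping the Lexicon lookup as a plain set membership test keeps the sketch short; a real lookup would likely need case and spelling-variant handling.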

III. Results:
For 2014 release:

  1. 17023819 n-grams
  2. 163714 n-grams match the (ACR) pattern
  3. 1646 are identified as valid acronym|expansion pairs
  4. 636 are new and are used as multiword candidates to add to the Lexicon
  5. Tags:
    Tag   Description                                     Count
    O     Invalid expansion                               TBD
    E     Valid expansion; exists in Lexicon              TBD
    Y     Valid expansion; not in Lexicon; valid MWE      TBD
    N     Valid expansion; not in Lexicon; invalid MWE    TBD
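
To put the 2014 counts in items 1 to 4 in proportion, the share retained at each stage can be computed directly. This is only an arithmetic sketch over the numbers reported above; the variable and function names are hypothetical.

```python
# Counts reported for the 2014 release (items 1 to 4 above).
TOTAL_NGRAMS = 17023819
PATTERN_MATCHES = 163714
VALID_PAIRS = 1646
NEW_CANDIDATES = 636


def share(part: int, whole: int) -> float:
    """Return part / whole as a fraction."""
    return part / whole


if __name__ == "__main__":
    stages = [
        ("n-grams matching (ACR)", PATTERN_MATCHES, TOTAL_NGRAMS),
        ("valid acronym|expansion among matches", VALID_PAIRS, PATTERN_MATCHES),
        ("new candidates among valid pairs", NEW_CANDIDATES, VALID_PAIRS),
    ]
    for label, part, whole in stages:
        print(f"{label}: {share(part, whole):.2%}")
```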