
The SPECIALIST Lexicon

Test Lead-End-Unit in Exclusive Filter

I. Introduction

This section describes the testing processes and results for the exclusive filters on nonLead and nonEnd units. The Lexicon multiwords are used as the test set. Ideally, the filters should not filter out any multiwords in the Lexicon (or only a minimal number).

TBD:

  • The Java code for this section has been moved to TBD because some files are still under development. We will need to come back to complete this section due to time constraints.

II. Processes

  • directory: ${MULTIWORDS_DIR}/bin
  • program: 3.NonLeadEndTerm
  • Run program: shell> ./3.NonLeadEndTerm ${YEAR}
  • Processes:

    Step | Description | I/O | Notes - Examples
    Step 15: Get ruleType on multiwords from Lexicon
    • GetLetRtForTermsInLexicon.java
    • Assign invalid Lead-End-Unit ruleTypes on Lexicon multiwords:
      • RT_INV_LEAD_TERM
      • RT_INV_END_TERM
      • RT_INV_END_ABB
      • RT_INV_LEAD_END_TERM
    Inputs:
    • ./outData/3.InvalidLeadEndTerm/lexMultiwords.data

    Outputs:

    • ./outData/3.InvalidLeadEndTerm/lexMultiwords.data.ruleType
    • ./outData/3.InvalidLeadEndTerm/lexMultiwords.data.ruleType.ilet (10)
    • 1 min.
    • Only 10 exceptions, all of which are RT_INV_END_ABB
      => The algorithm for end words with an abbreviation pattern can be improved
    Step 16: Analyze ruleType on multiwords from Lexicon
    • AnalyzeLetRtForTermsInLexicon.java
    • Analyze results from above step (10)
    • Get the precision of the exclusive filter on the Lexicon
    Inputs:
    • ./3.InvalidLeadEndTerm/lexMultiwords.data.ruleType
    • ./3.InvalidLeadEndTerm/lexMultiwords.data.ruleType.exceptions

    Outputs:

    • ./3.InvalidLeadEndTerm/lexMultiwords.data.ruleType.rpt
    • 5 sec.
    • precision: 99.9981%
    • 1 invalid ruleType: RT_INV_END_ABB
    Step 17: Get multiwords in Lexicon by lead/end units
    • GetLexiconMultiwordsByLeadEndTerm.java
    • Find all multiwords in Lexicon by specifying lead/end word
    Inputs:
    • ./outData/3.InvalidLeadEndTerm/lexMultiwords.data.ruleType
    Outputs:
    • ./outData/3.InvalidLeadEndTerm/LexiconMw/lexMultiwords.data.ruleType.${LEAD_END_WORD}
    • 5 sec.
    • Used for case study
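
The analysis and lookup steps above can be illustrated with a minimal Java sketch. It is not the project's actual code, which is still listed as TBD. It rests on several assumptions: each record in the ruleType file is taken to be `multiword|ruleType` (the real file layout is not documented here), every invalid rule type is taken to start with `RT_INV_` as the four listed names do, and precision is taken to mean the share of Lexicon multiwords that the exclusive filter does not reject. The class name, the method names, the `RT_VALID` label, and the sample records are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of the analysis and lookup steps; not the project's code. */
final class LetReportSketch {
    // Assumption: records are pipe-delimited as "multiword|ruleType".
    private static final String SEP = "\\|";
    // Assumption: invalid Lead-End-Unit rule types share this prefix (e.g. RT_INV_END_ABB).
    private static final String INVALID_PREFIX = "RT_INV_";

    /** Counts records whose ruleType marks an invalid lead/end unit. */
    static long countInvalid(List<String> records) {
        return records.stream()
                .map(r -> r.split(SEP, -1))
                .filter(f -> f.length >= 2 && f[1].startsWith(INVALID_PREFIX))
                .count();
    }

    /** Precision in percent: multiwords kept by the filter / all multiwords. */
    static double precisionPercent(List<String> records) {
        if (records.isEmpty()) {
            throw new IllegalArgumentException("no records");
        }
        long kept = records.size() - countInvalid(records);
        return 100.0 * kept / records.size();
    }

    /** Returns records whose multiword starts or ends with the given word (case-insensitive). */
    static List<String> byLeadOrEndWord(List<String> records, String word) {
        String target = word.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String r : records) {
            String term = r.split(SEP, -1)[0].trim().toLowerCase(Locale.ROOT);
            String[] tokens = term.split("\\s+");
            if (tokens[0].equals(target) || tokens[tokens.length - 1].equals(target)) {
                hits.add(r);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample records for illustration only.
        List<String> records = List.of(
                "heart attack|RT_VALID",
                "in vitro|RT_VALID",
                "vitamin a|RT_INV_END_ABB",
                "attack rate|RT_VALID");
        System.out.printf(Locale.ROOT, "precision: %.4f%%%n", precisionPercent(records));
        System.out.println(byLeadOrEndWord(records, "attack"));
    }
}
```

Under these assumptions, the reported precision would correspond to the small number of exceptions divided into the total number of Lexicon multiwords, and the lookup mirrors the case-study use of listing all multiwords for a given ${LEAD_END_WORD}.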