PreProcess: STs (Semantic Types)
- Description:
A set of 135 Semantic Types (STs) from the Semantic Network in NLM's Unified Medical Language System (UMLS) is used for STI. Each concept in the UMLS Metathesaurus is assigned one or more STs, which characterize that concept semantically. For example, the concept Aspirin is assigned the STs [Pharmacologic Substance] and [Organic Chemical]. Each Semantic Type has an ID and an abbreviation, called StId and StAbbr, respectively. This information can be derived from the latest MRSTY (IDs and names) and SRDEF (abbreviations).
- Input:
- MRSTY (Semantic Type list)
- SRDEF (Semantic Type abbreviations)
- Java File & Algorithm:
- GenerateStFromMrSty.java
- Read CUI, TUI (Id), and name from MRSTY
- Update Semantic Type list
- Read TUI and ST abbreviation from SRDEF
- Map TUI to ST abbreviations
- Print out each Semantic Type with "|" as the field separator
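
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual GenerateStFromMrSty.java: the class name StSketch and its method names are hypothetical, the MRSTY layout is assumed to be CUI|TUI|name (as implied by the flds 2,3 command in the Notes), the SRDEF abbreviation is assumed to be the 9th field of STY rows, and the output column order StId|StAbbr|name is an assumption, since the layout of sts.txt is not specified here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch only; not the actual GenerateStFromMrSty.java.
class StSketch {
    // Collect the distinct Semantic Types (TUI -> name) from MRSTY lines.
    // Assumed layout: CUI|TUI|name|... ; nameCol is the 0-based name column.
    static Map<String, String> readTypes(List<String> mrstyLines, int nameCol) {
        Map<String, String> types = new TreeMap<>(); // keeps TUIs sorted
        for (String line : mrstyLines) {
            String[] f = line.split("\\|", -1);
            if (f.length > nameCol) {
                types.putIfAbsent(f[1], f[nameCol]);
            }
        }
        return types;
    }

    // Map TUI -> abbreviation using SRDEF rows whose record type is STY.
    // Assumed layout: RT|UI|name|...|ABR|... with ABR as the 9th field.
    static Map<String, String> readAbbreviations(List<String> srdefLines) {
        Map<String, String> abbrs = new HashMap<>();
        for (String line : srdefLines) {
            String[] f = line.split("\\|", -1);
            if (f.length > 8 && f[0].equals("STY")) {
                abbrs.put(f[1], f[8]);
            }
        }
        return abbrs;
    }

    // Join each Semantic Type with its abbreviation, using "|" as separator.
    // Assumed output columns: StId|StAbbr|name (empty StAbbr if not found).
    static List<String> format(Map<String, String> types, Map<String, String> abbrs) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : types.entrySet()) {
            String abbr = abbrs.getOrDefault(e.getKey(), "");
            out.add(e.getKey() + "|" + abbr + "|" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: StSketch <MRSTY file> <SRDEF file>");
            return;
        }
        List<String> mrsty = Files.readAllLines(Paths.get(args[0]));
        List<String> srdef = Files.readAllLines(Paths.get(args[1]));
        for (String line : format(readTypes(mrsty, 2), readAbbreviations(srdef))) {
            System.out.println(line);
        }
    }
}
```

If the MRSTY file at hand uses a layout where the name is not the third field, only the nameCol argument passed to readTypes needs to change.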
- Output File:
- sts.txt
- Notes:
Semantic Types can be generated from MRSTY with the following script:
- shell> flds 2,3 MRSTY | sort -u > SemanticType
- Manually add the index/abbreviation for each ST (usually at the end of the file)
or
- Get SRDEF.txt
- shell> fgrep "STY|T" SRDEF.txt > SRDEF.STY.txt
- shell> flds 1,2,3,9 SRDEF.STY.txt > STY.txt
- Manually add the index for each ST