Lexicon Words Stats
I. Introduction
This page describes the programs used to compute statistics for Lexicon words, using MEDLINE as the source of frequency data (word count, WC, and document count, DC).
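As a rough illustration of the kind of computation involved, the sketch below groups raw unigrams by a normalized key and derives a WC frequency spectrum (how many distinct terms occur with each word count). It is not the actual program used here. The `word|WC|DC` input layout, the function names, and the reading of core-term.lc as "surrounding punctuation stripped, then lowercased" are all assumptions made for the example.

```python
from collections import Counter, defaultdict
import string


def core_term_lc(token: str) -> str:
    """Hypothetical normalization: strip surrounding punctuation and
    whitespace, then lowercase. The real core-term.lc rule may differ."""
    return token.strip(string.punctuation + string.whitespace).lower()


def group_by_core_term(lines):
    """Sum WC and DC over raw unigrams that share a normalized key.

    Each line is assumed to look like 'word|WC|DC'. Summed DC is only an
    upper bound, because one document may contain several variants.
    """
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        # rsplit keeps any '|' that is part of the word itself.
        word, wc, dc = line.rstrip("\n").rsplit("|", 2)
        key = core_term_lc(word)
        if not key:
            continue  # token was punctuation only
        totals[key][0] += int(wc)
        totals[key][1] += int(dc)
    return {key: tuple(counts) for key, counts in totals.items()}


def wc_frequency_spectrum(grouped):
    """Map each WC value to the number of distinct terms having that WC."""
    return Counter(wc for wc, _dc in grouped.values())


# Made-up sample records, for illustration only.
sample = ["Cell|5|3", "cell|7|4", "cells,|2|2", "(protein)|2|1"]
grouped = group_by_core_term(sample)
spectrum = wc_frequency_spectrum(grouped)
```

Here `Cell` and `cell` collapse into one entry, while `cells` and `protein` each keep their own, so the spectrum records two terms at one WC value and one term at another.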
II. Detailed Process
| Step | Description | Inputs | Outputs | Notes |
|---|---|---|---|---|
| **MEDLINE Unigram Spectrum Analysis** | | | | |
| 1 | Group raw unigrams by core-term.lc | | | |
| 2 | Get MEDLINE unigram WC frequency spectrum | | | |
| **Lexicon Word Spectrum Analysis** | | | | |
| 10 | Get Lexicon single-word frequency spectrum | | | |
| 11 | Group distilled n-gram set by core-term.lc | | | |
| 12 | Get all-words frequency spectrum | | | |
| 13 | Get multiword frequency spectrum | | | |
| **Lexicon Word Histogram Analysis (used in AMIA paper)** | | | | |
| 20 | Get normTerm.lc from inflVars | | | |
| 21 | Split single words and multiwords from Lexicon inflVars | | | |
| 22 | Add WC to Lexicon single words | | | |
23 | Add WC to Lexicon multiword
|