N-gram Utilities
I. Introduction
Several utility programs have been developed for processing n-grams. They are used in many processes and are summarized on this page.
II. Detailed Process
Step | Description | Inputs | Outputs | Notes |
---|---|---|---|---|
1 | Grep terms (nGrams) then sort | | | |
2 | Filter pipe (\|) from nGrams | | | |
3 | Group nGrams by core-term | | | |
4 | Group nGrams by norm-term | | | |
Convert from WC\|core-term back to DC\|WC\|TERM | | | | |
5 | Sort nGrams by DC\|WC\|Term | | | |
6 | Convert (ungroup) core-term to nGrams | | | |
Convert from WC\|core-term.lc back to WC\|core-term | | | | |
7 | Group nGrams by core-term.lc | | | |
8 | core-term to corm-term nGrams | | | |
Group n-gram set by core-term.lc | | | | |
10 | Group nGram set by core-term.lc | | | |
11 | Group distilled nGram set by core-term.lc | | | |
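
The filter, sort, and group steps listed above can be illustrated with a minimal Python sketch. It is not the actual utility code, and it rests on several assumptions: each record is a `DC|WC|Term` line whose first two fields are integer counts; "filter pipe" means dropping n-grams whose term itself contains `|`; the sort order is descending by DC and WC, then alphabetical by term; and `.lc` means lowercase. The `core_term` helper is a hypothetical stand-in that only strips leading and trailing punctuation, so the real core-term rules may differ.

```python
import string
from collections import defaultdict


def parse_record(line):
    """Split an assumed 'DC|WC|Term' line into (dc, wc, term).

    Only the first two pipes are treated as separators, so any further
    pipes stay inside the term and can be detected afterwards.
    """
    dc, wc, term = line.rstrip("\n").split("|", 2)
    return int(dc), int(wc), term


def core_term(term):
    """Hypothetical stand-in: strip leading/trailing punctuation and spaces."""
    return term.strip(string.punctuation + string.whitespace)


def filter_pipe(records):
    """Drop records whose term contains the field separator (assumed meaning)."""
    return [r for r in records if "|" not in r[2]]


def sort_records(records):
    """Sort by DC and WC descending, then by term (assumed order)."""
    return sorted(records, key=lambda r: (-r[0], -r[1], r[2]))


def group_by_core_term_lc(records):
    """Group records under the lowercased core-term of each n-gram."""
    groups = defaultdict(list)
    for dc, wc, term in records:
        groups[core_term(term).lower()].append((dc, wc, term))
    return dict(groups)


# Made-up sample lines, for illustration only.
lines = [
    "12|15|Heart attack",
    "30|42|heart attack",
    "5|5|(heart attack)",
    "7|9|a|b",
    "30|50|blood pressure",
]

records = sort_records(filter_pipe(parse_record(line) for line in lines))
groups = group_by_core_term_lc(records)

for key, members in groups.items():
    print(key, members)
```

Because grouping happens after sorting, the members of each group keep the sorted order, which makes it straightforward to ungroup them back into individual `DC|WC|Term` records later.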