Because of a lapse in government funding, the information on this website may not be up to date, transactions submitted via the website may not be processed, and the agency may not be able to respond to inquiries until appropriations are enacted. The NIH Clinical Center (the research hospital of NIH) is open. For more details about its operating status, please visit cc.nih.gov. Updates regarding government operating status and resumption of normal operations can be found at OPM.gov.

Lexical Tools

SPT - Terms Trie Tree Design

I. Introduction
The next two sections are used to find all multi-words/terms in a given term match terms from a list of terms. A retrieve algorithm (Trie) is applied for this purpose.

II. Trie Algorithm

  • Trie Node:
    • Trie nodes are used to compose the Trie tree.
    • Each node has a key, which is a (single) word
    • Each node has a level, the top (root) level is 0
    • Each node has a child, which is a Vector<TrieNode>

    • Use "$_ROOT" for the word of root node
    • Use "$_END" for the word of end node

  • Trie Tree:
    • Each branch (path) starts from root node to the end node is a term
    • All terms are stored in the branch of the Trie tree

    • Tokenize input term into words
    • Go through each word:
      • curNode starts from the top (root) node
      • Init curWordNode by curWord
      • Check if Childs contains curWordNode
        • if yes, update curNode to the matched node in Childs
        • If no, add curWordNode to Childs and update curNode to curWordNode

III. Java Classes

  • TrieNode.java: a Java object for TrieNode
  • TrieTree.java: a Java class for TrieTree

IV. Examples

  • Synonym Rules:

    wordsynonym
    dogcanine
    catfeline
    canineK9
    K9bull dog
    Dog and catpets
    puppy and kittypets

  • Synonym Terms:

    Terms
    dog
    canine
    cat
    feline
    k9
    bull dog
    dog and cat
    pets
    puppy and kitty

  • Trie Tree (from Synonym Terms):