Text Categorization

StJds (StJd table)


    A table (file) stores the JD scores for each ST, derived from the testing data set (4,000 MEDLINE journals, 1.3M records). The scores are based on:

    • word frequency count
    • document count for word
    This file is pre-generated. At run time it is read in, loaded into RAM, and used to perform real-time ST indexing on text. Currently, the table is generated from Susanne's data. New programs are to be developed to generate the data from scratch (new test set).

  • Description:

    This Java class reads the St-Jd table from a file and loads it into a Java object. The object provides basic methods to set and get the JD scores for a specified ST.

  • Inputs: Each record of the file, stJdTable.txt, has the following fields, in order:
    ST, JD index, Word scores, Document scores, JD Id, JD Full Name

  • Java Files:
    • StJd.java
    • StJds.java

  • Algorithm:
    • Read in the file and save the St-Jd scores into a Java object, StJds.
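
The load-then-lookup flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual StJd.java or StJds.java: the class name StJdTableSketch, its methods, the numeric types, and the '|' field delimiter are all assumptions made for the example, since the source does not specify them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory St-Jd table; a sketch, not the project's StJds class. */
class StJdTableSketch {

    /** One record: the scores of a single JD for a single ST. */
    static final class Entry {
        final String st;
        final int jdIndex;
        double wordScore;      // score based on word frequency count
        double documentScore;  // score based on document count for word
        final String jdId;
        final String jdName;

        Entry(String st, int jdIndex, double wordScore, double documentScore,
              String jdId, String jdName) {
            this.st = st;
            this.jdIndex = jdIndex;
            this.wordScore = wordScore;
            this.documentScore = documentScore;
            this.jdId = jdId;
            this.jdName = jdName;
        }
    }

    // ST -> (JD index -> entry); LinkedHashMap preserves the file order.
    private final Map<String, Map<Integer, Entry>> table = new LinkedHashMap<>();

    /** Reads six-field records, one per line. The '|' delimiter is an assumption. */
    static StJdTableSketch load(Reader in) throws IOException {
        StJdTableSketch result = new StJdTableSketch();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] f = line.split("\\|", -1);
            if (f.length != 6) {
                throw new IOException("Expected 6 fields but found " + f.length + ": " + line);
            }
            result.put(new Entry(f[0].trim(), Integer.parseInt(f[1].trim()),
                    Double.parseDouble(f[2].trim()), Double.parseDouble(f[3].trim()),
                    f[4].trim(), f[5].trim()));
        }
        return result;
    }

    /** Convenience wrapper for in-memory text; rethrows I/O errors unchecked. */
    static StJdTableSketch parse(String text) {
        try {
            return load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Adds an entry, replacing any existing one for the same ST and JD index. */
    void put(Entry e) {
        table.computeIfAbsent(e.st, k -> new LinkedHashMap<>()).put(e.jdIndex, e);
    }

    /** Returns the entry for the given ST and JD index, or null if absent. */
    Entry get(String st, int jdIndex) {
        Map<Integer, Entry> row = table.get(st);
        return row == null ? null : row.get(jdIndex);
    }

    /** Updates both scores of an existing entry; returns false if it is unknown. */
    boolean setScores(String st, int jdIndex, double wordScore, double documentScore) {
        Entry e = get(st, jdIndex);
        if (e == null) {
            return false;
        }
        e.wordScore = wordScore;
        e.documentScore = documentScore;
        return true;
    }

    /** Number of distinct STs currently loaded. */
    int stCount() {
        return table.size();
    }
}
```

To load the real file, a caller would pass a file reader to load(), for example java.nio.file.Files.newBufferedReader on the path to stJdTable.txt. Keying the map by ST first keeps the per-ST lookup used during real-time indexing to a single hash access; if the real table uses a different delimiter or score type, only load() and the Entry fields need to change.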