Factor Analysis Results
I. Error Types
Correction Type | Details |
---|---|
PreCorrection | |
Dictionary-based Correction | |
Combination | TBD |

Correction Type | Details |
---|---|
Not in checkDic, Not Correct | |
II. Analysis Results
The results on the baseline data are shown below:
Results | Jazzy | Baseline | Medline | Lexicon | Lexicon.E* | Combo1** | Combo2*** |
---|---|---|---|---|---|---|---|
Performance (by Baseline program) | |||||||
TP / Ret. / Rel.; Precision, Recall, F1 | | | | | | | |
Tagged terms (833), should be corrected | |||||||
B2.1. DicCorr (T) | 227 (48.5043%) | 232 (49.5726%) | 205 (43.8034%) | 234 (50.0000%) | 235 (50.2137%) | 226 (48.2906%) | 210 (44.8718%) |
B2.2. DicCorr (F) | 241 (51.4957%) | 236 (50.4274%) | 263 (56.1966%) | 234 (50.0000%) | 233 (49.7863%) | 242 (51.7094%) | 258 (55.1282%) |
Tag issue: re-check the annotation | |||||||
B2.2.1. Not detect, real-word (error tag) | 36 (7.6923%) | 49 (10.4701%) | 43 (9.1880%) | 50 (10.6838%) | 50 (10.6838%) | 50 (10.6838%) | 50 (10.6838%) |
Detection issue: Check dictionary + exception algorithm | |||||||
B2.2.2. Not detect, spelling error (non-word) | 20 (4.2735%) | 54 (11.5385%) | 76 (16.2393%) | 57 (12.1795%) | 57 (12.1795%) | 57 (12.1795%) | 85 (18.1624%) |
Candidate issue: edit distance + phonetic + Suggesting dictionary | |||||||
B2.2.3. Detect, not candidates by edit-distance | 37 (7.9060%) | 34 (7.2650%) | 29 (6.1966%) | 32 (6.8376%) | 32 (6.8376%) | 32 (6.8376%) | 28 (5.9829%) |
B2.2.4. Detect, not candidates by suggestion Dic | 79 (16.8803%) | 11 (2.3504%) | 19 (4.0598%) | 17 (3.6325%) | 20 (4.2735%) | 15 (3.2051%) | 15 (3.2051%) |
B2.2.5. Detect, not candidates by multi-corrections | 2 (0.4274%) | 6 (1.2821%) | 13 (2.7778%) | 5 (1.0684%) | 5 (1.0684%) | 6 (1.2821%) | 6 (1.2821%) |
Ranking issue: in candidate list | |||||||
B2.2.6. Detect, Candidates, wrong (not top) rank | 62 (13.2479%) | 75 (16.0256%) | 77 (16.4530%) | 65 (13.8889%) | 57 (12.1795%) | 75 (16.0256%) | 69 (14.7436%) |
B2.2.7. Detect, Candidates, wrong top rank | 5 (1.0684%) | 7 (1.4957%) | 6 (1.2821%) | 8 (1.7094%) | 12 (2.5641%) | 7 (1.4957%) | 5 (1.0684%) |
Valid word (not-tagged), but not in checkDic, corrected wrong | |||||||
A2.2.1. Not in checkDic, corrected wrong, by Dic | 1912 (7.8287%) | 139 (0.5691%) | 121 (0.4954%) | 143 (0.5855%) | 137 (0.5609%) | 70 (0.2866%) | 51 (0.2088%) |
A2.2.2. Not in checkDic, corrected wrong, by Pre | 41 (0.1679%) | 33 (0.1351%) | 27 (0.1106%) | 31 (0.1269%) | 31 (0.1269%) | 31 (0.1269%) | 26 (0.1065%) |
Summary | |||||||
Check Dic B2.2.2+A2.2.1+A2.2.2 | 1973 | 226 | 224 | 231 | 225 | 158 | 162 |
Sugg Dic B2.2.3+B2.2.4+B2.2.5 | 118 | 51 | 61 | 54 | 57 | 53 | 49 |
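
The two Summary rows are column-wise sums of error categories listed above. The sketch below recomputes them from the table's counts as a consistency check. The Check Dic totals match B2.2.2 + A2.2.1 + A2.2.2, and the Sugg Dic totals match B2.2.3 + B2.2.4 + B2.2.5. The names `SYSTEMS`, `COUNTS`, and `sum_rows` are illustrative and are not part of the original tooling.

```python
# Counts copied from the results table; column order follows the header row.
# All identifiers here are hypothetical helpers for illustration only.
SYSTEMS = ["Jazzy", "Baseline", "Medline", "Lexicon", "Lexicon.E", "Combo1", "Combo2"]

COUNTS = {
    "B2.2.2": [20, 54, 76, 57, 57, 57, 85],          # not detected, non-word
    "B2.2.3": [37, 34, 29, 32, 32, 32, 28],          # no candidate by edit distance
    "B2.2.4": [79, 11, 19, 17, 20, 15, 15],          # no candidate by suggestion Dic
    "B2.2.5": [2, 6, 13, 5, 5, 6, 6],                # no candidate by multi-corrections
    "A2.2.1": [1912, 139, 121, 143, 137, 70, 51],    # corrected wrong, by Dic
    "A2.2.2": [41, 33, 27, 31, 31, 31, 26],          # corrected wrong, by Pre
}


def sum_rows(row_ids):
    """Return the column-wise sum of the named rows."""
    return [sum(column) for column in zip(*(COUNTS[r] for r in row_ids))]


check_dic = sum_rows(["B2.2.2", "A2.2.1", "A2.2.2"])
sugg_dic = sum_rows(["B2.2.3", "B2.2.4", "B2.2.5"])

for name, c, s in zip(SYSTEMS, check_dic, sugg_dic):
    print(f"{name:10s} check_dic={c:5d} sugg_dic={s:4d}")
```

Read this way, the Check Dic row groups the issues tied to the checking dictionary (missed non-words plus wrongly corrected valid words), while the Sugg Dic row groups the cases where no correct candidate was generated.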

Edit distance | Instances | Percentage | Accumulated percentage |
---|---|---|---|
1 | 317 | 67.74% | 67.74% |
2 | 110 | 23.50% | 91.24% |
3 | 24 | 5.13% | 96.37% |
4 | 8 | 1.71% | 98.08% |
5 | 6 | 1.28% | 99.36% |
6 | 2 | 0.43% | 99.79% |
7 | 1 | 0.21% | 100.00% |
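
The percentage columns in the edit-distance table can be derived from the instance counts alone. The sketch below is a minimal illustration, assuming each percentage is the count divided by the total of all listed instances (468). The names `INSTANCES` and `cumulative_coverage` are hypothetical.

```python
from itertools import accumulate

# Instance counts per edit distance, copied from the table above.
INSTANCES = {1: 317, 2: 110, 3: 24, 4: 8, 5: 6, 6: 2, 7: 1}


def cumulative_coverage(instances):
    """Return (distance, count, percentage, accumulated percentage) rows."""
    total = sum(instances.values())
    distances = sorted(instances)
    running = accumulate(instances[d] for d in distances)
    return [
        (d, instances[d], 100 * instances[d] / total, 100 * cum / total)
        for d, cum in zip(distances, running)
    ]


rows = cumulative_coverage(INSTANCES)
for distance, count, pct, cum_pct in rows:
    print(f"{distance} | {count} | {pct:.2f}% | {cum_pct:.2f}%")
```

The accumulated column shows how much a given maximum edit distance would cover: in this data, a limit of 2 accounts for 91.24% of the instances, and a limit of 3 for 96.37%.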