Precision, Recall, and F1 Analysis for LMW Candidates from (ACR) Model - Paper On (ACR) matcher
I. Introduction
To compute recall, all valid multiwords (LMWs) in the domain of interest must first be identified. In this analysis, we used the LMW candidates retrieved by the Parenthetic Acronym Pattern (ACR) matcher to calculate precision, recall, and F1 score. The example below is based on 2015 data.
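For reference, the metrics reported in the tables below are consistent with the standard definitions, written here in terms of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The page does not state the formulas itself, so this block is a summary of the usual conventions rather than a quotation from the papers:

```latex
\begin{aligned}
\text{Precision} &= \frac{TP}{TP + FP}, &
\text{Recall} &= \frac{TP}{TP + FN}, \\
F_1 &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, &
\text{Accuracy} &= \frac{TP + TN}{TP + FP + FN + TN}.
\end{aligned}
```

In Data-1, TP + FP corresponds to the total retrieved column and TP + FN to the total relevant column.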
II. Data-1 (Table-3 in 2016 AMIA paper initial version)
Case | Description | TP | FP | Total Retrieved | Total Relevant | Precision | Recall | F1 |
---|---|---|---|---|---|---|---|---|
1 | Parenthetic Acronym - Gold Standard | 13170 | 1230 | 14400 | 13170 | 0.9146 | 1.0000 | 0.9554 |
Filters or a single matcher | ||||||||
2 | Distilled MEDLINE N-gram Set (16 filters) | 13165 | 795 | 13960 | 13170 | 0.9431 | 0.9996 | 0.9705 |
3 | Spelling Variant Pattern Matcher | 6837 | 293 | 7130 | 13170 | 0.9589 | 0.5191 | 0.6736 |
4 | Metathesaurus CUI Pattern matcher | 8678 | 512 | 9190 | 13170 | 0.9443 | 0.6589 | 0.7762 |
5 | EndWord Pattern matcher | 1587 | 108 | 1695 | 13170 | 0.9363 | 0.1205 | 0.2135 |
Combination of filters and matchers | ||||||||
6 | SpVar + CUI + Distilled | 5108 | 129 | 5237 | 13170 | 0.9754 | 0.3879 | 0.5550 |
7 | SpVar + CUI + EndWord + Distilled | 703 | 5 | 708 | 13170 | 0.9929 | 0.0534 | 0.1013 |
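The precision, recall, and F1 columns of this table can be reproduced from the TP, retrieved, and relevant counts. The following is a minimal sketch; the function name `prf_from_counts` is hypothetical and not part of any tool described on this page:

```python
def prf_from_counts(tp, total_retrieved, total_relevant):
    """Return (precision, recall, f1) from Data-1 style counts.

    Hypothetical helper for illustration only.
    """
    precision = tp / total_retrieved if total_retrieved else 0.0
    recall = tp / total_relevant if total_relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Case 3 (Spelling Variant Pattern Matcher):
# TP = 6837, retrieved = 7130, relevant = 13170
p, r, f1 = prf_from_counts(6837, 7130, 13170)
print(f"{p:.4f} {r:.4f} {f1:.4f}")  # prints 0.9589 0.5191 0.6736
```

The guards against a zero denominator are a design choice for robustness; no row in the table actually triggers them.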
III. Data-2 (Table-3 in 2016 AMIA paper final)
Case | Description | TP | FP | FN | TN | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|---|---|
1 | Parenthetic Acronym - Gold Standard | 14805 | 1870 | 0 | 0 | 0.8879 | 1.0000 | 0.9406 | 0.8879 |
Filters or a single matcher | |||||||||
2 | Distilled MEDLINE N-gram Set (16 filters) | 14796 | 1305 | 9 | 565 | 0.9189 | 0.9994 | 0.9575 | 0.9212 |
3 | Spelling Variant Pattern Matcher | 7509 | 482 | 7296 | 1388 | 0.9397 | 0.5072 | 0.6588 | 0.5336 |
4 | Metathesaurus CUI Pattern matcher | 9488 | 752 | 5317 | 1118 | 0.9266 | 0.6409 | 0.7577 | 0.6360 |
5 | EndWord Pattern matcher (top 20) | 1710 | 180 | 13095 | 1690 | 0.9048 | 0.1155 | 0.2049 | 0.2039 |
Combination of filters and matchers | |||||||||
6 | SpVar + CUI + Distilled | 5510 | 206 | 9295 | 1664 | 0.9640 | 0.3722 | 0.5370 | 0.4302 |
7 | SpVar + CUI + EndWord (20) + Distilled | 727 | 11 | 14078 | 1859 | 0.9851 | 0.0491 | 0.0935 | 0.1551 |
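With the full confusion matrix available, accuracy can be computed alongside the other three metrics. This is a sketch under the standard definitions; the `Confusion` type and `metrics` function are hypothetical names introduced here for illustration:

```python
from typing import NamedTuple


class Confusion(NamedTuple):
    """Confusion-matrix counts for one table row (hypothetical type)."""
    tp: int
    fp: int
    fn: int
    tn: int


def metrics(c):
    """Return (precision, recall, f1, accuracy) for one confusion matrix."""
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    # F1 written directly in counts; equivalent to the harmonic mean of P and R.
    f1_denom = 2 * c.tp + c.fp + c.fn
    f1 = 2 * c.tp / f1_denom if f1_denom else 0.0
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    return precision, recall, f1, accuracy


# Case 2 (Distilled MEDLINE N-gram Set): TP=14796, FP=1305, FN=9, TN=565
m = metrics(Confusion(tp=14796, fp=1305, fn=9, tn=565))
print(" ".join(f"{v:.4f}" for v in m))  # prints 0.9189 0.9994 0.9575 0.9212
```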
IV. Data-3 (Table-2 in 2017 HealthInf paper final)
Case | Description | TP | FP | FN | TN | Precision | Recall | F1 | Accuracy |
---|---|---|---|---|---|---|---|---|---|
1 | Parenthetic Acronym - Gold Standard | 15850 | 1857 | 0 | 0 | 0.8951 | 1.0000 | 0.9447 | 0.8951 |
Filters or a single matcher | |||||||||
2 | Distilled MEDLINE N-gram Set (16 filters) | 15840 | 1299 | 10 | 558 | 0.9242 | 0.9994 | 0.9603 | 0.9261 |
3 | Spelling Variant Pattern Matcher | 8094 | 499 | 7756 | 1358 | 0.9419 | 0.5107 | 0.6623 | 0.5338 |
4 | Metathesaurus CUI Pattern matcher | 10056 | 755 | 5794 | 1102 | 0.9302 | 0.6344 | 0.7544 | 0.6301 |
5 | EndWord Pattern matcher (top 20) | 1804 | 178 | 14046 | 1679 | 0.9102 | 0.1138 | 0.2023 | 0.1967 |
5A | EndWord Pattern matcher (top 33) | 2346 | 251 | 13504 | 1606 | 0.9034 | 0.1480 | 0.2544 | 0.2232 |
Combination of filters and matchers | |||||||||
6 | SpVar + CUI + Distilled | 5892 | 212 | 9958 | 1645 | 0.9653 | 0.3717 | 0.5368 | 0.4257 |
7 | SpVar + CUI + EndWord (20) + Distilled | 777 | 11 | 15073 | 1846 | 0.9860 | 0.0490 | 0.0934 | 0.1481 |
7A | SpVar + CUI + EndWord (33) + Distilled | 992 | 15 | 14858 | 1842 | 0.9851 | 0.0626 | 0.1177 | 0.1600 |
8 | CUI + EndWord (33) + Distilled | 1766 | 113 | 14084 | 1744 | 0.9399 | 0.1114 | 0.1992 | 0.1982 |
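The counts in this table can be sanity-checked against the gold-standard row. The sketch below assumes (this is an inference from the numbers, not something the page states) that every filter or matcher is applied to the same ACR candidate set, so each gold-standard TP ends up as either TP or FN and each gold-standard FP ends up as either FP or TN. The names `DATA3` and `inconsistent_cases` are hypothetical:

```python
# (TP, FP, FN, TN) per case, copied from the Data-3 table above.
DATA3 = {
    "1": (15850, 1857, 0, 0),
    "2": (15840, 1299, 10, 558),
    "3": (8094, 499, 7756, 1358),
    "4": (10056, 755, 5794, 1102),
    "5": (1804, 178, 14046, 1679),
    "5A": (2346, 251, 13504, 1606),
    "6": (5892, 212, 9958, 1645),
    "7": (777, 11, 15073, 1846),
    "7A": (992, 15, 14858, 1842),
    "8": (1766, 113, 14084, 1744),
}


def inconsistent_cases(rows, gold_key="1"):
    """Return the cases whose counts do not add up to the gold-standard row.

    Assumption: TP + FN must equal the gold TP count, and FP + TN must
    equal the gold FP count, for every row.
    """
    gold_tp, gold_fp, _, _ = rows[gold_key]
    bad = []
    for case, (tp, fp, fn, tn) in rows.items():
        if tp + fn != gold_tp or fp + tn != gold_fp:
            bad.append(case)
    return bad


print(inconsistent_cases(DATA3))  # prints []
```

An empty list means the four count columns are internally consistent under that assumption; it does not by itself validate the derived ratio columns.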
V. Automatic Tagging Model
Tag | Notes |
---|---|
valid | |
invalid | |
tbd | |