The SPECIALIST Lexicon

Precision, Recall, and F1 Analysis for LMW Candidates from SpVar Model - Paper on SpVar

I. Introduction

In the previous study (AMIA paper), multiple MES and ES models were used to retrieve SpVars from the distilled MEDLINE n-gram set, and the results were then used as a filter to retrieve LMWs. This model is workable but has some issues:

  • Performance and Frequency:
    Due to the complexity of the algorithm, the program runs very slowly. Thus, we had to reduce the size of the MEDLINE n-gram set by applying a high frequency threshold (WC = 150). Even with this reduction, it took weeks for the program to complete the process.
  • The precision on the benchmark test on Lexicon.2015 is only 52.65% (even though the recall reaches 99.72%). This is acceptable as an additional filter to the high-precision (ACR) matcher. However, it cannot be used as a stand-alone matcher to retrieve high-precision LMW candidates.
  • LMWs with WC less than 150 are missed.
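
The effect of the frequency threshold described above can be illustrated with a minimal sketch. It assumes WC is a per-n-gram count and that n-grams at or above the threshold are kept; the function name `filter_by_wc` and the toy counts are hypothetical and are not taken from the MEDLINE n-gram set.

```python
def filter_by_wc(ngram_counts, min_wc):
    """Keep only n-grams whose count (WC) is at least min_wc."""
    return {ngram: wc for ngram, wc in ngram_counts.items() if wc >= min_wc}


# Hypothetical toy counts, for illustration only.
toy_counts = {
    "term a": 4200,
    "term b": 150,
    "term c": 149,
    "term d": 31,
}

kept = filter_by_wc(toy_counts, min_wc=150)
dropped = sorted(set(toy_counts) - set(kept))

print(sorted(kept))  # n-grams that remain candidates
print(dropped)       # n-grams below the threshold, never seen by the matcher
```

Any n-gram below the threshold is removed before matching, which is why a high threshold shrinks the input but also loses valid low-frequency LMWs.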

An improved model, M2CES, is developed to address these issues.

II. Development

  • Comparison with Table 3 in the final version of the 2016 (ACR) paper:
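    | Case | Description | TP | FP | T. Retrieved | T. Relevant | FN | TN | Precision | Recall | F1 | Accuracy |
    |------|-------------|----|----|--------------|-------------|----|----|-----------|--------|----|----------|
    | 31 | Parenthetic Acronym - Gold Standard | 14,805 | 1,870 | 16,675 | 14,805 | 0 | 0 | 0.8879 | 1.0000 | 0.9406 | 0.8879 |
    | | Filters or a single matcher | | | | | | | | | | |
    | 33 | SpVar Matcher - MES + ES, WC: 150 | 7,509 | 482 | 7,991 | 14,805 | 7,296 | 1,388 | 0.9397 | 0.5072 | 0.6588 | 0.5336 |
    | 33A | SpVar Matcher - M2CES, WC: 150 | 3,695 | 226 | 3,921 | 14,805 | 11,110 | 1,644 | 0.9424 | 0.2496 | 0.3946 | 0.3202 |
    | 33B | SpVar Matcher - M2CES, WC: 100 | 4,485 | 283 | 4,768 | 14,805 | 10,320 | 1,587 | 0.9406 | 0.3029 | 0.4583 | 0.3641 |
    | 33C | SpVar Matcher - M2CES, WC: 50 | 5,748 | 421 | 6,169 | 14,805 | 9,057 | 1,449 | 0.9318 | 0.3882 | 0.5481 | 0.4316 |
    | 33D | SpVar Matcher - M2CES, WC: 30 | 6,682 | 513 | 7,195 | 14,805 | 8,123 | 1,357 | 0.9287 | 0.4513 | 0.6075 | 0.4821 |
    | | Combination: SpVar + CUI + Distilled | | | | | | | | | | |
    | 36 | SpVar - MES + ES, WC: 150 | 5,510 | 206 | 5,716 | 14,805 | 9,295 | 1,664 | 0.9640 | 0.3722 | 0.5370 | 0.4302 |
    | 36A | SpVar - M2CES, WC: 150 | 2,793 | 106 | 2,899 | 14,805 | 12,012 | 1,764 | 0.9634 | 0.1887 | 0.3155 | 0.2733 |
    | 36B | SpVar - M2CES, WC: 100 | 3,319 | 118 | 3,437 | 14,805 | 11,486 | 1,752 | 0.9657 | 0.2242 | 0.3639 | 0.3041 |
    | 36C | SpVar - M2CES, WC: 50 | 4,102 | 162 | 4,264 | 14,805 | 10,703 | 1,708 | 0.9620 | 0.2771 | 0.4302 | 0.3484 |
    | 36D | SpVar - M2CES, WC: 30 | 4,699 | 189 | 4,888 | 14,805 | 10,106 | 1,681 | 0.9613 | 0.3174 | 0.4772 | 0.3826 |
    | | Combination: SpVar + CUI + Distilled | | | | | | | | | | |
    | 37 | SpVar - MES + ES, WC: 150 | 727 | 11 | 738 | 14,805 | 14,078 | 1,859 | 0.9851 | 0.0491 | 0.0935 | 0.1551 |
    | 37A | SpVar - M2CES, WC: 150 | 354 | 5 | 359 | 14,805 | 14,451 | 1,865 | 0.9861 | 0.0239 | 0.0467 | 0.1331 |
    | 37B | SpVar - M2CES, WC: 100 | 427 | 5 | 432 | 14,805 | 14,378 | 1,865 | 0.9884 | 0.0288 | 0.0560 | 0.1375 |
    | 37C | SpVar - M2CES, WC: 50 | 568 | 8 | 576 | 14,805 | 14,237 | 1,862 | 0.9861 | 0.0384 | 0.0739 | 0.1457 |
    | 37D | SpVar - M2CES, WC: 30 | 654 | 10 | 664 | 14,805 | 14,151 | 1,860 | 0.9849 | 0.0442 | 0.0846 | 0.1508 |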
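
The derived columns in the table follow from the four raw counts (TP, FP, FN, TN) using the standard definitions of precision, recall, F1, and accuracy. The sketch below is not the code used to produce the table; it is a minimal illustration, and the names `Counts` and `metrics` are assumptions introduced here. The sample counts are taken from Case 33 (MES + ES, WC: 150).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def metrics(c):
    """Compute precision, recall, F1, and accuracy from raw counts."""
    precision = _ratio(c.tp, c.tp + c.fp)  # TP / T. Retrieved
    recall = _ratio(c.tp, c.tp + c.fn)     # TP / T. Relevant
    f1 = _ratio(2 * precision * recall, precision + recall)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Counts from Case 33 in the table above.
case_33 = metrics(Counts(tp=7509, fp=482, fn=7296, tn=1388))
for name, value in case_33.items():
    print(f"{name}: {value:.4f}")
```

Rounded to four decimals, these values agree with the Case 33 row (0.9397, 0.5072, 0.6588, 0.5336). The same function applied to the other rows makes the trade-off visible: lowering the WC threshold raises recall while precision stays roughly constant.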