The SPECIALIST Lexicon

Multiwords from Verb Complements Report

This task retrieves multiwords from verb complements in the Lexicon. Lexicon.2024 was used for development. Lexicon.2025 is the first public release. The growth report is shown below.

  • Light Verb Constructions
    • Performance on Candidates:
      Year  VC (Candidates)  LMWs (TP)  Not LMWs (FP)  Precision  Recall  F1
      2024  53               22         31             41.51      100.00  58.67
      2025  56               24         32             42.86      100.00  60.00
    • Performance of WordNet on candidates:
      Year  VC (Candidates)  Relevant      Irrelevant    Precision  Recall  F1
                             TP     FN     FP     TN
      2024  53               12     10     0      31     100.00     54.55   70.59
      2025  56               14     10     0      32     100.00     58.33   73.68
    • Summary:
      • The precision for multiwords from LVCs is low (42.86).
      • Relevant: 14 of the 24 multiword terms are multiwords in WordNet (TP); the other 10 are not in WordNet (FN).
      • Irrelevant: All 32 non-multiword candidates in the Lexicon are also not multiwords in WordNet (TN); none (0) of them are in WordNet (FP).
      • WordNet multiword tagging has a precision of 100.
  • Verb-Particle Constructions
    • Performance on Candidates:
      Year  VC (Candidates)  LMWs (TP)  Not LMWs (FP)  Precision  Recall  F1
      2024  2516             1682       834            66.85      100.00  80.13
      2025  2518             1684       834            66.88      100.00  80.15
    • Performance of WordNet on candidates:
      Year  VC (Candidates)  Relevant      Irrelevant    Precision  Recall  F1
                             TP     FN     FP     TN
      2024  2516             1227   455    118    716    91.23      72.95   81.07
      2025  2518             1229   455    118    716    91.24      72.98   81.09
    • Summary:
      • The precision for multiwords from VPCs is decent (66.88).
      • Relevant: 1229 of the 1684 multiword terms are in WordNet (TP); the other 455 are not in WordNet (FN).
      • Irrelevant: 716 of the 834 non-multiword candidates in the Lexicon are also not multiwords in WordNet (TN); the other 118 are in WordNet (FP).
      • WordNet multiword tagging has a high precision of 91.24 with 72.98 recall.
  • Verb Complements (LVCs and VPCs)
    • Performance on Candidates:
      Year  VC (Candidates)  LMWs (TP)  Not LMWs (FP)  Precision  Recall  F1
      2024  2569             1704       865            66.33      100.00  79.76
      2025  2574             1708       866            66.36      100.00  79.78
    • Performance of WordNet on candidates:
      Year  VC (Candidates)  Relevant      Irrelevant    Precision  Recall  F1
                             TP     FN     FP     TN
      2024  2569             1239   465    118    747    91.30      72.71   80.95
      2025  2574             1243   465    118    748    91.33      72.78   81.01
    • Summary (development based on Lexicon.2024; figures below are for Lexicon.2025):
      • The precision for multiword generation from VCs is decent (66.36).
      • 98.59% (= 1,684/1,708) of the multiwords are generated from the VPC model.
      • WordNet multiword tagging has a high precision of 91.33, recall of 72.78, and F1 of 81.01.

      • Relevant: 1243 of the 1708 multiword terms are in WordNet (TP); the other 465 are not in WordNet (FN).
      • Irrelevant: 748 of the 866 non-multiword candidates from the Lexicon are also not multiwords in WordNet (TN); the other 118 are in WordNet (FP).
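
The precision, recall, and F1 figures reported above can be rechecked from the raw counts. The Python sketch below is illustrative only and is not part of the Lexicon tooling. The function name prf is hypothetical, and the sketch assumes that F1 is derived from precision and recall after they have been rounded to two decimals; computing F1 directly from the counts may differ in the last digit.

```python
def prf(tp, fp, fn):
    """Return (precision, recall, F1) as percentages with two decimals.

    Assumption: F1 is derived from the already-rounded precision and recall.
    """
    precision = round(100 * tp / (tp + fp), 2)
    recall = round(100 * tp / (tp + fn), 2)
    f1 = round(2 * precision * recall / (precision + recall), 2)
    return precision, recall, f1


# 2025 counts taken from the tables in this report.
# Candidates vs. the Lexicon: recall is reported as 100.00, i.e. FN = 0.
print(prf(tp=24, fp=32, fn=0))       # LVC candidates
print(prf(tp=1684, fp=834, fn=0))    # VPC candidates

# WordNet tagging of all verb-complement candidates (LVCs and VPCs).
print(prf(tp=1243, fp=118, fn=465))

# Share of Lexicon multiwords that come from the VPC model, in percent.
print(round(100 * 1684 / 1708, 2))
```

With these inputs the sketch is expected to reproduce the 2025 rows, for example (91.33, 72.78, 81.01) for WordNet on all verb-complement candidates.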