LEX Source Model - Lexicon Records with Negations
I. Introduction
Negation plays a role in expressing opposite meaning in antonyms, according to the boundedness hypothesis [2006 Paradis]. Words that have negative tags in the Lexicon are used as roots to retrieve aPair candidates, such as “unlike”, “isn’t”, “neither” and “couldn’t” in the aPairs [like|unlike|prep], [is|isn’t|aux], [either|neither|det] and [could|couldn’t|modal], respectively.
II. Design
Lexical records with negative or broad_negative tags are used to generate antonym candidates. APair candidates are retrieved from the 7 POSs [adv|pron|aux|modal|prep|det|conj] that have negative tags, as listed below. The POSs [noun|adj|verb|compl] do not have negative tags in the Lexicon.
Please see design documents for more details.
- Adverbs (adv):
  - true negative/strict negation (negative): never, no, not, nowise
  - broadly negative (broad_negative): hardly, seldom, rarely, even, either, little, scarcely, slightly, barely, seldomly
- Pronoun (pron):
  - type=indef(neg): none, nobody, nothing, noone, neither, naught
- Auxiliary (aux) - negative:
  - variant=isn't;pres(thr_sing):negative
  - variant=aren't;pres(fst_plur,second,thr_plur):negative
  - variant=don't;pres(fst_sing,fst_plur,second,thr_plur):negative
  - variant=haven't;pres(fst_sing,fst_plur,second,thr_plur):negative
  - …
- Modal (modal) - negative:
  - variant=mayn’t;pres:negative
  - variant=mightn’t;past:negative
  - variant=mustn’t;pres:negative
  - variant=couldn’t;past:negative
  - variant=cannot;pres:negative
  - variant=can’t;pres:negative
- Preposition (prep):
  - true negative/strict negation (negative): without
  - broadly negative (broad_negative): unlikely (not used as negation cue word)
- Determiner (det):
  - true negative/strict negation (negative): no, neither, nary a, nary an
- Conjunction (conj):
  - true negative/strict negation (negative): neither, nor
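The auxiliary and modal entries above share a common shape: an inflected form, a tense with optional agreement in parentheses, and a trailing `:negative` marker. As a rough illustration of how such an entry could be checked for negation, here is a minimal sketch. The class `VariantEntry` and its `parse` method are hypothetical names invented for this example and are not part of the Lexicon source code; the pattern assumes entries look exactly like the ones listed above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the Lexicon source code.
final class VariantEntry {
    // Assumed shape: variant=<form>;<tense>[(<agreement>)][:negative]
    private static final Pattern VARIANT = Pattern.compile(
            "variant=([^;]+);([a-z_]+)(?:\\(([^)]*)\\))?(:negative)?");

    final String form;       // inflected form, e.g. isn't
    final String tense;      // e.g. pres, past
    final String agreement;  // e.g. thr_sing; empty when absent
    final boolean negative;  // true when the entry ends with :negative

    private VariantEntry(String form, String tense, String agreement, boolean negative) {
        this.form = form;
        this.tense = tense;
        this.agreement = agreement;
        this.negative = negative;
    }

    // Returns an empty Optional when the line does not match the assumed shape.
    static Optional<VariantEntry> parse(String line) {
        Matcher m = VARIANT.matcher(line.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        String agreement = m.group(3) == null ? "" : m.group(3);
        return Optional.of(
                new VariantEntry(m.group(1), m.group(2), agreement, m.group(4) != null));
    }
}
```

With this sketch, parsing `variant=isn't;pres(thr_sing):negative` yields the form `isn't` with the negative flag set, while an entry without the `:negative` suffix parses with the flag cleared.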
III. Implementation
The Java source code is in the Lexicon directory:
- GenAntCandFromLexicon.java
Algorithm:
- Go through the lexRecords and convert them into lexRecord objects.
- Check whether the POS has negation.
- Put the base form of the lexRecord in Ant-2 (because of the negation).
- Use N2 or BN2 for the negation field.
- Put ANT_TBD, EUI_TBD, CANON_TBD, TYPE_TBD, and DOMAIN_TBD in the Ant-1, EUI-1, Canon, Type, and Domain fields.
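The steps above can be sketched roughly as follows. This is not the code in GenAntCandFromLexicon.java; the class `LexCandidateSketch`, the simplified `Rec` stand-in for a lexRecord, and the EUI values used with it are assumptions made for illustration. The sketch also assumes that the negative tag maps to N2 and the broad_negative tag maps to BN2, following the values shown in the output format, and it flattens POS-specific markings such as type=indef(neg) into a single tag string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; class, field, and method names are illustrative only.
final class LexCandidateSketch {
    // The 7 POSs that carry negation tags according to the design.
    static final Set<String> NEGATION_POS = new HashSet<>(
            Arrays.asList("adv", "pron", "aux", "modal", "prep", "det", "conj"));

    // Simplified stand-in for a lexRecord object.
    static final class Rec {
        final String base;    // base form of the record
        final String eui;     // record identifier
        final String pos;     // part of speech
        final String negTag;  // "negative", "broad_negative", or null

        Rec(String base, String eui, String pos, String negTag) {
            this.base = base;
            this.eui = eui;
            this.pos = pos;
            this.negTag = negTag;
        }
    }

    // Maps a negation tag to the negation field value, or null if untagged.
    static String negationCode(String negTag) {
        if ("negative".equals(negTag)) {
            return "N2";
        }
        if ("broad_negative".equals(negTag)) {
            return "BN2";
        }
        return null;
    }

    // Returns one 10-field candidate per record that has a negation tag.
    static List<String[]> generate(List<Rec> records) {
        List<String[]> candidates = new ArrayList<>();
        for (Rec r : records) {
            String code = NEGATION_POS.contains(r.pos) ? negationCode(r.negTag) : null;
            if (code == null) {
                continue; // no negation tag: not an aPair candidate from this source
            }
            candidates.add(new String[] {
                "ANT_TBD", "EUI_TBD", r.base, r.eui, r.pos,
                "CANON_TBD", "TYPE_TBD", code, "DOMAIN_TBD", "LEX"});
        }
        return candidates;
    }
}
```

Records without a negation tag, or with a POS outside the seven listed, are skipped, so the output only grows when tagged records are added.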
Output:
These candidates are output in the standard 10-field format and sent to linguists for tagging and further processing.
Ant-1 | EUI-1 | Ant-2 | EUI-2 | POS | Canon | Type | Negation | Domain | Source
---|---|---|---|---|---|---|---|---|---
ANT_TBD | EUI_TBD | ant-2 | EUI-2 | pos | CANON_TBD | TYPE_TBD | N2 or BN2 | DOMAIN_TBD | LEX
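To make the 10-field layout concrete, the sketch below builds one candidate line and reports which fields still hold a TBD placeholder. It is only an illustration: the class `APairLine`, its methods, the bare `|` delimiter without surrounding spaces, and the EUI value in the usage note are assumptions, not the actual file format or code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the Lexicon source code.
final class APairLine {
    static final String[] FIELDS = {
        "Ant-1", "EUI-1", "Ant-2", "EUI-2", "POS",
        "Canon", "Type", "Negation", "Domain", "Source"};

    // Builds a LEX candidate line; fields unknown at this stage get TBD placeholders.
    static String format(String ant2, String eui2, String pos, String negation) {
        return String.join("|",
                "ANT_TBD", "EUI_TBD", ant2, eui2, pos,
                "CANON_TBD", "TYPE_TBD", negation, "DOMAIN_TBD", "LEX");
    }

    // Lists the names of the fields whose value is still a *_TBD placeholder.
    static List<String> unfilledFields(String line) {
        String[] values = line.split("\\|", -1);
        if (values.length != FIELDS.length) {
            throw new IllegalArgumentException(
                    "expected " + FIELDS.length + " fields, got " + values.length);
        }
        List<String> unfilled = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            if (values[i].endsWith("_TBD")) {
                unfilled.add(FIELDS[i]);
            }
        }
        return unfilled;
    }
}
```

For example, `APairLine.format("isn't", "E0000000", "aux", "N2")` (with a made-up EUI) produces a line whose unfilled fields are Ant-1, EUI-1, Canon, Type, and Domain, which are the fields a linguist would still need to complete.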
Notes:
- Linguists should fill in all XXX_TBD fields.
- The set of aPair candidates from the LEX source is rather static. It only grows when new lexRecords with negation tags are added.