NL2KR

[Figure: NL2KR system architecture (NL2KR-L and NL2KR-T)]

 

Shown above is the architecture of the NL2KR system. It has two interdependent sub-parts: (1) NL2KR-L for learning and (2) NL2KR-T for translating.

 

NL2KR-L takes as input an initial lexicon, consisting of some words and their meanings expressed as λ-calculus expressions, together with a set of training sentences and their target formal representations. It first uses a Combinatory Categorial Grammar (CCG) parser to construct parse trees for the training sentences. Next, the learning sub-part uses the Inverse-λ and Generalization algorithms to learn the meanings of newly encountered words, that is, words not present in the initial lexicon, and adds them to the lexicon. A parameter learning method then estimates a weight for each lexicon entry (a word, its syntactic category, and its meaning) such that the joint probability of the training sentences being translated to their given formal representations is maximized. The output of NL2KR-L is the final lexicon, which contains all the words together with their meanings and weights.
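To make the lexicon and the role of Inverse-λ more concrete, here is a minimal Python sketch. It is not the NL2KR implementation: the `LexEntry` class, the `inverse_constant` helper, and the example words are all hypothetical, Python closures stand in for λ-expressions, and the inverse step covers only the simplest case, where the known meaning is a constant occurring in the target.

```python
import re
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class LexEntry:
    """One lexicon entry: word, CCG category, meaning, and learned weight."""
    word: str
    category: str
    meaning: Any          # a constant (str) or a callable standing in for a λ-term
    weight: float = 0.0


# Hypothetical initial lexicon; Python closures stand in for λ-expressions.
LEXICON = [
    LexEntry("John", "NP", "john"),
    LexEntry("Mary", "NP", "mary"),
    # λy.λx.loves(x,y): takes the object first, then the subject.
    LexEntry("loves", r"(S\NP)/NP", lambda y: lambda x: f"loves({x},{y})"),
]


def meaning_of(word: str) -> Any:
    """Look up the first meaning stored for a word."""
    return next(e.meaning for e in LEXICON if e.word == word)


def inverse_constant(target: str, known: str, var: str = "x") -> str:
    """Toy inverse: find F such that F applied to `known` yields `target`.

    Assumption: `known` is a constant that occurs in `target`, so F is
    obtained by abstracting that constant into a bound variable.
    """
    body = re.sub(rf"\b{re.escape(known)}\b", var, target)
    return f"λ{var}.{body}"


# Forward direction: compose known meanings along the parse tree.
verb_phrase = meaning_of("loves")(meaning_of("Mary"))   # λx.loves(x,mary)
sentence = verb_phrase(meaning_of("John"))
print(sentence)  # loves(john,mary)

# Inverse direction: if the meaning of "loves Mary" were unknown, it could be
# recovered from the sentence meaning and the known meaning of "John".
print(inverse_constant("loves(john,mary)", "john"))  # λx.loves(x,mary)
```

The forward direction shows how meanings combine by function application as the parse tree is traversed; the inverse direction illustrates why a target representation plus partial lexical knowledge can be enough to infer a missing meaning.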

 

Once the learning component (NL2KR-L) finishes, the translation component (NL2KR-T) uses this updated lexicon to translate sentences with the CCG parser. Because a word can have multiple meanings, each with its own λ-calculus expression, the weights assigned to the lexical entries help decide which meaning of the word is more likely in the context of a sentence.
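The effect of the weights can be pictured with a small sketch. It assumes a log-linear model, in which a reading is scored by the sum of the weights of the lexical entries it uses and scores are normalized with a softmax; the text above does not specify the exact probability model, so this form is an assumption, and the words, meanings, and weights below are hypothetical.

```python
import math
from itertools import product

# Hypothetical weighted lexicon: word -> list of (meaning, weight).
WEIGHTED_LEXICON = {
    "bank": [("financial_institution", 1.2), ("river_side", 0.3)],
    "deposit": [("put_money", 0.9), ("sediment_layer", 0.4)],
}


def candidate_readings(words):
    """Enumerate every combination of meanings, one per word, with its score.

    A real system would only score combinations that yield a valid CCG parse;
    this sketch skips parsing and scores all combinations.
    """
    options = [WEIGHTED_LEXICON[w] for w in words]
    for combo in product(*options):
        meanings = tuple(m for m, _ in combo)
        score = sum(wt for _, wt in combo)
        yield meanings, score


def rank_readings(words):
    """Return readings sorted by softmax probability, most likely first."""
    readings = list(candidate_readings(words))
    z = sum(math.exp(s) for _, s in readings)
    ranked = [(m, math.exp(s) / z) for m, s in readings]
    return sorted(ranked, key=lambda r: r[1], reverse=True)


best_meanings, best_prob = rank_readings(["bank", "deposit"])[0]
print(best_meanings)  # ('financial_institution', 'put_money')
```

Under this assumption, raising the weight of one entry increases the probability of every reading that uses it, which is how learned weights steer the choice among competing meanings.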