Sense Distributions

Word Sense Distributions Learning

We present EnDi and DaD, two knowledge-based methods that learn the distribution of word senses from an input corpus of sentences in potentially any language.

Abstract

Knowing the correct distribution of senses within a corpus can potentially boost the performance of Word Sense Disambiguation (WSD) systems by many points. We present two fully automatic and language-independent methods for computing the distribution of senses given a raw corpus of sentences. Intrinsic and extrinsic evaluations show that our methods outperform the current state of the art in sense distribution learning and the strongest baselines for the most frequent sense in multiple languages and on domain-specific test sets.