Train-O-Matic
Large-Scale Supervised Word Sense Disambiguation in Multiple Languages without Manual Training Data

About

We present Train-O-Matic, a language-independent method for generating millions of sense-annotated training instances for virtually all meanings of words in a language’s vocabulary.

Abstract

Annotating large numbers of sentences with senses is the heaviest requirement of current Word Sense Disambiguation. We present Train-O-Matic, a language-independent method for generating millions of sense-annotated training instances for virtually all meanings of words in a language’s vocabulary. The approach is fully automatic: no human intervention is required and the only type of human knowledge used is a WordNet-like resource. Train-O-Matic achieves consistently state-of-the-art performance across gold standard datasets and languages, while at the same time removing the burden of manual annotation.

Reference

Tommaso Pasini and Roberto Navigli
Train-O-Matic: Large-Scale Supervised Word Sense Disambiguation in Multiple Languages without Manual Training Data
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP), Copenhagen, Denmark, 7-11 September 2017.

Authors

Tommaso Pasini

PhD Student @ Sapienza

pasini [at] di.uniroma1.it

Roberto Navigli

Full Professor @ Sapienza

navigli [at] di.uniroma1.it