Expanding wordnets to new languages with multilingual sense disambiguation

    Research output: Chapter in Book or Conference Publication/Proceeding, Conference Publication, peer-review

    10 Citations (Scopus)

    Abstract

    Princeton WordNet is one of the most important resources for natural language processing, but it is only available for English. While it has been translated into many other languages using the expand approach, this is an expensive manual process. It would therefore be beneficial to have a high-quality automatic translation approach to support NLP techniques that rely on WordNet in new languages. Translating wordnets is fundamentally complex because all senses of a word must be translated, including low-frequency senses, which is very challenging for current machine translation approaches. For this reason, we leverage existing translations of WordNet in other languages to identify contextual information for wordnet senses from a large set of generic parallel corpora. We evaluate our approach using 10 translated wordnets for European languages. Our experiment shows a significant improvement over translation without any contextual information. Furthermore, we evaluate how the choice of pivot languages affects the performance of multilingual word sense disambiguation.
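
The idea of using existing wordnet translations as context can be pictured with a small sketch. This is only an illustration under stated assumptions, not the method from the paper: the toy lexicons, the synset ids, and the names `WORDNETS`, `candidate_synsets`, and `disambiguate` are all hypothetical, and sentence alignment is assumed to have been done already. For an ambiguous English word, each pivot-language sentence aligned to the English sentence "votes" for any candidate synset whose known pivot-language lemma appears in it.

```python
from collections import Counter

# Hypothetical toy wordnets: language -> synset id -> set of lemmas.
# Real wordnets are far larger; these entries exist only for illustration.
WORDNETS = {
    "en": {"bank.n.01": {"bank"}, "bank.n.02": {"bank"}},
    "de": {"bank.n.01": {"ufer"}, "bank.n.02": {"bank"}},
    "es": {"bank.n.01": {"orilla", "ribera"}, "bank.n.02": {"banco"}},
}


def candidate_synsets(word, lang="en"):
    """Return all synsets of `word` in `lang`, in sorted order."""
    return sorted(s for s, lemmas in WORDNETS[lang].items() if word in lemmas)


def disambiguate(word, pivot_sentences):
    """Pick the synset of an English `word` best supported by pivot languages.

    `pivot_sentences` maps a language code to the lowercased tokens of the
    sentence aligned with the English sentence (alignment is assumed given).
    Returns None when no pivot language supports any candidate sense.
    """
    votes = Counter()
    for synset in candidate_synsets(word):
        for lang, tokens in pivot_sentences.items():
            lemmas = WORDNETS.get(lang, {}).get(synset, set())
            # One vote per pivot language whose lemma for this synset occurs.
            if lemmas & set(tokens):
                votes[synset] += 1
    if not votes:
        return None
    # Ties go to the candidate that comes first in sorted order.
    return votes.most_common(1)[0][0]


river = {
    "de": ["er", "saß", "am", "ufer", "des", "flusses"],
    "es": ["se", "sentó", "en", "la", "orilla", "del", "río"],
}
money = {
    "de": ["sie", "ging", "zur", "bank"],
    "es": ["fue", "al", "banco"],
}
print(disambiguate("bank", river))
print(disambiguate("bank", money))
```

Once a sense has been selected in this way, the sentence can serve as sense-specific context when translating into a language that has no wordnet yet. A real system would also need lemmatization and word alignment, both of which are left out here.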

    Original language: English
    Title of host publication: COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016
    Subtitle of host publication: Technical Papers
    Publisher: Association for Computational Linguistics, ACL Anthology
    Pages: 97-108
    Number of pages: 12
    ISBN (Print): 9784879747020
    Publication status: Published - 2016
    Event: 26th International Conference on Computational Linguistics, COLING 2016 - Osaka, Japan
    Duration: 11 Dec 2016 to 16 Dec 2016

    Publication series

    Name: COLING 2016 - 26th International Conference on Computational Linguistics, Proceedings of COLING 2016: Technical Papers

    Conference

    Conference: 26th International Conference on Computational Linguistics, COLING 2016
    Country/Territory: Japan
    City: Osaka
    Period: 11/12/16 to 16/12/16
