Abstract
Translated terminology for severely under-resourced languages is a vital tool for aid workers working in
humanitarian crises. However, there are generally no lexical resources that can be used for this purpose. Translators
without Borders (TWB) is a non-profit whose goal is to help people get vital information, which includes developing lexical
resources for aid workers. To support this resource construction, TWB has worked with the ADAPT
Centre to develop tools for building their crisis-response resources. In particular, we have
enriched these resources by linking them with open lexical resources such as WordNet and Wikidata, as well as by
deriving a novel extended corpus. This work has focused on the development of resources for
languages useful for aid workers working with Rohingya refugees, namely Rohingya, Chittagonian, Bengali and
Burmese. These languages are all under-resourced, and for Rohingya and Chittagonian there are only very limited
lexical resources available. For these languages, we have constructed some of the first corpus resources that
will allow automatic construction of lexical resources. We have also used the Naisc tool for monolingual dictionary
linking to connect the existing English parts of the lexical resources with information from WordNet
and Wikidata. This has provided a wealth of extra information, including images, alternative definitions,
translations (in Bengali, Burmese and other languages) as well as many related terms that may guide TWB
linguists and terminologists in the process of extending their resources. We have presented these results in an
interface allowing the lexicographers to browse through the results extracted from the external resources and
select those that they wish to include in their resource. We present results on the quality of the linking inferred by the Naisc system, as well as a qualitative analysis of the effectiveness of the tool in the development of the TWB glossaries.
| Original language | English (Ireland) |
|---|---|
| Title of host publication | Electronic lexicography in the 21st century (eLex 2021): Post-editing lexicography Proceedings of the eLex 2021 conference |
| Place of Publication | Online |
| Publication status | Published - 1 Jul 2021 |
Authors (Note for portal: view the doc link for the full list of authors)
- McCrae, John P.; Ojha, Atul Kr.; Chakravarthi, Bharathi Raja; Kelly, Ian; Buffini, Patricia; Tang, Grace; Paquin, Eric and Locria, Manuel