The Cardamom Workbench for Historical and Under-Resourced Languages

Research output: Chapter in Book or Conference Publication/Proceeding > Conference Publication > peer-review

Abstract

This paper describes the creation of a workbench tool designed to make technologies developed throughout the lifespan of the Cardamom project easily accessible to researchers who could most benefit from them, but who may not have the technical expertise to apply bleeding-edge technologies to their own datasets. The workbench provides an intuitive graphical user interface (GUI) and workflow which abstract users away from underlying technical tasks, while providing them with a suite of powerful NLP tools developed by the Cardamom team. These include tokenisers, POS-taggers, various annotation tools, and ML models. The performance of workbench tools can be improved as text and annotations are added by users. It is envisioned that this workbench will provide a simple route to digital publication for academics in the humanities, or more specifically, for linguists working with under-resourced or historical languages, who have collected text data but are unable to make it available online as a result of financial or technical restraints. This has the added benefit of increasing the availability of high-quality annotated text data to NLP researchers, thereby providing value to both communities of researchers.
Original language: English (Ireland)
Title of host publication: Proceedings of the 4th Conference on Language, Data and Knowledge
Editors: Sara Carvalho, Anas Fahad Khan, Ana Ostroški Anić, Blerina Spahiu, Jorge Gracia, John P. McCrae, Dagmar Gromann, Barbara Heinisch, Ana Salgado
Publisher: NOVA CLUNL, Portugal
Pages: 109-120
Number of pages: 12
Publication status: Published - Sep 2023
Event: The 4th Conference on Language, Data and Knowledge - Vienna, Austria
Duration: 12 Sep 2023 to 15 Sep 2023
Conference number: 4
https://2023.ldk-conf.org/

Conference

Conference: The 4th Conference on Language, Data and Knowledge
Abbreviated title: LDK
Country/Territory: Austria
City: Vienna
Period: 12/09/23 to 15/09/23

Authors (Note for portal: view the doc link for the full list of authors)

  • Adrian Doyle, Theodorus Fransen, Bernardo Stearns, John Philip McCrae, Oksana Dereza, Priya Rani
