
Language related issues for machine translation between closely related South Slavic languages

  • Maja Popovic
  • Mihael Arcan
  • Filip Klubicka

Research output: Chapter in Book or Conference Publication/Proceeding; Conference Publication; peer-reviewed

Abstract

Machine translation between closely related languages is less challenging and produces fewer translation errors than translation between distant languages, but there are still obstacles that should be addressed in order to improve such systems. This work explores the obstacles for machine translation systems between closely related South Slavic languages, namely Croatian, Serbian and Slovenian. Statistical systems for all language pairs and translation directions are trained on parallel texts from different domains, though mainly on spoken language, i.e. subtitles. For translation between Serbian and Croatian, a rule-based system is also explored. It is shown that for all language pairs and for both types of translation system, the main obstacles are differences in syntactic properties.
Original language: English (Ireland)
Title of host publication: Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial-3)
Publisher: The COLING 2016 Organizing Committee
Publication status: Published - 1 Jan 2016
