Self-selection bias of similarity metrics in translation memory evaluation

  • Friedel Wolff
  • Laurette Pretorius
  • Loïc Dugast
  • Paul Buitelaar

Research output: Contribution to a Journal (Peer & Non Peer) / Article / peer-review

1 Citation (Scopus)

Abstract

A translation memory system attempts to retrieve useful suggestions from previous translations to assist a translator in a new translation task. For a specific segment, a similarity metric is usually employed to select the best matches from previously translated segments to present to the translator. Automated methods for evaluating a translation memory system usually use reference translations and a similarity metric. Such evaluation methods might be expected to assist in choosing between competing systems. No single evaluation method has gained widespread use, and the similarity metric used in each of these methods is not standardised either. This paper investigates the consequences of substituting the similarity metric in such an evaluation method, and finds that the similarity metrics exhibit a strong bias in favour of the system using the same metric for retrieval. Consequently, the choice of similarity metric in the evaluation of translation memory systems should be carefully reconsidered.
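The setup described in the abstract can be illustrated with a minimal sketch. It is not the paper's experimental procedure: the two metrics (`char_similarity`, built on Python's `difflib`, and `word_jaccard`), the helper names, and the toy English-French segments are all assumptions chosen for illustration. Each "system" retrieves the best match with one metric, and each evaluation scores the suggested translation against a reference with one metric, giving a retrieval-by-evaluation table in which any preference of an evaluation metric for its "own" system could be inspected.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (difflib ratio; a stand-in for a fuzzy-match score)."""
    return SequenceMatcher(None, a, b).ratio()


def word_jaccard(a: str, b: str) -> float:
    """Word-set overlap in [0, 1] (Jaccard index on lowercased tokens)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


# Hypothetical metric registry; the names are illustrative only.
METRICS = {"char": char_similarity, "jaccard": word_jaccard}


def retrieve(source, memory, metric):
    """Return the stored translation whose source segment is most similar to `source`."""
    _, best_target = max(memory, key=lambda pair: metric(source, pair[0]))
    return best_target


def evaluate(test_set, memory, retrieval_metric, evaluation_metric):
    """Mean similarity between each retrieved suggestion and its reference translation."""
    scores = [
        evaluation_metric(retrieve(src, memory, retrieval_metric), ref)
        for src, ref in test_set
    ]
    return sum(scores) / len(scores)


def cross_table(test_set, memory):
    """Score every (retrieval metric, evaluation metric) combination."""
    return {
        (r, e): evaluate(test_set, memory, METRICS[r], METRICS[e])
        for r in METRICS
        for e in METRICS
    }


# Invented toy data: (source, target) pairs in the memory, (source, reference) pairs to test.
memory = [
    ("Open the file", "Ouvrir le fichier"),
    ("Close the file", "Fermer le fichier"),
    ("Open the folder", "Ouvrir le dossier"),
    ("Save the file as", "Enregistrer le fichier sous"),
]
test_set = [
    ("Open the files", "Ouvrir les fichiers"),
    ("Close the folder", "Fermer le dossier"),
]

if __name__ == "__main__":
    for (r, e), score in sorted(cross_table(test_set, memory).items()):
        print(f"retrieval={r:8s} evaluation={e:8s} mean score={score:.3f}")
```

With a realistic memory and test set, comparing the columns of this table would show whether the ranking of the two systems changes when the evaluation metric is substituted, which is the effect the abstract describes.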

Original language: English
Pages (from-to): 129-144
Number of pages: 16
Journal: Machine Translation
Volume: 30
Issue number: 3-4
Publication status: Published - 1 Dec 2016
Externally published: Yes

Keywords

  • Bias
  • Evaluation
  • Text similarity
  • Translation memory
