Missed opportunities in translation memory matching

    Research output: Chapter in Book or Conference Publication/Proceeding > Conference Publication > peer-review

    Abstract

    A translation memory system stores a data set of source-target translation pairs. Given a query in the source language, it attempts to return a useful target text from the data set to assist a human translator. Such systems estimate the usefulness of a suggested target text from the similarity between its associated source text and the query. This study analyses two data sets, each in two language pairs, to find highly similar target texts, which would be useful mutual suggestions. We further investigate which of these useful suggestions cannot be selected through source text similarity, and we analyse these cases thoroughly to categorise and quantify them. This analysis provides insight into areas where the recall of translation memory systems can be improved. Specifically, source texts with an omission and semantically very similar source texts are among the more frequent cases in which a useful target text suggestion is not selected by the baseline approach of simple edit distance between the source texts.
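    The baseline named in the abstract, selecting a suggestion by edit distance between source texts, can be illustrated with a minimal sketch. It is not the authors' implementation: the word-level tokenisation, the normalisation by the longer segment, the 0.7 threshold, and the names `edit_distance`, `similarity`, and `best_match` are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(query: str, source: str) -> float:
    """Similarity in [0, 1]: 1 minus the distance normalised by the longer segment."""
    q, s = query.lower().split(), source.lower().split()
    longest = max(len(q), len(s))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(q, s) / longest


def best_match(query: str,
               memory: List[Tuple[str, str]],
               threshold: float = 0.7) -> Optional[Tuple[float, str, str]]:
    """Return (score, source, target) for the most similar stored source text,
    or None when no stored source reaches the (assumed) threshold."""
    scored = [(similarity(query, src), src, tgt) for src, tgt in memory]
    candidates = [item for item in scored if item[0] >= threshold]
    return max(candidates, key=lambda item: item[0], default=None)


# Hypothetical memory entry, for illustration only.
memory = [("press the green button", "appuyez sur le bouton vert")]
match = best_match("press the red button", memory)
```

    Under this kind of scoring, a stored pair whose source text differs from the query by an omission or by a paraphrase can fall below the threshold even when its target text would still be a useful suggestion, which is the kind of case the study sets out to quantify.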

    Original language: English
    Title of host publication: Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014
    Editors: Nicoletta Calzolari, Khalid Choukri, Sara Goggi, Thierry Declerck, Joseph Mariani, Bente Maegaard, Asuncion Moreno, Jan Odijk, Helene Mazo, Stelios Piperidis, Hrafn Loftsson
    Publisher: European Language Resources Association (ELRA)
    Pages: 4401-4406
    Number of pages: 6
    ISBN (Electronic): 9782951740884
    Publication status: Published - 2014
    Event: 9th International Conference on Language Resources and Evaluation, LREC 2014 - Reykjavik, Iceland
    Duration: 26 May 2014 to 31 May 2014

    Keywords

    • Edit distance
    • Text similarity
    • Translation memory
