TCMeta: a multilingual dataset of COVID tweets for relation-level metaphor analysis

Research output: Contribution to a Journal (Peer & Non Peer), Article, peer-review

Abstract

The COVID pandemic spurred the use of various metaphors, some very common and universal, others depending on the language, country and culture. The use of metaphors by the general public, especially in languages other than English, has not yet been sufficiently investigated, one of the reasons being the lack of resources and automatic tools for metaphor analysis. To fill this gap, we introduce TCMeta, a dataset of tweets annotated for metaphors around COVID-19, in two languages from ten different countries. The dataset contains metaphoric phrases covering four source domains. Furthermore, we introduce a semi-automatic methodology to annotate more than 2000 tweets in English and Slovene. To the best of our knowledge, this is the first multilingual semi-automatically compiled dataset of user-generated texts aimed at investigating metaphorical language about the pandemic. It is also the first Slovene dataset of tweets annotated for metaphors.

Original language: English
Journal: Language Resources and Evaluation
Publication status: Accepted/In press - 2024

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • COVID-19
  • English
  • Relation-level metaphors
  • Slovene
  • Twitter
