TCMeta: a multilingual dataset of COVID tweets for relation-level metaphor analysis
Brglez, Mojca ; Zayed, Omnia ; Buitelaar, Paul
Files
s10579-024-09725-z.pdf
Adobe PDF, 1.18 MB
Publication Date
2025-03-30
Type
journal article
Citation
Brglez, Mojca, Zayed, Omnia, & Buitelaar, Paul. (2025). TCMeta: a multilingual dataset of COVID tweets for relation-level metaphor analysis. Language Resources and Evaluation, 59(1), 437-475. https://doi.org/10.1007/s10579-024-09725-z
Abstract
The COVID pandemic spurred the use of various metaphors, some very common and universal, others depending on the language, country and culture. The use of metaphors by the general public, especially in languages other than English, has not yet been sufficiently investigated, one of the reasons being the lack of resources and automatic tools for metaphor analysis. To fill this gap, we introduce TCMeta, a dataset of tweets annotated for metaphors around COVID-19, in two languages from ten different countries. The dataset contains metaphoric phrases covering four source domains. Furthermore, we introduce a semi-automatic methodology to annotate more than 2000 tweets in English and Slovene. To the best of our knowledge, this is the first multilingual semi-automatically compiled dataset of user-generated texts aimed at investigating metaphorical language about the pandemic. It is also the first Slovene dataset of tweets annotated for metaphors.
Publisher
Springer
Publisher DOI
10.1007/s10579-024-09725-z
Rights
CC BY