Publication

Findings of the SIGTYP 2024 shared task on word embedding evaluation for ancient and historical languages

Identifiers
https://aclanthology.org/2024.sigtyp-1.19/
https://hdl.handle.net/10379/18666
https://doi.org/10.13025/29460
Repository DOI
Publication Date
2024-03
Type
conference paper
Citation
Oksana Dereza, Adrian Doyle, Priya Rani, Atul Kr. Ojha, Pádraic Moran, and John McCrae. 2024. Findings of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages. In Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, pages 160–172, St. Julian's, Malta. Association for Computational Linguistics.
Abstract
This paper discusses the organisation and findings of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages. The shared task was split into the constrained and unconstrained tracks and involved solving either 3 or 5 problems for either 13 or 16 ancient and historical languages belonging to 4 language families, and making use of 6 different scripts. There were 14 registrations in total, of which 3 teams submitted to each track. Out of these 6 submissions, 2 systems were successful in the constrained setting and another 2 in the unconstrained setting, and 4 system description papers were submitted by different teams. The best average result for morphological feature prediction was about 96%, while the best average results for POS-tagging and lemmatisation were 96% and 94% respectively. At the word level, the winning team could not achieve a higher average accuracy across all 16 languages than 5.95%, which demonstrates the difficulty of this problem. At the character level, the best average result over 16 languages was 55.62%.
Publisher
Association for Computational Linguistics
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International