Publication

Leveraging rule-based machine translation knowledge for under-resourced neural machine translation models

Torregrosa, Daniel
Pasricha, Nivranshu
Chakravarthi, Bharathi Raja
Masoud, Maraim
Alonso, Juan
Casas, Noe
Arcan, Mihael
Citation
Torregrosa, Daniel, Pasricha, Nivranshu, Chakravarthi, Bharathi Raja, Masoud, Maraim, Alonso, Juan, Casas, Noe, & Arcan, Mihael (2019). Leveraging rule-based machine translation knowledge for under-resourced neural machine translation models. Paper presented at the Machine Translation Summit, Dublin, Ireland, 19-23 August, doi:10.13025/prgj-bk28
Abstract
Rule-based machine translation is a machine translation paradigm where linguistic knowledge is encoded by an expert in the form of rules that translate from the source to the target language. While this approach grants total control over the output of the system, the cost of formalising the needed linguistic knowledge is much higher than that of training a corpus-based system, where a machine learning approach is used to automatically learn to translate from examples. In this paper, we describe different approaches to leveraging the information contained in rule-based machine translation systems to improve a corpus-based one, namely a neural machine translation model, with a focus on a low-resource scenario. Our results suggest that adding morphological information to the source language is as effective as using subword units in this particular setting.
Publisher
NUI Galway
Rights
Attribution-NonCommercial-NoDerivs 3.0 Ireland