Improving Neural Machine Translation of Indigenous Languages with Multilingual Transfer Learning

Wei-Rui Chen, Muhammad Abdul-Mageed


Abstract
Machine translation (MT) involving Indigenous languages, including endangered ones, is challenging primarily due to the lack of sufficient parallel data. We describe an approach that exploits bilingual and multilingual pretrained MT models in a transfer learning setting to translate from Spanish into ten South American Indigenous languages. Our models set a new SOTA on five of the ten language pairs we consider, even doubling performance on one of these five pairs. Unlike previous SOTA models, which perform data augmentation to enlarge the training sets, we retain the low-resource setting to test the effectiveness of our models under such a constraint. Despite the scarcity of linguistic information available about these Indigenous languages, we offer a number of quantitative and qualitative analyses (e.g., of morphology, tokenization, and orthography) to contextualize our results.
Anthology ID:
2023.loresmt-1.6
Volume:
Proceedings of the Sixth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2023)
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Atul Kr. Ojha, Chao-hong Liu, Ekaterina Vylomova, Flammie Pirinen, Jade Abbott, Jonathan Washington, Nathaniel Oco, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
Venue:
LoResMT
Publisher:
Association for Computational Linguistics
Pages:
73–85
URL:
https://aclanthology.org/2023.loresmt-1.6
DOI:
10.18653/v1/2023.loresmt-1.6
Cite (ACL):
Wei-Rui Chen and Muhammad Abdul-Mageed. 2023. Improving Neural Machine Translation of Indigenous Languages with Multilingual Transfer Learning. In Proceedings of the Sixth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2023), pages 73–85, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Improving Neural Machine Translation of Indigenous Languages with Multilingual Transfer Learning (Chen & Abdul-Mageed, LoResMT 2023)
PDF:
https://aclanthology.org/2023.loresmt-1.6.pdf
Video:
https://aclanthology.org/2023.loresmt-1.6.mp4