Distilling Estonian Text Domains for Production-Oriented Machine Translation

Elizaveta Korotkova, Mark Fishel


Abstract
This paper explores knowledge distillation for multi-domain neural machine translation (NMT). We focus on the Estonian-English translation direction and experiment with distilling the knowledge of multiple domain-specific teacher models into a single tiny, efficient student model. Our experiments use a large parallel dataset of 18 million sentence pairs, consisting of 10 corpora divided into 6 domain groups based on source similarity, and incorporate forward-translated monolingual data. Results show that tiny student models can cope with multiple domains even in the case of large corpora, with different approaches benefiting frequent and low-resource domains.
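
As a rough illustration of the forward-translation step behind this kind of multi-domain distillation, below is a minimal sketch in Python, assuming Hugging Face `transformers` and hypothetical per-domain teacher checkpoints; the paper's actual teacher models, domain groups, and student training setup are not reproduced here.

```python
# Minimal sketch: sequence-level knowledge distillation for multi-domain NMT.
# Each domain-specific teacher forward-translates its domain's Estonian source
# side; the pooled (source, teacher hypothesis) pairs then serve as training
# data for one small student model. All checkpoint paths are hypothetical.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical mapping from domain group to a fine-tuned teacher checkpoint.
TEACHERS = {
    "legal": "checkpoints/teacher-et-en-legal",
    "subtitles": "checkpoints/teacher-et-en-subtitles",
    # ... one teacher per domain group
}

def distill_domain(domain: str, src_sentences: list[str], batch_size: int = 32):
    """Forward-translate one domain's source sentences with its teacher."""
    tok = AutoTokenizer.from_pretrained(TEACHERS[domain])
    teacher = AutoModelForSeq2SeqLM.from_pretrained(TEACHERS[domain])
    pairs = []
    for i in range(0, len(src_sentences), batch_size):
        batch = src_sentences[i:i + batch_size]
        enc = tok(batch, return_tensors="pt", padding=True, truncation=True)
        # Beam search approximates the teacher's mode, as in sequence-level KD.
        out = teacher.generate(**enc, num_beams=5, max_length=256)
        pairs.extend(zip(batch, tok.batch_decode(out, skip_special_tokens=True)))
    return pairs

# Pooling the per-domain distilled pairs yields the student's training set,
# e.g. (domain_sources is a hypothetical dict of domain -> source sentences):
# student_data = [p for d, sents in domain_sources.items()
#                 for p in distill_domain(d, sents)]
```

The same recipe covers the forward-translated monolingual data mentioned in the abstract: source-side sentences without references are translated by the matching teacher and added to the student's training pool.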
Anthology ID: 2023.nodalida-1.78
Volume: Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Month: May
Year: 2023
Address: Tórshavn, Faroe Islands
Editors: Tanel Alumäe, Mark Fishel
Venue: NoDaLiDa
Publisher: University of Tartu Library
Pages: 772–781
URL: https://aclanthology.org/2023.nodalida-1.78
Cite (ACL): Elizaveta Korotkova and Mark Fishel. 2023. Distilling Estonian Text Domains for Production-Oriented Machine Translation. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 772–781, Tórshavn, Faroe Islands. University of Tartu Library.
Cite (Informal): Distilling Estonian Text Domains for Production-Oriented Machine Translation (Korotkova & Fishel, NoDaLiDa 2023)
PDF: https://aclanthology.org/2023.nodalida-1.78.pdf