On the Copying Problem of Unsupervised NMT: A Training Schedule with a Language Discriminator Loss

Yihong Liu, Alexandra Chronopoulou, Hinrich Schütze, Alexander Fraser


Abstract
Although unsupervised neural machine translation (UNMT) has achieved success in many language pairs, the copying problem, i.e., directly copying some parts of the input sentence as the translation, is common among distant language pairs, especially when low-resource languages are involved. We find that this issue is closely related to an unexpected copying behavior during online back-translation (BT). In this work, we propose a simple but effective training schedule that incorporates a language discriminator loss. The loss imposes constraints on the intermediate translation so that the translation is in the desired language. By conducting extensive experiments on different language pairs, including similar and distant, high- and low-resource languages, we find that our method alleviates the copying problem, thus improving the translation performance on low-resource languages.
Anthology ID:
2023.iwslt-1.48
Volume:
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada (in-person and online)
Editors:
Elizabeth Salesky, Marcello Federico, Marine Carpuat
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Pages:
491–502
URL:
https://aclanthology.org/2023.iwslt-1.48
DOI:
10.18653/v1/2023.iwslt-1.48
Cite (ACL):
Yihong Liu, Alexandra Chronopoulou, Hinrich Schütze, and Alexander Fraser. 2023. On the Copying Problem of Unsupervised NMT: A Training Schedule with a Language Discriminator Loss. In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), pages 491–502, Toronto, Canada (in-person and online). Association for Computational Linguistics.
Cite (Informal):
On the Copying Problem of Unsupervised NMT: A Training Schedule with a Language Discriminator Loss (Liu et al., IWSLT 2023)
PDF:
https://aclanthology.org/2023.iwslt-1.48.pdf