Improving Long Context Document-Level Machine Translation

Christian Herold, Hermann Ney


Abstract
Document-level context for neural machine translation (NMT) is crucial for improving translation consistency and cohesion, the translation of ambiguous inputs, and several other linguistic phenomena. Many works have been published on document-level NMT, but most restrict the system to local context only, typically including just the one or two preceding sentences as additional information. This may be enough to resolve some ambiguous inputs, but it is probably not sufficient to capture document-level information such as the topic or style of a conversation. When the context size is increased beyond the local context, two challenges arise: (i) the memory usage grows quadratically, and (ii) the translation performance starts to degrade. We argue that the widely used attention mechanism is responsible for both issues. Therefore, we propose a constrained attention variant that focuses the attention on the most relevant parts of the sequence, while simultaneously reducing the memory consumption. For evaluation, we utilize targeted test sets in combination with novel evaluation techniques to analyze the translations with regard to specific discourse-related phenomena. We find that our approach is a good compromise between sentence-level NMT and attending to the full context, especially in low-resource scenarios.
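The abstract does not specify the exact form of the constrained attention variant. As a generic illustration only (a sliding-window mask is an assumption here, not necessarily the paper's method), restricting each position to a local window of keys can be sketched as:

```python
import numpy as np

def windowed_attention(q, k, v, window=4):
    """Scaled dot-product attention where each query position i may only
    attend to key positions in [i - window, i].

    Illustrative sketch of constrained attention; the paper's actual
    variant may select relevant context differently. Dense (n, n) scores
    are built here for clarity; a real implementation would materialize
    only the band to save memory.
    """
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)  # (n, n) attention logits

    # Band mask: allow j <= i (causal) and i - j <= window (locality).
    idx = np.arange(n)
    mask = (idx[None, :] <= idx[:, None]) & (idx[:, None] - idx[None, :] <= window)
    scores = np.where(mask, scores, -np.inf)

    # Row-wise softmax over the permitted positions only.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v
```

With a fixed window, the number of attended positions per query is constant, so memory for the attention weights scales linearly in sequence length instead of quadratically, which is the trade-off the abstract alludes to.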
Anthology ID:
2023.codi-1.15
Volume:
Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Michael Strube, Chloe Braud, Christian Hardmeier, Junyi Jessy Li, Sharid Loaiciga, Amir Zeldes
Venue:
CODI
Publisher:
Association for Computational Linguistics
Pages:
112–125
URL:
https://aclanthology.org/2023.codi-1.15
DOI:
10.18653/v1/2023.codi-1.15
Bibkey:
Cite (ACL):
Christian Herold and Hermann Ney. 2023. Improving Long Context Document-Level Machine Translation. In Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023), pages 112–125, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Improving Long Context Document-Level Machine Translation (Herold & Ney, CODI 2023)
PDF:
https://aclanthology.org/2023.codi-1.15.pdf
Video:
https://aclanthology.org/2023.codi-1.15.mp4