N. Ruiz


2012

FBK’s machine translation systems for IWSLT 2012’s TED lectures
N. Ruiz | A. Bisazza | R. Cattoni | M. Federico
Proceedings of the 9th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper reports on FBK’s Machine Translation (MT) submissions to the IWSLT 2012 Evaluation on the TED talk translation tasks. We participated in the English-French and the Arabic-, Dutch-, German-, and Turkish-English translation tasks. Several improvements are reported over our baselines from last year. In addition to using fill-up combinations of phrase tables for domain adaptation, we explore the use of corpus filtering based on cross-entropy to produce concise and accurate translation and language models. We describe challenges encountered in under-resourced languages (Turkish) and language-specific preprocessing needs.

2011

FBK@IWSLT 2011
N. Ruiz | A. Bisazza | F. Brugnara | D. Falavigna | D. Giuliani | S. Jaber | R. Gretter | M. Federico
Proceedings of the 8th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper reports on the participation of FBK at the IWSLT 2011 Evaluation, namely in the English ASR track, the Arabic-English MT track, and the English-French MT and SLT tracks. Our ASR system features acoustic models trained on a portion of the TED talk recordings that was automatically selected according to the fidelity of the provided transcriptions. Three decoding steps are performed, interleaved with acoustic feature normalization and acoustic model adaptation. Concerning the MT and SLT systems, besides language-specific pre-processing and the automatic introduction of punctuation in the ASR output, two major improvements are reported over our baselines from last year. First, we applied a fill-up method for phrase-table adaptation; second, we explored the use of hybrid class-based language models to better capture the language style of public speeches.