Bistra Andreeva


2014

pdf bib
A Cross-language Corpus for Studying the Phonetics and Phonology of Prominence
Bistra Andreeva | William Barry | Jacques Koreman
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The present article describes a corpus which was collected for the cross-language comparison of prominence. In the data analysis, the acoustic-phonetic properties of words spoken with two different levels of accentuation (de-accented and nuclear accented in non-contrastive narrow-focus) are examined in question-answer elicited sentences and iterative imitations (on the syllable ‘da’) produced by Bulgarian, Russian, French, German and Norwegian speakers (3 male and 3 female per language). Normalized parameter values allow a comparison of the properties employed in differentiating the two levels of accentuation. Across the five languages there are systematic differences in the degree to which duration, f0, intensity and spectral vowel definition change with changing prominence under different focus conditions. The link with phonological differences between the languages is discussed.

pdf bib
Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process
Camille Fauth | Anne Bonneau | Frank Zimmerer | Juergen Trouvain | Bistra Andreeva | Vincent Colotte | Dominique Fohr | Denis Jouvet | Jeanin Jügler | Yves Laprie | Odile Mella | Bernd Möbius
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present the design of a corpus of native and non-native speech for the language pair French-German, with a special emphasis on phonetic and prosodic aspects. To our knowledge there is no suitable corpus, in terms of size and coverage, currently available for the target language pair. To select the target L1-L2 interference phenomena we prepare a small preliminary corpus (corpus1), which is analyzed for coverage and cross-checked jointly by French and German experts. Based on this analysis, target phenomena on the phonetic and phonological level are selected on the basis of the expected degree of deviation from the native performance and the frequency of occurrence. 14 speakers performed both L2 (either French or German) and L1 material (either German or French). This allowed us to test, recordings duration, recordings material, the performance of our automatic aligner software. Then, we built corpus2 taking into account what we learned about corpus1. The aims are the same but we adapted speech material to avoid too long recording sessions. 100 speakers will be recorded. The corpus (corpus1 and corpus2) will be prepared as a searchable database, available for the scientific community after completion of the project.