CHATR the Corpus; a 20-year-old archive of Concatenative Speech Synthesis

Nick Campbell


Abstract
This paper reports the preservation of an old speech synthesis website as a corpus. CHATR was a revolutionary technique for concatenative speech synthesis developed in the mid-nineties. The method has since become the standard for high-quality speech output by computer, although much current research is devoted to parametric or hybrid methods that employ smaller amounts of data and can be more easily tuned to individual voices. The system was first reported in 1994 and the website was functional in 1996. The ATR labs where this system was invented no longer exist, but the website has been preserved as a corpus containing 1537 samples of synthesised speech from that period (118 MB in aiff format) in 211 pages under various finely interrelated themes. The corpus can be accessed from www.speech-data.jp as well as www.tcd-fastnet.com, where the original code and samples are now being maintained.
Anthology ID:
L16-1548
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3436–3439
URL:
https://aclanthology.org/L16-1548
Cite (ACL):
Nick Campbell. 2016. CHATR the Corpus; a 20-year-old archive of Concatenative Speech Synthesis. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 3436–3439, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
CHATR the Corpus; a 20-year-old archive of Concatenative Speech Synthesis (Campbell, LREC 2016)
PDF:
https://aclanthology.org/L16-1548.pdf