Exploring the impact of noise in low-resource ASR for Tamil

Vigneshwar Lakshminarayanan, Emily Prud’hommeaux


Abstract
The use of deep learning algorithms has resulted in significant progress in automatic speech recognition (ASR). Robust high-accuracy ASR models typically require thousands or tens of thousands of hours of speech data, but even the strongest models tend to fail under noisy conditions. Unsurprisingly, the impact of noise on accuracy is more drastic in low-resource settings. In this paper, we investigate the impact of noise on ASR in a low-resource setting. We explore novel methods for developing noise-robust ASR models using a small dataset for Tamil, a widely spoken but under-resourced Dravidian language. We add various noises to the audio data to determine the impact of different kinds of noise (e.g., punctuated vs. constant, man-made vs. natural). We also explore whether different data augmentation methods are better suited to handling different types of noise. Our results show that all noises, regardless of type, had an impact on ASR performance, and that upgrading the architecture alone could not mitigate the impact of noise. SpecAugment, the most common data augmentation method, was not as helpful as raw data augmentation, in which noise is explicitly added to the training data. Raw data augmentation enhances ASR performance on both clean data and noise-mixed data.
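As an illustration of what raw data augmentation can look like in practice, the sketch below mixes a noise clip into a speech signal at a chosen signal-to-noise ratio (SNR). The abstract does not say how the noise was mixed or at what levels, so the function names (`mean_power`, `mix_at_snr`), the 10 dB target, and the synthetic signals are all assumptions made for this example, not the authors' pipeline. Plain Python lists are used to keep the sketch dependency-free; a real pipeline would more likely operate on arrays loaded from audio files.

```python
import math
import random


def mean_power(samples):
    """Average power (mean of squared amplitudes) of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mix has the requested SNR in dB.

    Hypothetical helper for illustration only. The noise clip is looped
    if it is shorter than the speech and cut off if it is longer.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must both be non-empty")

    # Loop or truncate the noise so it covers the speech exactly.
    looped = [noise[i % len(noise)] for i in range(len(speech))]

    p_speech = mean_power(speech)
    p_noise = mean_power(looped)
    if p_speech == 0 or p_noise == 0:
        # A silent input leaves nothing to scale against; return a copy.
        return list(speech)

    # SNR_dB = 10 * log10(P_speech / (scale^2 * P_noise)), solved for scale.
    scale = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, looped)]


# Synthetic stand-ins: a 220 Hz tone as "speech" and Gaussian noise (assumed data).
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]

noisy = mix_at_snr(speech, noise, snr_db=10)
```

Applying the same function with different noise clips (for example, a constant hum versus an intermittent sound) and different SNR values would produce several noisy copies of each training utterance, which is the general idea behind augmenting with explicitly added noise.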
Anthology ID:
2024.dravidianlangtech-1.5
Volume:
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Month:
March
Year:
2024
Address:
St. Julian's, Malta
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Elizabeth Sherly, Rajeswari Nadarajan, Manikandan Ravikiran
Venues:
DravidianLangTech | WS
Publisher:
Association for Computational Linguistics
Pages:
30–34
URL:
https://aclanthology.org/2024.dravidianlangtech-1.5
Cite (ACL):
Vigneshwar Lakshminarayanan and Emily Prud’hommeaux. 2024. Exploring the impact of noise in low-resource ASR for Tamil. In Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 30–34, St. Julian's, Malta. Association for Computational Linguistics.
Cite (Informal):
Exploring the impact of noise in low-resource ASR for Tamil (Lakshminarayanan & Prud’hommeaux, DravidianLangTech-WS 2024)
PDF:
https://aclanthology.org/2024.dravidianlangtech-1.5.pdf