Compact Residual Learning with Frequency-Based Non-Square Kernels for Small Footprint Keyword Spotting

Muhammad Abulaish; Rahul Gulia


Abstract

Enabling voice assistants on small embedded devices requires a keyword spotter that combines a small model size with adequate accuracy, but achieving a reasonable trade-off between a small footprint and high accuracy is difficult. Recent studies have demonstrated that convolutional neural networks are also effective in the audio domain. In this paper, taking into account the nature of spectrograms, we propose a compact ResNet architecture that uses frequency-based non-square kernels to extract as many timbral features as possible for keyword spotting. The proposed architecture is approximately three-and-a-half times smaller than a comparable architecture with conventional square kernels. On Google's Speech Commands dataset v1, it outperforms both Google's convolutional neural networks and the equivalent ResNet architecture with square kernels. Using non-square kernels for spectrogram data thus yields a significant increase in accuracy with relatively few parameters compared with conventional square kernels, which are commonly the default choice.
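To make the size argument concrete, here is a minimal sketch of how kernel shape affects the weight count of a single convolution layer. The channel count and the 3×1 kernel shape are hypothetical values chosen for illustration; they are not taken from the paper, and the helper `conv2d_params` is an illustrative name, not part of any library. Which kernel axis corresponds to frequency is likewise an assumption here.

```python
def conv2d_params(in_ch, out_ch, kernel_hw, bias=False):
    """Count the learnable weights in one 2-D convolution layer."""
    kh, kw = kernel_hw
    n = in_ch * out_ch * kh * kw
    return n + out_ch if bias else n


# Hypothetical sizes for illustration only; not the paper's configuration.
CHANNELS = 45

# Conventional square kernel.
square = conv2d_params(CHANNELS, CHANNELS, (3, 3))

# Non-square kernel: 3 bins along the first axis (assumed to be frequency),
# 1 step along the second axis (assumed to be time).
non_square = conv2d_params(CHANNELS, CHANNELS, (3, 1))

ratio = square / non_square
print(f"square: {square}, non-square: {non_square}, ratio: {ratio:.1f}x")
```

For these assumed shapes the per-layer reduction is simply the ratio of kernel areas. The overall reduction reported in the abstract depends on the full architecture, which this sketch does not reproduce.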

Anthology ID: 2022.icon-main.39
Volume: Proceedings of the 19th International Conference on Natural Language Processing (ICON)
Month: December
Year: 2022
Address: New Delhi, India
Editors: Md. Shad Akhtar, Tanmoy Chakraborty
Venue: ICON
Publisher: Association for Computational Linguistics
Pages: 328–336
URL: https://aclanthology.org/2022.icon-main.39
Cite (ACL): Muhammad Abulaish and Rahul Gulia. 2022. Compact Residual Learning with Frequency-Based Non-Square Kernels for Small Footprint Keyword Spotting. In Proceedings of the 19th International Conference on Natural Language Processing (ICON), pages 328–336, New Delhi, India. Association for Computational Linguistics.
Cite (Informal): Compact Residual Learning with Frequency-Based Non-Square Kernels for Small Footprint Keyword Spotting (Abulaish & Gulia, ICON 2022)
PDF: https://aclanthology.org/2022.icon-main.39.pdf
