Universiti Teknologi Malaysia Institutional Repository

Gated recurrent unit for low power wake-word detection

Chin, Jian Qee (2021) Gated recurrent unit for low power wake-word detection. Masters thesis, Universiti Teknologi Malaysia.

Official URL: http://dms.library.utm.my:8080/vital/access/manage...

Abstract

Neural networks have made possible some of the latest state-of-the-art technologies, such as speech recognition, language translation and stock prediction. Among these, speech recognition is a very popular and rapidly growing application. It is widely used in products such as mobile phones and Amazon smart speakers to enhance the user experience. However, neural networks used for speech recognition require a large amount of computation, especially when they run in an always-on state. This makes them infeasible to implement in battery-powered edge devices such as wearables, sensors, and internet-of-things devices, as the battery will not last long enough to provide a good user experience. To address this issue, this work enhances a recurrent neural network (RNN), specifically the Gated Recurrent Unit (GRU), for the task of wake-word detection. A wake-word detector is always powered on, listening for a specific phrase, the wake-word. Therefore, its power consumption must be low enough to enable long battery life, a feature sought by many end consumers. This work proposes four modifications to the existing GRU architecture. First, the reset gate is removed, as prior research implies that it is not needed in applications such as speech recognition. Second, the activation function is changed from the conventional sigmoid/hyperbolic tangent functions to the softsign function. Third, weight quantization is carried out to reduce the memory footprint and speed up calculations. Fourth, fixed-point arithmetic is used instead of floating-point arithmetic. With these architectural enhancements, memory and power consumption are reduced while the impact on accuracy is kept minimal. Furthermore, it becomes possible to embed this new neural network model in battery-powered edge devices such as wearables.
In summary, this work explores the possibility of implementing an improved GRU architecture in battery-powered edge devices to enable low-power speech recognition.
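
The four modifications named in the abstract can be illustrated with a short sketch. This is not the thesis implementation: the function names, the parameter layout, the rescaling of softsign to (0, 1) for the update gate, the symmetric 8-bit quantization scheme, and the choice of 8 fractional bits are all assumptions made here for illustration only.

```python
# Illustrative sketch only. All names, the gate rescaling, the int8 scheme,
# and the fixed-point format are assumptions, not taken from the thesis.

def softsign(x):
    """softsign(x) = x / (1 + |x|); output lies in (-1, 1), no exponential."""
    return x / (1.0 + abs(x))

def gate(x):
    """Softsign rescaled to (0, 1), assumed here as a stand-in for sigmoid."""
    return 0.5 * (softsign(x) + 1.0)

def matvec(W, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def gru_step_no_reset(x, h, p):
    """One time step of a GRU with the reset gate removed.

    z  = gate(Wz x + Uz h + bz)       update gate
    c  = softsign(Wh x + Uh h + bh)   candidate state (h is not reset-gated)
    h' = z * h + (1 - z) * c
    """
    zx, zh = matvec(p["Wz"], x), matvec(p["Uz"], h)
    cx, ch = matvec(p["Wh"], x), matvec(p["Uh"], h)
    new_h = []
    for i in range(len(h)):
        z = gate(zx[i] + zh[i] + p["bz"][i])
        c = softsign(cx[i] + ch[i] + p["bh"][i])
        new_h.append(z * h[i] + (1.0 - z) * c)
    return new_h

def quantize_int8(W):
    """Symmetric per-matrix quantization of weights to the range [-127, 127]."""
    max_abs = max(abs(w) for row in W for w in row) or 1.0
    scale = max_abs / 127.0
    q = [[int(round(w / scale)) for w in row] for row in W]
    return q, scale  # approximate weight = q * scale

def to_fixed(x, frac_bits=8):
    """Real number -> signed fixed-point integer with frac_bits fractional bits."""
    return int(round(x * (1 << frac_bits)))

def fixed_softsign(x_fx, frac_bits=8):
    """Softsign on a fixed-point value using only integer operations.

    Python's // floors toward negative infinity, so negative results
    may differ by one least-significant bit from truncating division.
    """
    one = 1 << frac_bits
    return (x_fx * one) // (one + abs(x_fx))
```

In this sketch, softsign needs only an absolute value, an addition and a division, so it maps directly onto integer arithmetic, whereas sigmoid and tanh require an exponential or a lookup table. For example, `fixed_softsign(to_fixed(1.0))` evaluates softsign(1.0) entirely in integers.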

Item Type: Thesis (Masters)
Uncontrolled Keywords: Amazon smart speakers, recurrent neural network (RNN), architecture
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering
Divisions: Electrical Engineering
ID Code: 96438
Deposited By: Narimah Nawil
Deposited On: 24 Jul 2022 09:57
Last Modified: 24 Jul 2022 09:57
