Universiti Teknologi Malaysia Institutional Repository

Isolated English alphabet speech recognition using wavelet cepstral coefficients and neural network

Adam, Tarmizi (2014) Isolated English alphabet speech recognition using wavelet cepstral coefficients and neural network. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computing.


Official URL: http://dms.library.utm.my:8080/vital/access/manage...

Abstract

Speech recognition has many applications in various fields. One of the most important phases in speech recognition is feature extraction, in which relevant information is extracted from the speech signal. However, two important issues that affect feature extraction are noise robustness and high feature dimension. Existing feature extraction methods that use fixed-window processing and spectral analysis, such as Mel-Frequency Cepstral Coefficients (MFCC), do not adequately address the noise robustness and high feature dimension problems. This research proposes using the Discrete Wavelet Transform (DWT) in place of the Discrete Fourier Transform (DFT) when calculating the cepstrum coefficients, producing a newly proposed feature, the Wavelet Cepstral Coefficient (WCC). The DWT is used in order to gain the advantages of the wavelet in analyzing non-stationary signals. The WCC is computed frame by frame. Each speech frame is decomposed using the DWT and the log energy of its coefficients is taken. The final stage of the WCC computation takes the Discrete Cosine Transform (DCT) of these log energies to form the WCC. The WCC are then fed into a Neural Network (NN) for classification. To test the proposed WCC, a series of experiments was conducted on the TI-ALPHA dataset to compare its performance with that of the MFCC. The experiments were conducted under several noise levels, using Additive White Gaussian Noise (AWGN), and with varying numbers of coefficients, for speaker-dependent and speaker-independent tasks. The results show that the WCC withstands noisy conditions better than the MFCC, especially with a small number of features, for both speaker-dependent and speaker-independent tasks. The best result under a noisy condition of 25 dB shows that 30 WCC coefficients using Daubechies 12 achieved a 71.79% recognition rate, compared with only 37.62% using MFCC under the same constraint.
The main contribution of this research is the development of the WCC features, which perform better than the MFCC on noisy signals and with a reduced number of feature coefficients.
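
The frame-by-frame WCC computation described in the abstract (DWT of each frame, log energy of the coefficients, then DCT) can be illustrated with a minimal sketch. This is not the thesis implementation: it substitutes the Haar wavelet for Daubechies 12 to stay dependency-free, assumes one log energy per wavelet subband, and uses illustrative values for the frame length, hop size, and decomposition depth. All function names are hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT. Returns (approximation, detail).

    Assumption: Haar stands in for the Daubechies 12 wavelet named in the abstract.
    """
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd lengths by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavedec(frame, levels):
    """Multi-level decomposition, ordered [cA_n, cD_n, ..., cD_1]."""
    details = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return [approx] + details[::-1]


def dct2(values):
    """Unnormalized DCT-II of a short sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, v in enumerate(values))
        for k in range(n)
    ]


def wcc_frame(frame, levels=3, eps=1e-12):
    """WCC of one frame: DWT -> log energy per subband -> DCT.

    Assumption: one log energy per subband; eps avoids log(0) on silent frames.
    """
    subbands = wavedec(frame, levels)
    log_energies = [math.log(sum(c * c for c in band) + eps) for band in subbands]
    return dct2(log_energies)


def split_frames(signal, frame_len, hop):
    """Fixed-length, overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def wcc(signal, frame_len=256, hop=128, levels=3):
    """WCC feature vectors for a whole signal, one vector per frame."""
    return [wcc_frame(f, levels) for f in split_frames(signal, frame_len, hop)]


if __name__ == "__main__":
    # Synthetic test tone standing in for a speech recording.
    tone = [math.sin(2 * math.pi * 5 * n / 1024) for n in range(1024)]
    features = wcc(tone)
    print(len(features), "frames,", len(features[0]), "coefficients per frame")
```

Under these assumptions each frame yields `levels + 1` coefficients, one per subband. Producing a feature vector of a different size, such as the 30 coefficients reported above, would require a different decomposition or energy grouping than this sketch uses.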

Item Type: Thesis (Masters)
Additional Information: Thesis (Sarjana (Sains Komputer)) - Universiti Teknologi Malaysia, 2014; Supervisors: Dr. Md. Sah Salam, Dr. Teddy Surya Gunawan
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 78047
Deposited By: Fazli Masari
Deposited On: 23 Jul 2018 05:33
Last Modified: 23 Jul 2018 05:33
