DTWFF-pitch feature and faster neural network convergence for speech recognition

Sudirman, Rubita and Salleh, Sh. Hussain and Salleh, Shaharuddin (2007) DTWFF-pitch feature and faster neural network convergence for speech recognition. Elektrika, 9 (1). pp. 9-13. ISSN 0128-4428


Abstract

This paper presents the pre-processing of speech templates for an artificial neural network (ANN). The processed features are pitch and Linear Predictive Coefficients (LPC) for the input and reference templates, based on the Dynamic Time Warping (DTW) algorithm. The first task is to extract pitch features using the Pitch Scale Harmonic Filter algorithm. The second task is to align the input frames (test set) to the reference template (training set) using the DTW frame fixing (DTW-FF) algorithm. This time normalization step is needed because the data have unequal lengths; it adjusts the test set and the training set to the same number of frames. With both the pitch and LPC features fixed to the same number of frames, speech recognition using a neural network can be performed. Combining the DTW-FF and pitch features gives a high recognition rate for the Malay digit words 0-9, reaching as high as 100%. A further task in this paper is to search for the global minimum of the NN error surface using the conjugate gradient algorithm in place of steepest gradient descent in the back-propagation algorithm. The results show that the conjugate gradient algorithm is able to find a better minimum.
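
The time normalization step described in the abstract can be illustrated with a minimal sketch. This is not the DTW-FF algorithm from the paper, whose details are not given here; it is a generic DTW alignment followed by one assumed way of fixing the frame count, namely averaging all test frames that the warping path maps to the same reference frame. The function names `dtw_path` and `fix_frames` are hypothetical, and the feature vectors stand in for LPC or pitch features.

```python
import numpy as np


def dtw_path(test, ref):
    """Return a minimum-cost DTW warping path as (test_idx, ref_idx) pairs."""
    n, m = len(test), len(ref)
    # Euclidean distance between every test frame and every reference frame.
    dist = np.linalg.norm(test[:, None, :] - ref[None, :, :], axis=2)
    # Accumulated cost with a padded border of infinities.
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]
            )
    # Backtrack from the last cell to the first along cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda c: cost[c])
        path.append((i - 1, j - 1))
    return path[::-1]


def fix_frames(test, ref):
    """Warp `test` so that it has the same number of frames as `ref`.

    Assumption for illustration: test frames aligned to the same
    reference frame are averaged into a single frame.
    """
    fixed = np.zeros((len(ref), test.shape[1]))
    counts = np.zeros(len(ref))
    for i, j in dtw_path(test, ref):
        fixed[j] += test[i]
        counts[j] += 1
    return fixed / counts[:, None]


# Example: a 7-frame test utterance normalized to a 5-frame reference.
rng = np.random.default_rng(0)
ref = rng.normal(size=(5, 3))
test = rng.normal(size=(7, 3))
fixed = fix_frames(test, ref)
```

After this step the warped test utterance and the reference have equal length, so both can be fed to a network with a fixed input size.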

Item Type:	Article
Uncontrolled Keywords:	artificial neural network, dynamic time warping, frame fixing, pitch, conjugate gradient method
Subjects:	T Technology > TK Electrical engineering. Electronics. Nuclear engineering
Divisions:	Electrical Engineering
ID Code:	8061
Deposited By:	Norshiela Buyamin
Deposited On:	25 Mar 2009 06:39
Last Modified:	28 Nov 2013 03:19
