Sudirman, Rubita and Salleh, Sh-Hussain and Salleh, Shaharuddin (2006) Local DTW Coefficients and Pitch Feature for Back-Propagation NN Digits Recognition. In: IASTED International Conference on Networks and Communication System, 29-31 March 2006, Chiang Mai, Thailand.
Official URL: http://www.actapress.com/PaperInfo.aspx?PaperID=23...
This paper presents a method for extracting speech features along the dynamic time warping (DTW) path, where the features were originally derived from linear predictive coding (LPC). The extracted feature coefficients serve as the input to a back-propagation neural network. The coefficients are normalized with respect to the reference pattern according to the average number of frames over the recorded samples. Normalization is needed because a neural network (NN) requires a fixed number of input nodes for every input class. The new feature processing uses the well-known frame matching technique, DTW, to fix the input size to a fixed number of input vectors. The LPC feature vectors of the source frames are aligned to the template using our DTW frame fixing (DTW-FF) algorithm. With frame fixing, the source and template frames are adjusted so that they have the same number of frames. Speech recognition is performed using the back-propagation neural network (BPNN) algorithm to enhance recognition performance. The results compare DTW using LPC coefficients with BPNN using DTW-FF coefficients. A pitch feature is then added to investigate the improvement over the previous experiment using different numbers of hidden neurons.
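The idea of fixing the frame count through a DTW alignment can be illustrated with a minimal Python sketch. It is not the authors' DTW-FF algorithm, whose details are not given in this abstract. It assumes a standard DTW with Euclidean frame distance and the three usual step directions, and it assumes that source frames aligned to the same template frame are averaged. The names `frame_distance`, `dtw_path`, and `frame_fix` are hypothetical.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_path(source, template):
    """Return (total cost, warping path) for two sequences of feature vectors.

    The path is a list of (source_index, template_index) pairs.
    """
    n, m = len(source), len(template)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning source[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(source[i - 1], template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end of both sequences to recover the path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal match
            (cost[i - 1][j], i - 1, j),          # several source frames, one template frame
            (cost[i][j - 1], i, j - 1),          # one source frame, several template frames
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


def frame_fix(source, template):
    """Resample `source` to the template's frame count along the DTW path.

    Assumption for illustration: source frames mapped to the same template
    frame are averaged; a source frame mapped to several template frames
    is reused for each of them.
    """
    _, path = dtw_path(source, template)
    dim = len(template[0])
    sums = [[0.0] * dim for _ in template]
    counts = [0] * len(template)
    for i, j in path:
        counts[j] += 1
        for k in range(dim):
            sums[j][k] += source[i][k]
    return [[s / c for s in row] for row, c in zip(sums, counts)]


# Toy one-dimensional "feature vectors"; real input would be LPC vectors.
source = [[0.0], [0.0], [1.0], [2.0], [2.0]]
template = [[0.0], [1.0], [2.0]]
fixed = frame_fix(source, template)
print(fixed)  # [[0.0], [1.0], [2.0]]
```

Whatever the length of the source utterance, the result has as many frames as the template, so it can be flattened into an input vector of constant size for a network with a fixed number of input nodes.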
|Item Type:||Conference or Workshop Item (Paper)|
|Uncontrolled Keywords:||dynamic time warping, normalization, linear predictive coding, pitch feature, back-propagation neural network|
|Subjects:||T Technology > TK Electrical engineering. Electronics Nuclear engineering|
|Deposited By:||Dr Zaharuddin Mohamed|
|Deposited On:||20 Mar 2007 07:21|
|Last Modified:||01 Jun 2010 03:00|