Universiti Teknologi Malaysia Institutional Repository

Insertion reduction in speech segmentation using neural network

Salam, M-S and Mohamad, Dzulkifli and Salleh, S-H (2008) Insertion reduction in speech segmentation using neural network. In: Proceedings - International Symposium on Information Technology 2008, ITSim. Institute of Electrical and Electronics Engineers, New York, pp. 2057-2063. ISBN 978-142442328-6

Full text not available from this repository.

Official URL: http://dx.doi.org/10.1109/ITSIM.2008.4632062

Abstract

A statistical approach with a non-fixed, overlapping window size can identify discontinuities in a speech signal well without further knowledge of the phonetic sequence. However, it increases the number of insertions and therefore increases confusion in recognition. This paper presents a fusion of statistical and connectionist approaches, namely the divergence algorithm and an MLP neural network, to improve segmentation by reducing insertions. The experiment was conducted on Malay semi-spontaneous connected digits in a classroom environment. The digit strings were manually segmented and used to train the neural network with three sets of data. The first training set contained no silence patterns, the second included silence, and the last introduced both silence and false patterns. The experimental results on digit string segmentation show the number of insertions reduced by more than 5 times compared with using divergence alone, with an increase in accuracy of up to 40%. The drawback, however, is that the number of omissions also increases more than 10 times. Nevertheless, the match segmentation rate remains above 85%.
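The abstract reports its results as insertions, omissions, and a match segmentation rate. The sketch below shows one common way such counts could be obtained by comparing detected boundaries with manually placed reference boundaries. It is an illustration only, not the authors' scoring procedure: the function name `score_boundaries`, the greedy one-to-one matching, and the 20 ms tolerance are all assumptions.

```python
def score_boundaries(reference, detected, tolerance=20):
    """Compare detected segment boundaries with reference boundaries.

    All times are in milliseconds. The tolerance and the greedy
    left-to-right matching are assumptions made for illustration.
    """
    ref = sorted(reference)
    det = sorted(detected)
    matched = 0
    i = j = 0
    while i < len(ref) and j < len(det):
        if abs(ref[i] - det[j]) <= tolerance:
            # Detected boundary falls close enough to a reference boundary.
            matched += 1
            i += 1
            j += 1
        elif det[j] < ref[i]:
            # Detected boundary with no reference nearby: an insertion.
            j += 1
        else:
            # Reference boundary with no detection nearby: an omission.
            i += 1
    return {
        "matched": matched,
        "insertions": len(det) - matched,
        "omissions": len(ref) - matched,
        "match_rate": matched / len(ref) if ref else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical boundaries, not data from the paper.
    print(score_boundaries([100, 500, 900], [110, 300, 510, 700]))
```

Under this scoring, a post-filter that rejects false boundaries lowers the insertion count but can also reject true boundaries, which is the trade-off between insertions and omissions that the abstract describes.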

Item Type: Book Section
Additional Information: ISBN: 978-142442328-6; International Symposium on Information Technology 2008, ITSim; Kuala Lumpur; 26 August 2008 through 29 August 2008
Uncontrolled Keywords: information technology, neural networks, school buildings, speech, classroom environments, connected digits, connectionist approaches, MLP neural networks, overlapping windows, speech segmentations, speech signals, statistical approaches, training sets, image classification
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computer Science and Information System
ID Code: 12589
Deposited By: Liza Porijo
Deposited On: 14 Jun 2011 08:18
Last Modified: 14 Jun 2011 08:18
