Universiti Teknologi Malaysia Institutional Repository

Parallelization strategy of speaker identification system for Hybrid Modeling

Ahmad, Abdul Manan and Loh, Mun Yee (2008) Parallelization strategy of speaker identification system for Hybrid Modeling. In: Advance in Speaker Recognition Techniques and Technology. Penerbit UTM, Johor, 27-38. ISBN 978-983-52-0629-0



Over the last decade, technological advances have given speaker recognition a significant role in forensic science and biometric identification. Speaker recognition is the process of recognizing a person on the basis of his or her voice signal (R. C Campbel, 1997). It can be divided into speaker verification and speaker identification, each of which can be further divided into text-dependent and text-independent systems. To date, the technology has yet to deliver speaker recognition systems for many applications, including access control, security control for confidential information, transaction authentication and telephone banking. Pattern classification is a crucial part of the speaker modeling component chain, because its result strongly affects the recognition engine's decision to accept or reject a speaker. Much research has been devoted to pattern classification for speaker recognition, with techniques including Dynamic Time Warping (DTW), Vector Quantization (VQ), Hidden Markov Models (HMM), Gaussian Mixture Models (GMM) and Support Vector Machines (SVM) (Sadaoki Furui, 1997). Building robust speaker recognition systems is often difficult because the speech signal is dynamic and influenced by many sources of variation. The past two decades have seen significant progress in coping with this problem using different techniques. Among these, hybrids of two pattern classification techniques have reported promising improvements in accuracy. Despite these improvements, hybrid techniques are still somewhat restricted in recognition accuracy on large data sets. Since previous work has reported many successful examples of combining two classification techniques, this research aims to develop a new hybrid technique that addresses the accuracy problem when the data set grows incrementally.
VQ and GMM are widely applied to speaker identification, but both have disadvantages. To overcome these shortcomings, we put forward in this chapter a new hybrid VQ/GMM model to improve the recognition rate of the speaker identification system. Although the baseline VQ-based solution is less accurate than the GMM, it is computationally simpler. In addition, our experiments indicate that both VQ and GMM are suitable for the speaker-independent task. We therefore hope to exploit the merits of both through a hybrid VQ/GMM classifier. GMMs have been combined with other pattern classification techniques in many ways in the past. Most existing hybrid VQ/GMM approaches use VQ as an optimization step that reduces the work of the Expectation Maximization (EM) algorithm in order to improve training speed (Reynolds and Rose, 1995; J. Pelecano, 2000). Other researchers use the GMM as a post-processor after VQ has clustered the speech signal into regions (Qiguang Lin et al, 1996). In our proposed hybrid modeling, the VQ model and the GMM model run in parallel after signal preprocessing. The performance of the hybrid VQ/GMM, DTW, VQ, GMM and SVM techniques for speaker recognition is compared experimentally, and the results are reported in this chapter. The chapter is organized as follows. Section 2 reviews the proposed speaker recognition framework. Section 3 discusses how we construct the hybrid model for pattern classification. Section 4 presents the experimental results of the performance comparison. Finally, Section 5 concludes our work.
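As a rough illustration of the parallel arrangement described above, the following Python sketch scores a preprocessed utterance against a VQ codebook and a diagonal-covariance GMM for each speaker at the same time, then fuses the two scores. It is a sketch under stated assumptions, not the chapter's implementation: the abstract does not say how the two outputs are combined, so the min-max normalization and equal-weight sum used here are assumptions, as are all function names, the use of threads, and the toy models and feature vectors.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def vq_distortion(frames, codebook):
    """Average squared distance from each frame to its nearest codeword (lower is better)."""
    total = 0.0
    for x in frames:
        total += min(sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook)
    return total / len(frames)


def gmm_avg_loglik(frames, gmm):
    """Average per-frame log-likelihood under a diagonal-covariance GMM (higher is better).

    `gmm` is a list of (weight, mean, variance) tuples, one per mixture component.
    """
    total = 0.0
    for x in frames:
        comps = []
        for w, mu, var in gmm:
            ll = math.log(w)
            for xi, m, v in zip(x, mu, var):
                ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            comps.append(ll)
        top = max(comps)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total / len(frames)


def minmax(scores):
    """Rescale per-speaker scores to [0, 1] so the two classifiers are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {k: (v - lo) / span for k, v in scores.items()}


def identify(frames, vq_models, gmm_models):
    """Run VQ and GMM scoring side by side, then fuse (fusion rule is an assumption)."""
    speakers = list(vq_models)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Negate the distortion so that, for both classifiers, higher means better.
        vq_job = pool.submit(
            lambda: {s: -vq_distortion(frames, vq_models[s]) for s in speakers})
        gmm_job = pool.submit(
            lambda: {s: gmm_avg_loglik(frames, gmm_models[s]) for s in speakers})
        vq_scores, gmm_scores = minmax(vq_job.result()), minmax(gmm_job.result())
    fused = {s: vq_scores[s] + gmm_scores[s] for s in speakers}
    return max(fused, key=fused.get), fused


# Toy, hand-made models and 2-D "feature vectors" (hypothetical data for illustration).
vq_models = {"alice": [[0, 0], [1, 1]], "bob": [[5, 5], [6, 6]]}
gmm_models = {
    "alice": [(0.5, [0, 0], [1, 1]), (0.5, [1, 1], [1, 1])],
    "bob": [(0.5, [5, 5], [1, 1]), (0.5, [6, 6], [1, 1])],
}
frames = [[0.1, -0.1], [0.9, 1.2], [0.4, 0.5]]
speaker, fused = identify(frames, vq_models, gmm_models)
```

In CPython, threads mainly illustrate the structure, since pure-Python arithmetic does not run truly concurrently; a process pool or vectorized numerical code would be the natural substitute where real speed-up is the goal.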

Item Type: Book Section
Uncontrolled Keywords: speaker recognition, biometric identifications, hybrid modeling
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computer Science and Information System (Formerly known)
ID Code: 24970
Deposited By: Ms Zalinda Shuratman
Deposited On: 23 Apr 2012 08:11
Last Modified: 23 Apr 2012 08:11
