Universiti Teknologi Malaysia Institutional Repository

Study on gender identification based on audio recordings using Gaussian Mixture Model and Mel Frequency Cepstrum Coefficient technique

Rokanatnam, Thurgeaswary and Mammi, Hazinah Kutty (2021) Study on gender identification based on audio recordings using Gaussian Mixture Model and Mel Frequency Cepstrum Coefficient technique. International Journal of Innovative Computing, 11 (2). pp. 35-41. ISSN 2180-4370

Official URL: http://dx.doi.org/10.11113/ijic.v11n2.343

Abstract

Speaker recognition is the ability to identify a speaker's characteristics from spoken language. The purpose of this study is to identify the gender of speakers based on audio recordings. Its objectives are to evaluate how accurately the technique differentiates gender and to determine its classification performance when self-acquired recordings are used. Audio forensics uses voice recordings as evidence to solve cases, and this study is conducted mainly to provide an easier technique for identifying the characteristics of an unknown speaker in the forensic field. The experiment is carried out by training a pattern classifier on gender-dependent data. To train the model, a speech database comprising both male and female speakers is obtained from an online speech corpus. During the testing phase, audio recordings of UTM students are used in addition to data from the speech corpus to determine the accuracy rate of this speaker identification experiment. The Mel Frequency Cepstrum Coefficient (MFCC) algorithm is used to extract features from the speech data, while a Gaussian Mixture Model (GMM) is used to model the gender identifier. No noise removal was applied to any speech data in this experiment. Python is used both to extract the MFCC features and to model them with the GMM technique. Experimental results show that the GMM-MFCC technique can identify gender regardless of language, but with varying accuracy.
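
The abstract does not reproduce the authors' code, so the following is only a minimal sketch of the general idea it describes: fit one GMM per gender on feature frames, then assign a recording to the gender whose model gives the higher average log-likelihood. Everything specific here is an assumption made for illustration: the synthetic 4-dimensional frames standing in for MFCC vectors, the cluster centres 0.0 and 3.0, the two mixture components, the 20 EM iterations, and the helper names `fit_gmm` and `identify_gender`. The EM routine is a toy written with the standard library only; a real experiment would extract MFCCs from audio and would more likely rely on an established GMM implementation.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of frame x under a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def component_logs(x, gmm):
    """log(weight * density) of one frame for every mixture component."""
    weights, means, variances = gmm
    return [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]


def log_sum_exp(values):
    """Numerically stable log(sum(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_gmm(frames, n_components=2, n_iter=20, seed=0):
    """Fit a diagonal-covariance GMM with plain EM (toy implementation)."""
    rng = random.Random(seed)
    dim = len(frames[0])
    weights = [1.0 / n_components] * n_components
    means = [list(f) for f in rng.sample(frames, n_components)]
    variances = [[1.0] * dim for _ in range(n_components)]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = component_logs(x, (weights, means, variances))
            norm = log_sum_exp(logs)
            resp.append([math.exp(value - norm) for value in logs])
        # M-step: re-estimate weights, means, and variances.
        for j in range(n_components):
            nj = sum(r[j] for r in resp) + 1e-9  # guard against empty components
            weights[j] = nj / len(frames)
            means[j] = [
                sum(r[j] * x[i] for r, x in zip(resp, frames)) / nj
                for i in range(dim)
            ]
            variances[j] = [
                sum(r[j] * (x[i] - means[j][i]) ** 2
                    for r, x in zip(resp, frames)) / nj + 1e-3  # variance floor
                for i in range(dim)
            ]
    return weights, means, variances


def average_log_likelihood(frames, gmm):
    """Mean per-frame log-likelihood of a recording under one GMM."""
    return sum(log_sum_exp(component_logs(x, gmm)) for x in frames) / len(frames)


def identify_gender(frames, models):
    """Return the label whose GMM scores the recording highest."""
    return max(models, key=lambda label: average_log_likelihood(frames, models[label]))


def synthetic_frames(center, n_frames, dim, rng):
    """Hypothetical stand-in for MFCC frames: Gaussian noise around a centre."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n_frames)]


rng = random.Random(42)
DIM = 4  # assumed feature dimension, chosen only to keep the example small
models = {
    "male": fit_gmm(synthetic_frames(0.0, 200, DIM, rng)),
    "female": fit_gmm(synthetic_frames(3.0, 200, DIM, rng)),
}
test_male = synthetic_frames(0.0, 50, DIM, rng)
test_female = synthetic_frames(3.0, 50, DIM, rng)
print(identify_gender(test_male, models))
print(identify_gender(test_female, models))
```

Because the two synthetic clusters are deliberately far apart, this sketch says nothing about the accuracy reported in the study; it only shows the shape of the per-gender scoring rule.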

Item Type: Article
Uncontrolled Keywords: speaker recognition, feature extraction, Mel Frequency Cepstrum Coefficient (MFCC), Gaussian Mixture Model (GMM), gender identification
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Computing
ID Code: 97789
Deposited By: Yanti Mohd Shah
Deposited On: 31 Oct 2022 08:51
Last Modified: 31 Oct 2022 08:51
