Evaluation of Machine Learning Algorithms using Combined Feature Extraction Techniques for Speaker Identification

Iwok, Unwana Ubong and Udofia, Kingsley Monday and Obot, Akaninyene Bernard and Udofia, Kufre Michael and Michael, Unyime Anietie and Kingsley, Akpabio Itoro (2023) Evaluation of Machine Learning Algorithms using Combined Feature Extraction Techniques for Speaker Identification. Journal of Engineering Research and Reports, 25 (8). pp. 197-216. ISSN 2582-2926


Abstract

The aim of this study is to evaluate and compare machine learning algorithms when several feature extraction techniques are combined, and to determine the optimal feature combinations for the models studied. The TIMIT online database was used, from which 5 male and 5 female non-native English speakers were selected from each of five American locations. Each speaker had ten 3-second utterances, totaling 500 utterances. Mel frequency cepstral coefficients (MFCC), linear predictive cepstral coefficients (LPCC), gammatone frequency cepstral coefficients (GFCC), discrete wavelet transforms (DWT) and pitch features were extracted using MATLAB and concatenated. The concatenated features were used to train and evaluate three classifier models in Python: Random Forest (RF), Linear Discriminant Analysis (LDA), and Logistic Regression (LR). The results showed that the models' performance improved as the number of combined feature sets increased. The improvement was observed when all the cepstral features were part of the combination, which suggests that cepstral features are more robust and improve speaker identification systems. The best average prediction accuracy, ≈ 76% for the LR model, was obtained with the MGL combination (39 features) and dropped to 70% for the largest combination, MGDLP (53 features). This indicates that more training data improves system performance, but that too much data does not translate into even better performance, because the system eventually reaches its peak performance. This information is useful for applications where limited data can present a problem.
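
The feature-combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The single-letter group names follow the abstract's MGL and MGDLP labels, but the per-group widths in GROUP_DIMS are assumptions chosen only so that MGL totals 39 and MGDLP totals 53, the two figures the abstract reports. The helpers feature_combinations and concatenate are hypothetical names introduced for illustration.

```python
from itertools import combinations

# Group letters in the order used by the abstract's "MGDLP" label:
# M = MFCC, G = GFCC, D = DWT, L = LPCC, P = pitch.
GROUP_ORDER = "MGDLP"

# ASSUMPTION: per-group feature widths. The abstract gives only the totals
# for MGL (39) and MGDLP (53); this split is hypothetical.
GROUP_DIMS = {"M": 13, "G": 13, "L": 13, "D": 12, "P": 2}


def feature_combinations(min_groups=1):
    """Yield (name, total_width) for every combination of feature groups."""
    for k in range(min_groups, len(GROUP_ORDER) + 1):
        for combo in combinations(GROUP_ORDER, k):
            yield "".join(combo), sum(GROUP_DIMS[g] for g in combo)


def concatenate(features_by_group, combo):
    """Concatenate one utterance's per-group vectors in the order of combo."""
    vector = []
    for g in combo:
        vector.extend(features_by_group[g])
    return vector


# Width of every candidate combination, keyed by its label.
combos = dict(feature_combinations())

# Placeholder utterance: zero vectors stand in for real extracted features.
example_utterance = {g: [0.0] * GROUP_DIMS[g] for g in GROUP_ORDER}
mgl_vector = concatenate(example_utterance, "MGL")
```

In a pipeline of this kind, each combination would yield one feature matrix (one concatenated vector per utterance), and each classifier would be trained and scored on it, so that accuracies can be compared across combinations.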

Item Type: Article
Subjects: Institute Archives > Engineering
Depositing User: Managing Editor
Date Deposited: 16 Oct 2023 04:49
Last Modified: 16 Oct 2023 04:49
URI: http://eprint.subtopublish.com/id/eprint/3179
