Publication

Deep Learning and Machine Learning Techniques Applied to Speaker Identification on Small Datasets

Abstract(s)

In this study, we explore the capabilities of speaker recognition technology for biometric authentication, with the aim of developing speaker recognition-based access control systems and providing a resource for future research and improvements in secure and efficient speaker identification solutions. We focused on developing and evaluating machine learning and deep learning models for speaker identification. The models were trained and tested on private datasets with 32 speakers and public datasets with 1251 to 6112 speakers. The Gaussian Mixture Model performed well on our private datasets, correctly identifying the speakers with 93.10% and 95% accuracy. The Multilayer Perceptron achieved a peak accuracy of 93.33% on the Framed Trim private dataset. The VGGM model, after initial training on larger datasets, achieved accuracies of 90.34% and 98.33% on our private datasets. Finally, the ResNet50 model slightly outperformed the other models on two versions of our private dataset, achieving accuracies of 97.93% and 100%.

Keywords

Speaker Identification; Convolutional Neural Network; Deep Learning

Citation

Manfron, Enrico; Teixeira, João Paulo; Minetto, Rodrigo (2024). Deep Learning and Machine Learning Techniques Applied to Speaker Identification on Small Datasets. In 3rd International Conference on Optimization, Learning Algorithms and Applications (OL2A 2023). Cham: Springer Nature, Vol. 2, p. 195–210. ISBN 978-3-031-53035-7
