| Name | Description | Size | Format |
|---|---|---|---|
| | | 841.64 KB | Adobe PDF |
Abstract(s)
In this study, we explore the capabilities of speaker recognition
technology for biometric authentication, with the aim of developing
speaker recognition-based access control systems and providing a resource
for future research and improvements in secure and efficient speaker
identification solutions. We focused on developing and evaluating machine
learning and deep learning models for speaker identification. The models
were trained and tested on private datasets with 32 speakers and public
datasets with 1251 to 6112 speakers. The Gaussian Mixture Model performed
well on our private datasets, correctly identifying the speakers with
accuracies of 93.10% and 95%. The Multilayer Perceptron achieved a
peak accuracy of 93.33% on the Framed Trim private dataset. The VGGM
model, after initial training on larger datasets, achieved accuracies of
90.34% and 98.33% on our private datasets. Finally, the ResNet50 model
slightly outperformed the other models on two versions of our private
dataset, achieving accuracies of 97.93% and 100%.
Keywords
Speaker Identification; Convolutional Neural Network; Deep Learning
Citation
Manfron, Enrico; Teixeira, João Paulo; Minetto, Rodrigo (2024). Deep Learning and Machine Learning Techniques Applied to Speaker Identification on Small Datasets. In 3rd International Conference on Optimization, Learning Algorithms and Applications (OL2A 2023). Cham: Springer Nature, Vol. 2, p. 195–210. ISBN 978-3-031-53035-7
Publisher
Springer Nature
