Publication

Deep Learning and Machine Learning Techniques Applied to Speaker Identification on Small Datasets

Abstract

In this study, we explore the capabilities of speaker recognition technology for biometric authentication, with the aim of developing speaker recognition-based access control systems and providing a resource for future research and improvements in secure and efficient speaker identification solutions. We focused on developing and evaluating machine learning and deep learning models for speaker identification. The models were trained and tested on private datasets with 32 speakers and public datasets with 1251 to 6112 speakers. The Gaussian Mixture Model performed well on our private datasets, correctly identifying speakers with accuracies of 93.10% and 95%. The Multilayer Perceptron achieved a peak accuracy of 93.33% on the Framed Trim private dataset. The VGGM model, after initial training on larger datasets, achieved accuracies of 90.34% and 98.33% on our private datasets. Finally, the ResNet50 model slightly outperformed the other models on two versions of our private dataset, achieving accuracies of 97.93% and 100%.

Keywords

Speaker Identification; Convolutional Neural Network; Deep Learning

Citation

Manfron, Enrico; Teixeira, João Paulo; Minetto, Rodrigo (2024). Deep Learning and Machine Learning Techniques Applied to Speaker Identification on Small Datasets. In 3rd International Conference on Optimization, Learning Algorithms and Applications (OL2A 2023). Cham: Springer Nature, Vol. 2, p. 195–210. ISBN 978-3-031-53035-7