Deep Learning and Machine Learning Techniques Applied to Speaker Identification on Small Datasets

Manfron, Enrico; Teixeira, João Paulo; Minetto, Rodrigo

http://hdl.handle.net/10198/30380

Use this identifier to cite this record.

Name:	Description:	Size:	Format:
Deep Learning and Machine Learning Techniques Applied to Speaker Identification on Small Datasets.pdf		841.64 KB	Adobe PDF

Authors

Manfron, Enrico

Teixeira, João Paulo

Minetto, Rodrigo

Abstract

In this study, we explore the capabilities of speaker recognition technology for biometric authentication, with the aim of developing speaker recognition-based access control systems and providing a resource for future research and improvements in secure and efficient speaker identification solutions. We focused on developing and evaluating machine learning and deep learning models for speaker identification. The models were trained and tested on private datasets with 32 speakers and public datasets with 1251 to 6112 speakers. The Gaussian Mixture Model performed well on our private datasets, correctly identifying the speakers with accuracies of 93.10% and 95%. The Multilayer Perceptron achieved a peak accuracy of 93.33% on the Framed Trim private dataset. The VGGM model, after initial training on larger datasets, achieved accuracies of 90.34% and 98.33% on our private datasets. Finally, the ResNet50 model slightly outperformed the other models on two versions of our private dataset, achieving accuracies of 97.93% and 100%.

Keywords

Speaker Identification; Convolutional Neural Network; Deep Learning

Citation

Manfron, Enrico; Teixeira, João Paulo; Minetto, Rodrigo (2024). Deep Learning and Machine Learning Techniques Applied to Speaker Identification on Small Datasets. In 3rd International Conference on Optimization, Learning Algorithms and Applications (OL2A 2023). Cham: Springer Nature, Vol. 2, p. 195–210. ISBN 978-3-031-53035-7