ESTiG - Publications in Proceedings Indexed in WoS/Scopus
Browse ESTiG - Publications in Proceedings Indexed in WoS/Scopus by author "Abreu, J.L. Pio"
- Comparative Analysis of Windows for Speech Emotion Recognition Using CNN
  Publication. Teixeira, Felipe; Soares, Salviano Pinto; Abreu, J.L. Pio; Oliveira, Paulo M.; Teixeira, João Paulo
  The paper compares the accuracy of Speech Emotion Recognition when the Hamming and Hanning windows are used to frame the speech and compute the spectrogram that serves as input to a convolutional neural network. Detection of between 4 and 10 emotional states was tested for both windows. The results show significant differences in accuracy between the two window types and provide valuable insights for the development of more efficient emotional state detection systems. The best accuracies were 64.1% (4 emotions), 57.8% (5 emotions), 59.8% (6 emotions), 48.4% (7 emotions), 47.8% (8 emotions), 51.4% (9 emotions), and 45.9% (10 emotions). These accuracies are at the state-of-the-art level.
- F0, LPC, and MFCC analysis for emotion recognition based on speech
  Publication. Teixeira, Felipe; Teixeira, João Paulo; Soares, Salviano; Abreu, J.L. Pio
  In this work, research was done to understand what is needed to build a database for recognising emotions through speech. Features that can yield a good success rate for speech-based emotion recognition were investigated, along with characteristics (symptoms) that can be associated with a specific emotional state and features that can be used to identify particular emotional states. A Speech Emotion Recognition (SER) system was built with an SVM, and the binary analysis was compared with a multi-category analysis. The binary analysis achieved an accuracy of 87.5% and the multi-class analysis 42.6%. The parameters used were Fundamental Frequency (F0), Linear Predictive Coefficients (LPC), and Mel Frequency Cepstral Coefficients (MFCC). This modest accuracy was achieved using only F0, LPC, and MFCC features.
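
The first entry in this list compares Hamming and Hanning windows for framing speech before computing a spectrogram. The sketch below illustrates that general idea only and is not the authors' implementation: the function names, the 8 kHz sampling rate, the 64-sample frame, the 32-sample hop, and the synthetic 1 kHz tone are all assumptions chosen for illustration, and a naive DFT stands in for whatever transform the paper used.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window: 0.54 - 0.46*cos(2*pi*k/(n-1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def hann(n):
    """Symmetric Hann ("Hanning") window: 0.5 - 0.5*cos(2*pi*k/(n-1))."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames of frame_len samples."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def power_spectrum(frame, window):
    """Window one frame and return |DFT|^2 for the non-negative frequency bins."""
    n = len(frame)
    xw = [s * w for s, w in zip(frame, window)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive O(n^2) DFT, kept only to avoid external dependencies.
        acc = sum(xw[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectrogram(signal, window, hop):
    """Power spectrogram: one power spectrum per frame (rows are frames)."""
    return [power_spectrum(f, window)
            for f in split_frames(signal, len(window), hop)]


# Illustrative input (assumed values): a 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
FRAME_LEN = 64
HOP = 32
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(256)]

spec_hamming = spectrogram(tone, hamming(FRAME_LEN), HOP)
spec_hann = spectrogram(tone, hann(FRAME_LEN), HOP)

# Index of the strongest frequency bin in the first frame of each spectrogram.
peak_hamming = max(range(len(spec_hamming[0])), key=spec_hamming[0].__getitem__)
peak_hann = max(range(len(spec_hann[0])), key=spec_hann[0].__getitem__)
```

The two windows differ only in their cosine coefficients: the Hann window tapers to zero at the frame edges, while the Hamming window stays at about 0.08 there. In a pipeline like the one the abstract describes, each spectrogram would then be passed to the classifier as an image-like input.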
