Research Project

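Medical Diagnosis Support System for Cardiac Abnormalities Based on Deep Learning and the Wavelet Transform (original title: Sistema de Auxílio ao Diagnóstico Médico para Anormalidades Cardíacas Baseado em Deep Learning e Transformada Wavelet)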

Authors

Publications

Automatic Speech Recognition for Portuguese: A Comparative Study
Publication. Borghi, Pedro Henrique; Teixeira, João Paulo; Freitas, Diamantino Rui
This paper compares Automatic Speech Recognition (ASR) services for Portuguese that were developed in the scope of the Safe Cities project. ASR technology has enabled bi-directional voice-driven interfaces, and its demand in Portuguese is evident due to the language’s global prominence. However, the transcription process is complex, and high accuracy depends on the ability to capture speech variability and language intricacies while remaining rigorous in terms of semantics. The study first describes the main features of the ASR services/models by Google, Microsoft, Amazon, IBM, and Voice Interaction. To compare them, three tests were proposed. Test A uses a small dataset of six audio recordings to evaluate the accuracy of online services in terms of word hit rate, with IBM outperforming the others (pt-BR: 93.33%). Tests B and C use the Mozilla Common Voice database, filtered by a set of keywords, to compare online and offline models for Brazilian and European Portuguese regarding accuracy (Ratcliff-Obershelp algorithm), Word Error Rate, Match Error Rate, Word Information Loss, Character Error Rate and Response-Request Ratio. Test B highlights the higher accuracy of Google Cloud (pt-PT: 94.90%) and Azure (pt-BR: 98.11%). Test C showcases the potential of Voice Interaction’s real-time application despite its lower accuracy (pt-PT: 78.81%). The tests were carried out with a framework developed in Python 3.x, running on a Raspberry Pi 4 Model B with a desktop server and using the REST APIs from the companies’ repositories.
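As a rough illustration of the accuracy metrics named in the abstract, the sketch below computes them for one reference/hypothesis pair. It is not the authors' framework. The function names (`align_counts`, `asr_metrics`), the lowercasing and whitespace tokenization, and the sample sentences are assumptions made for this example. Match Error Rate and Word Information Loss follow their commonly used definitions, which may differ in detail from the normalization used in the paper. The similarity score relies on Python's `difflib.SequenceMatcher`, whose documentation describes it as similar to Ratcliff/Obershelp gestalt pattern matching. The Response-Request Ratio is not defined in the abstract and is omitted.

```python
from difflib import SequenceMatcher


def align_counts(ref, hyp):
    """Return (hits, substitutions, deletions, insertions) for one
    minimum-edit-distance alignment of two sequences."""
    n, m = len(ref), len(hyp)
    # table[i][j] = (distance, hits, subs, dels, ins) for ref[:i] vs hyp[:j]
    table = [[None] * (m + 1) for _ in range(n + 1)]
    table[0][0] = (0, 0, 0, 0, 0)
    for i in range(1, n + 1):
        table[i][0] = (i, 0, 0, i, 0)  # delete every reference item
    for j in range(1, m + 1):
        table[0][j] = (j, 0, 0, 0, j)  # insert every hypothesis item
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d, h, s, dl, ins = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (d, h + 1, s, dl, ins)      # hit
            else:
                diag = (d + 1, h, s + 1, dl, ins)  # substitution
            d, h, s, dl, ins = table[i - 1][j]
            delete = (d + 1, h, s, dl + 1, ins)
            d, h, s, dl, ins = table[i][j - 1]
            insert = (d + 1, h, s, dl, ins + 1)
            # Keep the cheapest path; ties prefer the diagonal move.
            table[i][j] = min(diag, delete, insert, key=lambda t: t[0])
    return table[n][m][1:]


def asr_metrics(reference, hypothesis):
    """Compute similarity, WER, MER, WIL and CER for one transcript pair."""
    ref, hyp = reference.lower(), hypothesis.lower()
    ref_words, hyp_words = ref.split(), hyp.split()
    if not ref_words or not hyp_words:
        raise ValueError("reference and hypothesis must be non-empty")
    h, s, d, i = align_counts(ref_words, hyp_words)
    _, cs, cd, ci = align_counts(list(ref), list(hyp))
    errors = s + d + i
    return {
        "similarity": SequenceMatcher(None, ref, hyp).ratio(),
        "wer": errors / len(ref_words),
        "mer": errors / (h + errors),
        "wil": 1 - (h * h) / (len(ref_words) * len(hyp_words)),
        "cer": (cs + cd + ci) / len(ref),
    }


if __name__ == "__main__":
    # Hypothetical example pair, not taken from the paper's datasets.
    for name, value in asr_metrics("ligar a luz da sala", "ligar luz da sala").items():
        print(f"{name}: {value:.4f}")
```

All four error rates are 0 for a perfect transcript, while the similarity score is 1. Note that WER can exceed 1 when the hypothesis contains many insertions, whereas MER and WIL stay within [0, 1].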

Organizational Units

Description

Keywords

Contributors

Funders

Funding agency

Fundação para a Ciência e a Tecnologia

Funding programme

Funding Award Number

2022.12371.BD

ID