Research Project
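Sistema de Auxílio ao Diagnóstico Médico para Anormalidades Cardíacas Baseado em Deep Learning e Transformada Wavelet (Medical Diagnosis Support System for Cardiac Abnormalities Based on Deep Learning and Wavelet Transform)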
Publications
Automatic Speech Recognition for Portuguese: A Comparative Study
Publication . Borghi, Pedro Henrique; Teixeira, João Paulo; Freitas, Diamantino Rui
This paper provides some comparisons of Automatic Speech Recognition
(ASR) services for Portuguese that were developed in the scope of the Safe
Cities project. ASR technology has enabled bi-directional voice-driven interfaces,
and its demand in Portuguese is evident due to the language’s global prominence.
However, the transcription process is complex, and high accuracy depends
on the ability to capture speech variability and language intricacies while remaining
rigorous in terms of semantics. The study first describes ASR services/models
by Google, Microsoft, Amazon, IBM, and Voice Interaction regarding their main
features. To compare them, three tests were proposed. Test A uses a small dataset
with six audio recordings to evaluate the accuracy of online services in terms of
word hit rate, with IBM outperforming the others (pt-BR: 93.33%). Tests B and C
utilize the Mozilla Common Voice database filtered by a set of keywords to compare
online and offline models for Brazilian and European Portuguese regarding accuracy
(Ratcliff-Obershelp algorithm), Word Error Rate, Match Error Rate, Word
Information Loss, Character Error Rate and Response-Request Ratio. Test B highlights
the higher accuracy of Google Cloud (pt-PT: 94.90%) and Azure (pt-BR:
98.11%). Test C showcases the potential of Voice Interaction’s real-time application
despite its lower accuracy (pt-PT: 78.81%). The tests were carried out with a
framework developed in Python 3.x on a Raspberry Pi 4 Model B with a desktop
server and the REST APIs from the companies’ repositories.
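The metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' framework: the function names `align_counts` and `asr_metrics` are hypothetical, the formulas are the commonly used definitions of Word Error Rate, Match Error Rate, Word Information Loss and Character Error Rate, and the similarity score relies on the standard library's `difflib.SequenceMatcher`, whose matching is closely related to (but not guaranteed identical to) the Ratcliff-Obershelp algorithm. The sample sentences are invented for illustration.

```python
from difflib import SequenceMatcher


def align_counts(ref, hyp):
    """Return (hits, substitutions, deletions, insertions) of a minimum-edit alignment."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back through the table to count each kind of operation.
    hits = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            if ref[i - 1] == hyp[j - 1]:
                hits += 1
            else:
                subs += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return hits, subs, dels, ins


def asr_metrics(reference, hypothesis):
    """Compute transcription metrics for one non-empty reference/hypothesis pair."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    h, s, d, i = align_counts(ref_words, hyp_words)
    _, cs, cd, ci = align_counts(list(reference), list(hypothesis))
    return {
        "wer": (s + d + i) / len(ref_words),
        "mer": (s + d + i) / (h + s + d + i),
        "wil": 1 - (h * h) / (len(ref_words) * len(hyp_words)),
        "cer": (cs + cd + ci) / len(reference),
        # Assumption: difflib stands in for the Ratcliff-Obershelp accuracy score.
        "similarity": SequenceMatcher(None, reference, hypothesis).ratio(),
    }


# Invented example: the recogniser drops one word of a five-word command.
print(asr_metrics("ligar a luz da sala", "ligar luz da sala"))
```

Text normalisation (case, punctuation, numerals) is left out here; in practice it strongly affects all of these scores and would need to match whatever convention the evaluation uses.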
Funding agency
Fundação para a Ciência e a Tecnologia
Funding Award Number
2022.12371.BD