TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese

Casanova, Edresson; Junior, Arnaldo Candido; Shulby, Christopher; Oliveira, Frederico Santos de; Teixeira, João Paulo; Ponti, Moacir Antonelli; Aluísio, Sandra

http://hdl.handle.net/10198/25428

Use this identifier to reference this record.

Name:	Description:	Size:	Format:
Casanova2022_Article_TTS-PortugueseCorpusACorpusFor.pdf		422.51 KB	Adobe PDF

Authors

Casanova, Edresson

Junior, Arnaldo Candido

Shulby, Christopher

Oliveira, Frederico Santos de

Teixeira, João Paulo

Ponti, Moacir Antonelli

Aluísio, Sandra

Abstract

Speech provides a natural way for human–computer interaction. In particular, speech synthesis systems are popular in different applications, such as personal assistants, GPS applications, screen readers and accessibility tools. However, not all languages are on the same level in terms of resources and systems for speech synthesis. This work creates publicly available resources for Brazilian Portuguese in the form of a novel dataset along with deep learning models for end-to-end speech synthesis. The dataset contains 10.5 h of speech from a single speaker; among the models trained on it, a Tacotron 2 model with the RTISI-LA vocoder performed best, achieving a MOS of 4.03. The results are comparable to related work on English and to the state of the art in European Portuguese.

Keywords

Corpora; Speech synthesis; TTS; Portuguese

Citation

Casanova, Edresson; Junior, Arnaldo Candido; Shulby, Christopher; Oliveira, Frederico Santos de; Teixeira, João Paulo; Ponti, Moacir Antonelli; Aluísio, Sandra (2022). IN PRESS - TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese. Language Resources and Evaluation.

DOI

10.1007/s10579-021-09570-4

Collections

ESTiG - Articles in Journals Indexed in WoS/Scopus

CC License

CC BY
