| Name | Description | Size | Format |
|---|---|---|---|
| | | 1.7 MB | Adobe PDF |
Abstract(s)
Over the last two years, the COVID-19 pandemic has affected hundreds of millions of people around the world. As in many crises, people turn to social media platforms, like Twitter, to communicate and share information. Twitter datasets have been used over the years in many research studies to extract valuable information. Therefore, several large COVID-19 Twitter datasets have been released over the last two years. However, none of these datasets contains only Portuguese Tweets, despite the Portuguese language being reported as one of the top five languages used on Twitter. In this paper, we present the first large-scale Portuguese COVID-19 Twitter dataset. The dataset contains over 19 million Tweets spanning 2020 and 2021, allowing the entire pandemic to be analyzed. We also conducted a sentiment analysis on the dataset and correlated the various spikes in Tweet count and sentiment scores to various news articles and government announcements in Portugal and Brazil. The dataset is available at: https://github.com/bioinformaticsua/Portuguese-Covid19-Dataset
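The spike analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the z-score rule, the default threshold, and the function names `daily_counts` and `find_spikes` are assumptions made for illustration, and the input is a plain list of dates rather than the dataset's actual file format.

```python
from collections import Counter
from datetime import date
from statistics import mean, pstdev


def daily_counts(tweet_dates):
    """Count how many tweets fall on each calendar day."""
    return Counter(tweet_dates)


def find_spikes(counts, threshold=2.0):
    """Return the days whose tweet count lies more than `threshold`
    population standard deviations above the mean daily count.

    The z-score rule is an assumed, illustrative definition of a spike.
    """
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All days have the same count, so nothing stands out.
        return []
    return sorted(day for day, n in counts.items() if (n - mu) / sigma > threshold)


if __name__ == "__main__":
    # Toy data: nine quiet days and one busy day (hypothetical values).
    quiet = [date(2020, 3, d) for d in range(1, 10) for _ in range(10)]
    busy = [date(2020, 3, 10)] * 100
    print(find_spikes(daily_counts(quiet + busy)))
```

Days flagged this way could then be compared by hand against dated news articles or government announcements, which is the kind of correlation the abstract describes.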
Keywords
COVID-19; Twitter; Dataset; Sentiment analysis
Citation
Jonker, Richard A.A.; Poudel, Roshan; Fajarda, Olga; Matos, Sérgio; Oliveira, José Luís; Lopes, Rui Pedro (2022). Portuguese Twitter dataset on COVID-19. In 2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM). p. 332-338. ISBN 978-1-6654-5661-6.
Publisher
IEEE
