This is the repository of the paper:
Negin Karisani, Payam Karisani. 2020. Mining Coronavirus (COVID-19) Posts in Social Media, arXiv.
You can access our dataset and pre-trained BERT model here.
A short description of the files:
- tweet_ids.txt.zip: Contains the tweet IDs mentioned in the paper, one tweet ID per line.
- bert-base-uncased-corona.zip: The pre-trained BERT model discussed in the paper. We used the PyTorch implementation of BERT, available on the Hugging Face GitHub page.
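
The files above could be read along the following lines. This is a minimal sketch, not the authors' code: the function names `read_tweet_ids` and `load_corona_bert` are hypothetical, the archive is assumed to hold a single text file of IDs, and the unzipped model directory is assumed to have the layout expected by the Hugging Face `transformers` library (config, weights, and vocabulary files).

```python
import io
import zipfile
from typing import List


def read_tweet_ids(zip_path: str) -> List[str]:
    """Return the tweet IDs stored one per line in the archive's text file."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive holds one .txt file of IDs.
        # Metadata entries that some zip tools add are ignored.
        names = [
            n for n in archive.namelist()
            if n.endswith(".txt") and not n.startswith("__MACOSX")
        ]
        if not names:
            raise ValueError("no .txt member found in " + zip_path)
        with archive.open(names[0]) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8")
            # Keep IDs as strings and skip blank lines.
            return [line.strip() for line in text if line.strip()]


def load_corona_bert(model_dir: str):
    """Load the unzipped model directory (assumed transformers-compatible)."""
    # Imported lazily so that reading tweet IDs does not require transformers.
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_dir)
    model = BertModel.from_pretrained(model_dir)
    return tokenizer, model
```

For example, `ids = read_tweet_ids("tweet_ids.txt.zip")` gives a list of ID strings. The file contains IDs only, so the tweet text has to be retrieved separately.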
We have uploaded the updated dataset here. It contains about 9 million tweets published between Jan 27 and April 20; see the paper for more details.