This repository contains code and corpora for the Twitter robot-journalists @DaMataNews, @DaMataReporter and @CoronaReporter. This work is described in the following articles:
INLG 2020 - 13th International Conference on Natural Language Generation
URL: https://www.aclweb.org/anthology/2020.inlg-1.15
ENIAC 2020 - XVII Encontro Nacional de Inteligência Artificial e Computacional
DOI: https://doi.org/10.5753/eniac.2020.12158
The dependencies needed to run the code can be installed with the following command:
pip install -r requirements.txt
Non-linguistic data about COVID-19 in Brazil and deforestation in the Legal Amazon area are extracted from the web and stored in a private MongoDB database, which is accessed through the MongoEngine ORM framework. To run the code, set up a MongoDB database and update the login info in the db/model.py file. Source data for COVID-19 can be retrieved from the Worldometers website, and data for Amazon deforestation can be retrieved from the INPE Terrabrasilis platform.
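As a rough sketch of the database setup, the snippet below builds a MongoDB connection URI and passes it to MongoEngine's connect function. It assumes the credentials come from environment variables; the names MONGO_USER, MONGO_PASSWORD, MONGO_HOST and MONGO_DB, as well as both helper functions, are hypothetical and not defined by this repository. The actual contents of db/model.py may be organized differently.

```python
import os
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host, db_name, port=27017):
    """Build a MongoDB URI, percent-escaping special characters in the credentials."""
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{db_name}"
    )


def connect_from_env():
    """Connect MongoEngine using hypothetical environment variables."""
    # Imported lazily so the URI helper works even without MongoEngine installed.
    from mongoengine import connect

    uri = build_mongo_uri(
        os.environ["MONGO_USER"],
        os.environ["MONGO_PASSWORD"],
        os.environ.get("MONGO_HOST", "localhost"),
        os.environ["MONGO_DB"],
    )
    return connect(host=uri)
```

Keeping the credentials out of the source file in this way is one option; editing the values directly in db/model.py, as described above, works as well.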
The paths.py file lists the paths to several important files, such as the structuring, lexicalization and reference grammars, as well as the lexicon and the Twitter API login info. Make sure to add the proper files and keys there before running the code.
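Before running the code, it can help to confirm that every configured file actually exists. The check below is only a sketch: the dictionary keys and file locations are hypothetical placeholders, not the real variable names or paths used in paths.py, and should be replaced with the values defined there.

```python
from pathlib import Path


def missing_files(paths):
    """Return the names of the entries whose path is not an existing file."""
    return [name for name, location in paths.items() if not Path(location).is_file()]


if __name__ == "__main__":
    # Hypothetical names and locations; replace with the values from paths.py.
    configured = {
        "structuring_grammar": "data/structuring.json",
        "lexicalization_grammar": "data/lexicalization.json",
        "reference_grammar": "data/references.json",
        "lexicon": "data/lexicon.json",
        "twitter_keys": "data/twitter_keys.json",
    }
    for name in missing_files(configured):
        print(f"Missing file for: {name}")
```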
Once all the dependencies are resolved, the robot-journalists can be tested in the Jupyter notebooks covid19.ipynb and deter.ipynb. To run them in a production environment, execute the publisher.py files for COVID-19 and deforestation.