Mr. Donald J. Trump Tweet Generation with LSTMs
Text generation is a popular problem in natural language processing with machine learning. Applications range from simple email replies and word suggestions to simulating DNA sequences and, unfortunately, fake news. This project implements a generative model that learns the writing style of Mr. Donald Trump from a dataset of his tweets and then automatically generates an unlimited number of new tweets in the vein of his previous ones.
Dataset description
This project uses a dataset from Kaggle that contains Donald J. Trump’s tweets from May 2009 to August 2018. The dataset originally included 7 columns: source, text, created_at, retweet_count, favorite_count, is_retweet, and id_str.
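As a rough illustration of how these columns might be used, the sketch below reads a CSV with the seven column names listed above and keeps only the text of rows that are not retweets. The sample rows are placeholders rather than real tweets, the function name load_original_tweets is hypothetical, and both the "true"/"false" encoding of is_retweet and the decision to drop retweets are assumptions, not something the project is confirmed to do.

```python
import csv
import io

# Hypothetical two-row sample with the seven column names from the dataset
# description. The values are placeholders, not real tweets.
SAMPLE_CSV = """source,text,created_at,retweet_count,favorite_count,is_retweet,id_str
Twitter for iPhone,placeholder original tweet,08-01-2018 12:00:00,10,20,false,1
Twitter for iPhone,RT @someone: placeholder retweet,08-01-2018 12:05:00,5,0,true,2
"""


def load_original_tweets(csv_file):
    """Return the text of every row that is not marked as a retweet.

    Assumes is_retweet is stored as the string "true" or "false".
    """
    reader = csv.DictReader(csv_file)
    return [
        row["text"]
        for row in reader
        if row["is_retweet"].strip().lower() != "true"
    ]


# With the real dataset, pass an open file handle instead of the sample.
tweets = load_original_tweets(io.StringIO(SAMPLE_CSV))
print(tweets)
```

Only the text column would feed a generative model; the other columns are mainly useful for filtering.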
Getting Started
This project was originally built on Google Colab. You can either upload the .ipynb notebooks to Colab or install dependencies locally on your computer.
Using Google Colab
- Upload the .ipynb file as a notebook in Google Colab.
- Upload the dataset to the right folder.
- Change the path to the dataset in the code.
Run locally
You need to have Python installed on your computer.
- Install virtualenv:
  pip install virtualenv
- Create a Python virtual environment:
  virtualenv venv
- Activate the virtual environment:
  - Windows:
    cd venv\Scripts
    activate
    cd ..\..
  - Linux / Mac:
    source venv/bin/activate
- Install libraries:
  pip install -r requirements.txt
Run the code
- Word-level training code:
  python word_level_text_generation.py
- Character-level training code:
  python character_level_text_generation.py
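The two scripts differ in what counts as a token: single characters or whole words. The sketch below shows one common way to turn text into training pairs at either level, using a sliding window whose following token is the prediction target. This is an assumption modeled on the Keras Nietzsche example cited in the Acknowledgments, not a description of what the scripts actually do; the helper names make_windows and encode, the window sizes, and the sample text are all hypothetical.

```python
def make_windows(tokens, window_size, step=1):
    """Slide a fixed-size window over tokens.

    Each window is an input sequence and the token right after it is the target.
    """
    inputs, targets = [], []
    for start in range(0, len(tokens) - window_size, step):
        inputs.append(tokens[start:start + window_size])
        targets.append(tokens[start + window_size])
    return inputs, targets


def encode(tokens):
    """Map each distinct token to an integer index (sorted for determinism)."""
    vocab = sorted(set(tokens))
    token_to_id = {tok: i for i, tok in enumerate(vocab)}
    return [token_to_id[tok] for tok in tokens], token_to_id


text = "make it great"  # placeholder text, not taken from the dataset

# Character level: every character, including spaces, is a token.
char_tokens = list(text)
char_inputs, char_targets = make_windows(char_tokens, window_size=4)

# Word level: tokens are whitespace-separated words.
word_tokens = text.split()
word_inputs, word_targets = make_windows(word_tokens, window_size=2)

word_ids, word_vocab = encode(word_tokens)
```

A character-level model has a small vocabulary but must learn spelling from scratch, while a word-level model produces valid words at the cost of a much larger vocabulary. The integer ids from encode would typically be one-hot encoded or passed through an embedding layer before reaching an LSTM.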
Built With
- TensorFlow (Keras) - The machine learning framework used
Authors
- Imad Eddine Toubal - Initial work - imadtoubal
- How Lia - Initial work
License
This project is licensed under the MIT License. See the LICENSE file for details.
Acknowledgments
- Dataset by David G. on Kaggle
- Inspiration: Generate text from Nietzsche's writings
Happy coding!