salahnouari / bert-sentiment-trading-strategy

Twitter sentiment analysis trading strategy using a fine-tuned BERT model with 110 million parameters. Backtested on 2017 SPY and DJIA data using 6M+ tweets.

BERT-Sentiment-Trading-Strategy

Final Project for FRE-7773: Machine Learning for Financial Engineering with Professor Sandeep Jain. Built a Twitter sentiment analysis trading strategy using a fine-tuned BERT model with 110 million parameters. Backtested the strategy on 2017 SPY and DJIA returns using 6M+ tweets for sentiment signals.

Instructions:

The first step is to import the BERT model and fine-tune its weights using tagged Twitter data. The tagged data I used is found in stock_tweets.csv. After fine-tuning, save the weights to disk so they can be accessed in the main trading strategy notebook.
The main trading strategy is defined in TradingStrategy.ipynb. The methodology is to read through all stock-related tweets in a given lookback period (e.g., 1 day, 4 days, or 1 week) and gauge whether these tweets are more positive or negative than usual. If the positivity score is above a certain threshold, we go long an ETF such as SPY or DJIA. If the positivity score is below a certain threshold, we go short. If there is no noticeably positive or negative sentiment, we hold the current position.
For my backtesting, I used DJIA_price_data.csv and SPY_price_data.csv. The actual tweet dataset I used for my backtesting is far too large to upload to GitHub. Refer to the Cassandra project (Cresci et al., 2018) for similar datasets. A more practical idea would be to use recent tweet data to build an actual trading model rather than simply backtesting.
The final section of the notebook is devoted to comparing backtesting results to other benchmarks and ML trading strategies.
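The long/short/hold rule described in the instructions above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code from TradingStrategy.ipynb: the function names, the use of a rolling mean of daily positivity scores as the signal, the fixed thresholds, and the toy numbers are all hypothetical.

```python
from typing import List, Sequence


def rolling_mean(values: Sequence[float], window: int) -> List[float]:
    """Mean of the last `window` values (fewer at the start of the series)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def positions_from_sentiment(
    daily_scores: Sequence[float], lookback: int, upper: float, lower: float
) -> List[int]:
    """Return +1 (long), -1 (short) or 0 (flat, before any signal) per day.

    Assumption: `daily_scores` holds one aggregated positivity score per day,
    e.g. the average model output over that day's tweets.
    """
    positions = []
    current = 0  # no position until the first clear signal
    for score in rolling_mean(daily_scores, lookback):
        if score > upper:
            current = 1       # noticeably positive sentiment: go long
        elif score < lower:
            current = -1      # noticeably negative sentiment: go short
        # otherwise: keep the current position unchanged
        positions.append(current)
    return positions


def strategy_returns(
    positions: Sequence[int], asset_returns: Sequence[float]
) -> List[float]:
    """Apply the position decided on day t to the asset return of day t+1."""
    return [p * r for p, r in zip(positions[:-1], asset_returns[1:])]


# Toy data, invented purely for illustration.
daily_scores = [0.50, 0.80, 0.80, 0.20, 0.20, 0.20, 0.80]
asset_returns = [0.00, 0.01, -0.02, 0.01, 0.03, -0.01, 0.02]

positions = positions_from_sentiment(daily_scores, lookback=2, upper=0.55, lower=0.45)
returns = strategy_returns(positions, asset_returns)
print(positions)  # [0, 1, 1, 1, -1, -1, -1]
```

Shifting the position by one day before multiplying by the asset return avoids look-ahead bias: a signal computed from day t's tweets can only earn the return of day t+1. A threshold defined relative to a longer-run average score, rather than the fixed values assumed here, would be closer to the "more positive or negative than usual" wording.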
