
duingstuff / msc-thesis-cryptocurrency-return-forecasting-using-bert-based-sentiment


Cryptocurrency Return Prediction Using Investor Sentiment Extracted by BERT-Based Classifiers From News Articles, Reddit Posts and Tweets ---- Master's thesis project for the program of M.Sc. Economics and Management Science at Humboldt University of Berlin

Jupyter Notebook 99.43% Python 0.57%
nlp bert datascience cryptocurrency priceprediction returnprediction financialforecasting investmentsimulation tradingstrategy finetune

msc-thesis-cryptocurrency-return-forecasting-using-bert-based-sentiment's Introduction

MSc-Thesis-Cryptocurrency-Price-Forecasting-Using-BERT-Based-Sentiment

Cryptocurrency Return Prediction Using Investor Sentiment Extracted by BERT-Based Classifiers From News Articles, Reddit Posts and Tweets

Master's thesis project for the program of M.Sc. Economics and Management Science at Humboldt University of Berlin

--by Duygu Ider https://www.linkedin.com/in/duyguider/

Please find the paper here: https://arxiv.org/abs/2204.05781

Outline of the project and what each script/notebook does:

PART 1: BERT-Based Sentiment Classification

  1. price_data_scrape.ipynb - Scrape price data for Bitcoin and Ethereum
  2. news_scraper_final.ipynb, reddit_scraper_final.ipynb, twitter_scraper_final.py - Scrape news articles, Reddit posts and tweets
  3. weak_labels_approach.py - Label the Financial PhraseBank data (Malo et al., 2014) with pseudo-labels predicted by a BART zero-shot classifier, fit a BERT-based classifier on them, and evaluate model performance under weak labels
  4. combine_text_data_zsc_finbert.py - Combine the price and text data into a single dataset, then predict sentiment using a zero-shot classifier (BART) and FinBERT to assign weak labels
  5. bert_crypto_hyperparam_optimal_and_zsc.ipynb - Perform grid-search hyperparameter optimization for fine-tuning the BERT-based classifiers. The implemented models are BERT-Unfrozen, BERT-Frozen and BERT-Context
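
The grid search in step 5 can be pictured with the minimal sketch below. It is not the notebook's code: the search space values, the `train_and_evaluate` placeholder and its toy scoring formula are all assumptions made for illustration, chosen so the loop runs without any machine learning dependencies. In practice the placeholder would fine-tune a BERT-based classifier and return a validation metric.

```python
from itertools import product

# Hypothetical search space; the values actually searched in the thesis
# are not listed in this README.
SEARCH_SPACE = {
    "learning_rate": [2e-5, 3e-5, 5e-5],
    "batch_size": [16, 32],
    "epochs": [2, 3, 4],
}


def train_and_evaluate(learning_rate, batch_size, epochs):
    """Placeholder objective (assumption, not the thesis code).

    A real version would fine-tune a classifier with these settings and
    return a validation score. This toy formula simply peaks at
    learning_rate=3e-5, batch_size=32, epochs=3.
    """
    return (
        1.0
        - abs(learning_rate - 3e-5) * 1e4
        - abs(batch_size - 32) / 100
        - abs(epochs - 3) / 10
    )


def grid_search(space, objective):
    """Evaluate every combination in `space` and keep the best-scoring one."""
    names = list(space)
    best_score, best_params = float("-inf"), None
    n_evaluated = 0
    for values in product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(**params)
        n_evaluated += 1
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score, n_evaluated


best_params, best_score, n_evaluated = grid_search(SEARCH_SPACE, train_and_evaluate)
print(best_params, best_score, n_evaluated)
```

Grid search is exhaustive, so its cost grows with the product of the list lengths (18 runs in this sketch). That is manageable for a handful of fine-tuning hyperparameters but becomes expensive quickly as the space widens.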

PART 2: Return Prediction and Trading Simulation

  1. data_prep_for_financial_models.py - Prepare the combined dataset as an input for the financial models. Add price, macroeconomic, blockchain features and weekday dummies
  2. return_prediction_trading_simulation.ipynb - Load the data, add technical analysis features, lag the selected features by a specified amount, plot intermediate outputs, eliminate features by variance inflation factor to analyze the contribution of the sentiment features, fit all cryptocurrency return predictors using Bayesian hyperparameter optimization, run the trading simulation over multiple test periods, and build an output table of all prediction results
  3. return_prediction_trading_simulation(rnn_added_pipeline_implemented).ipynb - Same as the previous notebook, with RNN and LSTM added as financial forecasting models
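
Two of the steps in Part 2, feature lagging and the trading simulation, can be illustrated with the short sketch below. It is an assumption-laden illustration rather than the notebook's implementation: the function names `lag_feature` and `simulate_long_flat`, the long-or-cash trading rule and the sample numbers are all hypothetical, and fees and slippage are ignored.

```python
def lag_feature(values, lag):
    """Shift a series by `lag` steps so position t holds the value from t - lag.

    The first `lag` positions have no earlier observation and are set to None.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return list(values)
    return [None] * min(lag, len(values)) + list(values[:-lag])


def simulate_long_flat(predicted_returns, realized_returns, initial_capital=1.0):
    """Hypothetical trading rule: hold the asset for a period when the
    predicted return is positive, otherwise stay in cash.

    Returns the capital after each period. Transaction costs are ignored.
    """
    if len(predicted_returns) != len(realized_returns):
        raise ValueError("prediction and realized series must have equal length")
    capital = initial_capital
    curve = []
    for predicted, realized in zip(predicted_returns, realized_returns):
        if predicted > 0:
            capital *= 1.0 + realized
        curve.append(capital)
    return curve


# Made-up numbers, for illustration only.
sentiment = [0.2, -0.1, 0.4]
lagged_sentiment = lag_feature(sentiment, 1)

predicted = [0.01, -0.02, 0.03]
realized = [0.10, -0.05, -0.10]
strategy_curve = simulate_long_flat(predicted, realized)
# Buy-and-hold baseline: always invested, i.e. every prediction is positive.
buy_and_hold_curve = simulate_long_flat([1.0] * len(realized), realized)
print(lagged_sentiment, strategy_curve, buy_and_hold_curve)
```

Lagging ensures that a prediction for period t only uses information available before t, which is what keeps a backtest of this kind free of look-ahead bias. Comparing the strategy curve against a buy-and-hold baseline is a common sanity check for whether the predictor adds anything.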
