
This project forked from moritzwilksch/masterthesis

My Master Thesis: Developing a financial market sentiment analysis model for social media content

Python 93.05% TeX 0.41% Makefile 0.30% Jupyter Notebook 6.24%

🎓 Master Thesis

My Master Thesis.

The actual write-up can be found in this other repo.

The Project

Developing a sentiment analysis model for financial social media posts

The Problem

There is a large body of research on sentiment analysis models for social media posts (Hutto & Gilbert, 2014; Barbieri et al., 2020) and on sentiment analysis of financial texts such as news and corporate filings (Loughran & McDonald, 2011; Araci, 2019). However, research on financial social media posts (think StockTwits, Reddit r/wallstreetbets, and Twitter) is limited.

The Status-Quo

Researchers often use sentiment models from the adjacent domains of finance or generic social media. Therefore, we benchmark the most common models: VADER (Hutto & Gilbert, 2014), NTUSD-Fin (Chen et al., 2018), FinBERT (Araci, 2019), and TwitterRoBERTa (Barbieri et al., 2020).

The Solution

We collect and label 10,000 tweets and train a variety of sentiment analysis models, comparing their performance and compute footprints. The detailed methodology can be found here. The final models will be open-sourced and available for anyone to use as pyFin-Sentiment: a Python package for sentiment analysis of financial social media posts.

Performance

On Tweets

Out-of-sample ROC AUC of proposed and existing models on the collected dataset of 10,000 tweets.

[Figure: ROC AUC by model on the tweet dataset]

On StockTwits Posts

Out-of-sample ROC AUC of proposed and existing models on a dataset of StockTwits posts.

Using the Fin-SoMe dataset compiled by Chen et al. (2020).

[Figure: ROC AUC by model on the Fin-SoMe dataset]
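For readers unfamiliar with the metric, here is a minimal sketch of how a ROC AUC score could be computed from predicted class probabilities. It is not the evaluation code used in the thesis. The three class names, the macro one-vs-rest averaging, and the toy probabilities are all assumptions made for illustration.

```python
from typing import List, Sequence


def binary_roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive sample is scored
    higher than a random negative sample (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_roc_auc(
    labels: Sequence[str],
    probas: Sequence[Sequence[float]],
    classes: Sequence[str],
) -> float:
    """Unweighted mean of one-vs-rest AUCs; column k of probas belongs to classes[k]."""
    aucs: List[float] = []
    for k, cls in enumerate(classes):
        binary = [1 if y == cls else 0 for y in labels]
        scores = [row[k] for row in probas]
        aucs.append(binary_roc_auc(binary, scores))
    return sum(aucs) / len(aucs)


# Hypothetical class names and toy predictions, not data from the thesis.
CLASSES = ["pos", "neu", "neg"]
y_true = ["pos", "neg", "neu", "pos", "neg", "neu"]
y_proba = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.4, 0.4, 0.2],
    [0.3, 0.3, 0.4],
    [0.5, 0.3, 0.2],
]

if __name__ == "__main__":
    print(macro_ovr_roc_auc(y_true, y_proba, CLASSES))
```

The pairwise formulation is quadratic in the number of samples, which is fine for a toy example. For a dataset of 10,000 posts a rank-based implementation from an established library would be the more practical choice.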

Resourcefulness

Measured as inference time per sample (ms) on a system with an AMD Ryzen 5 3600 CPU and 64 GB of RAM.

[Figure: inference time per sample by model]
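As a rough illustration of the metric, inference time per sample could be measured along the following lines. This is a sketch only. The warm-up call, the best-of-several-runs timing, and the `keyword_predict` stand-in model are assumptions, not the benchmarking procedure used in the thesis.

```python
import time
from typing import Callable, List, Sequence


def inference_ms_per_sample(
    predict: Callable[[Sequence[str]], List[str]],
    texts: Sequence[str],
    repeats: int = 5,
) -> float:
    """Return the best observed wall-clock time per sample in milliseconds."""
    if not texts:
        raise ValueError("texts must not be empty")
    predict(texts)  # warm-up call so one-time setup cost is not measured
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        predict(texts)
        best = min(best, time.perf_counter() - start)
    return best * 1000.0 / len(texts)


def keyword_predict(texts: Sequence[str]) -> List[str]:
    """Hypothetical stand-in for a real model: a trivial keyword rule."""
    return ["pos" if "long" in t.lower() else "neg" for t in texts]


if __name__ == "__main__":
    sample = ["Long $TSLA", "Selling my $AAPL position"] * 500
    print(f"{inference_ms_per_sample(keyword_predict, sample):.6f} ms per sample")
```

Taking the minimum over several runs reduces noise from other processes. Reporting the mean instead would be an equally defensible choice.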

pyFin-Sentiment

This work set out to publish a usable model artifact to provide future research with more accurate sentiment assessments. We therefore publish the proposed logistic regression model in an easy-to-use Python library called pyFin-Sentiment.

References

  1. Araci, D. (2019). Finbert: Financial sentiment analysis with pre-trained language models. arXiv preprint arXiv:1908.10063.
  2. Barbieri, F., Camacho-Collados, J., Neves, L., & Espinosa-Anke, L. (2020). Tweeteval: Unified benchmark and comparative evaluation for tweet classification. arXiv preprint arXiv:2010.12421.
  3. Chen, C.-C., Huang, H.-H., & Chen, H.-H. (2018). Ntusd-fin: a market sentiment dictionary for financial social media data applications. In Proceedings of the 1st financial narrative processing workshop (fnp 2018).
  4. Chen, C.-C., Huang, H.-H., & Chen, H.-H. (2020). Issues and perspectives from 10,000 annotated financial social media data. In Proceedings of the 12th language resources and evaluation conference (pp. 6106–6110).
  5. Hutto, C., & Gilbert, E. (2014). Vader: A parsimonious rule-based model for sentiment analysis of social media text. In Proceedings of the international aaai conference on web and social media (Vol. 8, pp. 216–225).
  6. Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10-ks. The Journal of Finance, 66(1), 35–65.

masterthesis's People

Contributors

moritzwilksch
