anfederico / stocktalk

Data collection tool for social media analytics

License: MIT License

Python 37.75% JavaScript 33.70% HTML 28.55%
data-mining sentiment-analysis twitter

stocktalk's Introduction

Purpose

Stocktalk is a visualization tool that tracks tweet volume and sentiment on Twitter, given a series of queries.

It does this by opening a local websocket with Twitter and pulling tweets that contain user-specified keywords. For example, I can tell Stocktalk to grab all tweets that mention Ethereum, then tally volume and measure average sentiment every 15 minutes.

It will then record this data continuously and update an online database that can be used to visualize the timeseries data via an interactive Flask-based web application.
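
To make the windowing concrete, here is a minimal sketch of tallying volume and averaging sentiment per refresh interval. It is not Stocktalk's code: the function name aggregate, the (timestamp, category, score) tuple layout, and the shape of the result are all assumptions chosen for illustration.

```python
from collections import defaultdict


def aggregate(observations, refresh=15 * 60):
    """Group (timestamp, category, score) tuples into fixed-length windows.

    Returns {(window_start, category): {"volume": n, "sentiment": mean}}.
    Timestamps are seconds; refresh is the window length in seconds.
    """
    totals = defaultdict(lambda: [0, 0.0])  # [tweet count, sum of scores]
    for timestamp, category, score in observations:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // refresh) * refresh
        bucket = totals[(window_start, category)]
        bucket[0] += 1
        bucket[1] += score
    return {
        key: {"volume": count, "sentiment": round(total / count, 3)}
        for key, (count, total) in totals.items()
    }


# Hypothetical observations: two ETH tweets in the first window, one in the next.
sample = [(0, "ETH", 0.5), (60, "ETH", -0.1), (900, "ETH", 0.3), (30, "BTC", 0.0)]
summary = aggregate(sample)
for key in sorted(summary):
    print(key, summary[key])
```

Each entry of the result corresponds to one point on the volume and sentiment time series for a category.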

Demo

https://anfederico.github.io/Stocktalk/

Prerequisites

Stocktalk requires API credentials for Twitter and mLab.

Twitter Steps (Creating an application)

  1. Sign into Twitter at apps.twitter.com
  2. Create a new application and fill out details
  3. Generate an access token
  4. Save the following information
    • Consumer Key
    • Consumer Secret
    • Access Token
    • Access Token Secret

Mlab Steps (Setting up an online database)

  1. Make an account at https://mlab.com
  2. Create a new deployment in sandbox mode
  3. Add a database user to your deployment
  4. Save the following information
    • Mongo deployment server
    • Mongo deployment id
    • Mongo deployment client
    • Deployment user
    • Deployment pass

Download

# Clone repository and install dependencies
$ git clone https://github.com/anfederico/Stocktalk
$ pip install -r Stocktalk/requirements.txt

# Install natural language toolkit sentiment corpus
$ python -m nltk.downloader vader_lexicon

Edit Settings

/stocktalk
└── /scripts
    └── settings.py

# Mongo
mongo_server = 'ds254236.mlab.com'
mongo_id     =  54236
mongo_client = 'stocktalk'
mongo_user   = 'username'
mongo_pass   = 'password'

# Twitter
api_key             = ''
api_secret          = ''
access_token        = ''
access_token_secret = ''
credentials = [api_key, api_secret, access_token, access_token_secret]
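
For orientation, the five Mongo settings look like the pieces of a standard MongoDB connection string. The sketch below assumes that mongo_id is the port and mongo_client is the database name; that reading fits the sample values but is not confirmed by this README, and the helper name mongo_uri is hypothetical.

```python
from urllib.parse import quote_plus


def mongo_uri(server, port, database, user, password):
    """Assemble a mongodb:// connection string from its parts.

    The user name and password are percent-encoded so that characters
    such as '@' or ':' do not break the URI.
    """
    return "mongodb://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), server, port, database
    )


# Placeholder values copied from the settings example above.
uri = mongo_uri("ds254236.mlab.com", 54236, "stocktalk", "username", "password")
print(uri)
```

A string of this form can be passed to a MongoDB client library when checking that the saved credentials work.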

Code Examples

Twitter Streaming

This file opens the websocket and writes to the online database until manually interrupted.

/stocktalk
└── listen.py

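$ python listen.py

from scripts import settings
from scripts import streaming  # needed for streaming.streamer below; assumes streaming.py sits in scripts/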

# Each key or category corresponds to an array of keywords used to pull tweets
queries = {'ETH': ['ETH', 'Ethereum'],
           'LTC': ['LTC', 'Litecoin'],
           'BTC': ['BTC', 'Bitcoin'],
           'XRP': ['XRP', 'Ripple'],
           'XLM': ['XLM', 'Stellar']}

# Aggregate volume and sentiment every 15 minutes
refresh = 15*60

streaming.streamer(settings.credentials, 
                   queries, 
                   refresh, 
                   sentiment=True, 
                   debug=True)
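
The queries dictionary has to serve two purposes: every keyword goes into the stream filter, and each matching tweet must be attributed back to its category. The sketch below shows one way to do both. The names get_tracker and get_reverse are borrowed from a question in the issues further down, but the bodies here are guesses rather than the project's implementation, and categories_for is a hypothetical helper.

```python
queries = {'ETH': ['ETH', 'Ethereum'],
           'BTC': ['BTC', 'Bitcoin']}


def get_tracker(queries):
    """Flatten every keyword into a single list for the stream filter."""
    return [keyword for keywords in queries.values() for keyword in keywords]


def get_reverse(queries):
    """Map each lowercased keyword back to the category it belongs to."""
    return {keyword.lower(): category
            for category, keywords in queries.items()
            for keyword in keywords}


def categories_for(text, reverse):
    """Return the sorted categories whose keywords appear as words in text."""
    words = set(text.lower().split())
    return sorted({reverse[word] for word in words if word in reverse})


reverse = get_reverse(queries)
print(get_tracker(queries))
print(categories_for("bitcoin and ethereum rally", reverse))
```

Matching on whole lowercased words keeps the lookup cheap, at the cost of missing keywords embedded in longer tokens.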

Realtime Visualization

This file starts a local web application that pulls data from the online database.

/stocktalk
└── app.py

$ python app.py

Underlying Features

Text Processing
t1 = "@TeslaMotors shares jump as shipments more than double! #winning"
print(process(t1))

t2 = "Tesla announces its best sales quarter: http://trib.al/RbTxvSu $TSLA" 
print(process(t2))

t3 = "Tesla $TSLA reports deliveries of 24500, above most views."
print(process(t3))

shares jump as shipments more than double winning
tesla announces its best sales quarter
tesla reports deliveries of number above most views
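
The transformations visible in these examples (mentions, cashtags, and links dropped; digits replaced by the word number; punctuation stripped; text lowercased) can be approximated with a few regular expressions. This is a sketch inferred from the three sample outputs only, not the project's actual process function, and it may behave differently on other inputs.

```python
import re


def process(text):
    """Normalize a tweet into lowercase words separated by single spaces."""
    text = re.sub(r"https?://\S+", " ", text)             # drop links
    text = re.sub(r"[@$]\w+", " ", text)                  # drop @mentions and $cashtags
    text = re.sub(r"\d+(?:[.,]\d+)*", " number ", text)   # replace numerals with a token
    text = re.sub(r"[^A-Za-z\s]", " ", text)              # strip punctuation, keep hashtag words
    return " ".join(text.lower().split())


print(process("@TeslaMotors shares jump as shipments more than double! #winning"))
print(process("Tesla announces its best sales quarter: http://trib.al/RbTxvSu $TSLA"))
print(process("Tesla $TSLA reports deliveries of 24500, above most views."))
```

The order of the substitutions matters: links and cashtags are removed before the digit and punctuation passes so that their contents do not leak into the output.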

Sentiment Analysis
t1 = "shares jump as shipments more than double winning"
print(sentiment(t1))

t2 = "tesla reports deliveries of number above most views"
print(sentiment(t2))

t3 = "not looking good for tesla competition on the rise"
print(sentiment(t3))

0.706
0.077
-0.341
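
Since the install steps download the vader_lexicon corpus, scores like these are presumably VADER compound scores from NLTK. The sketch below shows how such a scorer could be built and averaged over a batch of tweets. The helper names make_vader_scorer and average_sentiment are hypothetical, and the scorer is passed in as an argument so the averaging can be tried without NLTK installed.

```python
def make_vader_scorer():
    """Build a scorer backed by NLTK's VADER.

    Requires the nltk package and the vader_lexicon corpus.
    """
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    # The compound score is a single value between -1 and 1.
    return lambda text: analyzer.polarity_scores(text)["compound"]


def average_sentiment(texts, scorer):
    """Return the mean score over texts, or 0.0 when there are none."""
    scores = [scorer(text) for text in texts]
    return round(sum(scores) / len(scores), 3) if scores else 0.0


# With NLTK and the lexicon installed, usage would look like:
#   scorer = make_vader_scorer()
#   average_sentiment(["shares jump as shipments more than double winning"], scorer)
```

Whether the result matches the figures above depends on the NLTK and lexicon versions in use.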

stocktalk's People

Contributors

anfederico

stocktalk's Issues

New to python

Hi,
Your app seems very useful for my research work. I completed the setup successfully but need some guidance on running the program.

Installation
pip install stocktalk .... Done
Download Corpus
python -m nltk.downloader vader_lexicon .... Done

Twitter Streaming

Problem: Which file should I update with my API details, and how do I then run the application to collect data?

Can you please share some details?

setup problem

Hello, I have a problem setting up Stocktalk. I ran:
pip install stocktalk
then: python -m nltk.downloader vader_lexicon
but when I import stocktalk, I get this:

>>> import stocktalk
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/stocktalk/__init__.py", line 2, in <module>
    from .visualize import visualize
  File "/usr/local/lib/python3.5/dist-packages/stocktalk/visualize.py", line 1, in <module>
    from bokeh.layouts import row, column
ImportError: No module named 'bokeh'

Please help, and thanks in advance.

Crawl Tweets in the past

Hi Anfederico,

Is there any (simple) way to crawl tweets from the past, so that I don't need to run the listen.py script for 1 or 2 years before I am able to run clairvoyant?

Kind Regards
Alex

no output

I completed all the prerequisites and ran python listen.py, but there has been no output for an hour. It just prints Streaming Now...

Import error

I'm receiving import errors after "pip install stocktalk" and "stocktalk-corpus", both in a conda environment and in a global install.

After "from stocktalk import TwitterAxe" attempt:
ImportError: No module named 'TwitterAxe'

Error after running example

I keep getting an error that TwitterAxe cannot be found. It's very strange because I have installed everything in requirements.txt and pip installed stocktalk, all in a virtualenv. All paths look correct. Any ideas?

(venv) [pez:~/dev/stocktalk]$ python ./mine.py (master✱)
Traceback (most recent call last):
  File "./mine.py", line 1, in <module>
    from stocktalk import TwitterAxe
  File "/Users/pez/dev/Stocktalk/stocktalk/__init__.py", line 1, in <module>
    from TwitterAxe import TwitterAxe
ImportError: No module named 'TwitterAxe'

streaming.py

Hi,

I am using this beautiful piece of work in one of my projects, and I want to understand the underlying flow of the streaming.py file.

  • What is the purpose of each of these functions?
      - get_tracker(queries)
      - get_reverse(queries)
      - elapsed_time(start)
      - process(text)
      - on_status(self, status)
      - streamer()

  • How, and in which part of the code, is data pulled from Twitter? (That is, which functions and classes are involved, and how are relevant tweets picked after each 15-minute refresh interval?)

  • Is it possible to store the raw tweets in MongoDB as well, instead of only the volume and sentiment of the tweets?

  • How are the text inputs processed?

  • How and where is sentiment analysis applied?

Kindly help me with these queries!

Thanks,

Neel

Error after specific example

The changed code is:

from stocktalk import streaming

# Credentials to access Twitter API
API_KEY = 'XXXXXXXXXX'  # (mine was here)
API_SECRET = 'XXXXXXXXXX'  # (mine was here)
ACCESS_TOKEN = 'XXXXXXXXXX'  # (mine was here)
ACCESS_TOKEN_SECRET = 'XXXXXXXXXX'  # (mine was here)
credentials = [API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET]

# First element must be ticker/name, subsequent elements are extra queries
FIRSTY = ['FIRSTY', 'Samsung']

# Variables
tickers = [FIRSTY] # Used for identification purposes
queries = FIRSTY # Filters tweets containing one or more query
refresh = 30 # Process and log data every 30 seconds

# Create a folder to collect logs and temporary files
path = "/home/name2/stalktwit/"

streaming(credentials, tickers, queries, refresh, path,
          realtime=True, logTracker=True, logTweets=True, logSentiment=True, debug=True)

I changed the variable from 'TSLA' to 'FIRSTY', and then I got the error 'Got an error with status code: 420'.
When my variable name was 'FIRS' or 'FIR', there was no error.
Thank you
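
For context, status code 420 is what Twitter's streaming API returned to clients it was rate limiting, usually after too many connection attempts in a short period. A common remedy is to wait before reconnecting and to double the wait after each consecutive 420. The sketch below only computes such a schedule; the 60-second starting delay, the 16-minute cap, and the helper name backoff_delays are illustrative assumptions, not part of Stocktalk.

```python
def backoff_delays(attempts, base=60, cap=960):
    """Return the wait in seconds before each reconnect attempt.

    The delay starts at base, doubles on every attempt, and never
    exceeds cap.
    """
    return [min(base * 2 ** attempt, cap) for attempt in range(attempts)]


print(backoff_delays(6))
```

A reconnect loop would sleep for the next delay in the list after each 420 and reset to the start once a connection succeeds.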

Visualization Problem

from stocktalk.visualize import readUpdates, extrapolate, getPlot

from stocktalk import streaming
from stocktalk import visualize

# Credentials to access Twitter API
API_KEY = '**********'
API_SECRET = '**********'
ACCESS_TOKEN = '**********'
ACCESS_TOKEN_SECRET = '**********'
credentials = [API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET]

# First element must be ticker/name, subsequent elements are extra queries
TSLA = ['TSLA', 'Tesla']
SNAP = ['SNAP', 'Snapchat']
AAPL = ['AAPL', 'Apple']
AMZN = ['AMZN', 'Amazon']

# Variables
tickers = [TSLA,SNAP,AAPL,AMZN] # Used for identification purposes
queries = TSLA+SNAP+AAPL+AMZN # Filters tweets containing one or more query
refresh = 30 # Process and log data every 30 seconds

# Create a folder to collect logs and temporary files
path = "./data"

streaming(credentials, tickers, queries, refresh, path,
          realtime=True, logTracker=True, logTweets=True, logSentiment=True, debug=True)

tickers = ['TSLA','SNAP','AAPL','AMZN']
refresh = 30

visualize(tickers, refresh, path)

This is my code. I ran the bokeh server, but I only got an empty page. What am I doing wrong?
