
twitter-sentiment-analysis-python

Tool to determine and visualize sentiment in tweets. Built using python + d3.
Make sure to enter your own Twitter access and consumer keys.

=================================

This sentiment analysis tool takes any keyword and uses Twitter's Search API to retrieve the last 100 tweets containing the word. These tweets are then scored and the results visualized to give an overview of the real-time sentiment of the search term queried.

At its core, the sentiment of each tweet is determined by scoring each word in the tweet. Every word is matched against a list of 2,477 words, each of which has a predetermined score from -5 to +5. See the table below for a subset of this list:

| Term    | Score |
|---------|-------|
| abhor   | -3    |
| dislike | -2    |
| like    | +2    |
| admire  | +3    |
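
As a minimal sketch of this word-by-word scoring, the example below assumes a tiny lexicon holding only the four terms from the table above (the tool's full list has 2,477 entries). The names `WORD_SCORES` and `score_words` are hypothetical and are not taken from the project's code.

```python
# Hypothetical four-term lexicon; the real list is much larger.
WORD_SCORES = {"abhor": -3, "dislike": -2, "like": 2, "admire": 3}

def score_words(text):
    """Sum the scores of all known words; unknown words count as 0."""
    return sum(WORD_SCORES.get(word, 0) for word in text.lower().split())

print(score_words("I like and admire this"))  # 2 + 3 = 5
print(score_words("I abhor spam"))            # -3
```

Words that are not in the lexicon contribute nothing, so a tweet with no scored terms ends up with a score of 0.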

To obtain an even more accurate score, the following four strategies were utilized:

  1. Data cleansing: Once tweets were broken down into individual words, each word was stripped of punctuation, converted to lowercase Unicode, and compared against term lists stored in the same lowercase Unicode form.

  2. Negation: A negation term is one that precedes and negates the meaning of one of the predetermined scored terms. For example, in the phrase “don’t like”, don’t is the negation term. The full list of negation terms used in this tool can be found here.

  3. Intensifier: An intensifier is a term that precedes and amplifies the meaning of one of the predetermined scored terms. For example, in the phrase “really like”, really is the intensifier. The full list of intensifiers used in this tool can be found here.

The four scenarios below show how this tool calculates the effect of intensifiers and negation terms on a scored term:

Keep in mind that the score for the term "like" is +2.

negator + term = -1 * [score of term], e.g. don't like = -2
intensifier + term = 2 * [score of term], e.g. really like = +4
negator + intensifier + term = -0.5 * [score of term], e.g. don't really like = -1
intensifier + negator + term = -2 * [score of term], e.g. really don't like = -4
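
The four scenarios above can be sketched as a lookup on the one or two words that precede a scored term. This is an illustration under stated assumptions, not the tool's actual code: the negator and intensifier sets are small hypothetical subsets, punctuation is stripped only from the edges of each word, and the function names are invented for the example.

```python
import string

SCORES = {"abhor": -3, "dislike": -2, "like": 2, "admire": 3}
NEGATORS = {"don't", "don’t", "not", "never"}   # hypothetical subset
INTENSIFIERS = {"really", "very"}               # hypothetical subset

def clean(text):
    """Lowercase each word and strip punctuation from its edges (an assumption)."""
    tokens = (word.strip(string.punctuation).lower() for word in text.split())
    return [token for token in tokens if token]

def kind(token):
    """Classify a token as a negator ('neg'), an intensifier ('int'), or neither."""
    if token in NEGATORS:
        return "neg"
    if token in INTENSIFIERS:
        return "int"
    return None

def multiplier(prev2, prev1):
    """Apply the four scenarios; prev1 is the word directly before the term."""
    if kind(prev1) == "neg":
        return -2.0 if kind(prev2) == "int" else -1.0
    if kind(prev1) == "int":
        return -0.5 if kind(prev2) == "neg" else 2.0
    return 1.0

def score_tweet(text):
    tokens = clean(text)
    total = 0.0
    for i, token in enumerate(tokens):
        if token in SCORES:
            prev1 = tokens[i - 1] if i >= 1 else None
            prev2 = tokens[i - 2] if i >= 2 else None
            total += SCORES[token] * multiplier(prev2, prev1)
    return total

for phrase in ["don't like", "really like", "don't really like", "really don't like"]:
    print(phrase, score_tweet(phrase))  # -2.0, 4.0, -1.0, -4.0
```

Only the word order immediately before a scored term matters in this sketch; a negator or intensifier further back in the tweet has no effect.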

  4. Slang and indicative terms: In a random sample of 100,000 tweets from Twitter's Streaming API, some terms appeared disproportionately often in either positive or negative tweets. These terms, along with their mean scores, were added to the list of scored terms. See the table below for a subset of this list:

| Term | Score |
|------|-------|
| luv  | +2    |
| hehe | +2    |
| :)   | +2    |
| :(   | -2    |
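
The README does not say how the mean score of a slang term was computed. One plausible reading, sketched below purely as an assumption, is to average the scores of the already-scored tweets in which an unscored term appears and to keep only terms seen often enough. The function name `mean_term_scores`, the `min_count` threshold, and the sample data are all hypothetical.

```python
from collections import defaultdict

def mean_term_scores(scored_tweets, known_terms, min_count=2):
    """Average the tweet scores for each term that is not already in the lexicon."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, score in scored_tweets:
        # Count each term once per tweet, even if it is repeated.
        for term in set(text.lower().split()):
            if term not in known_terms:
                totals[term] += score
                counts[term] += 1
    return {t: totals[t] / counts[t] for t in totals if counts[t] >= min_count}

# Made-up (tweet text, tweet score) pairs for illustration only.
sample = [("luv this like", 2.0), ("luv it admire", 3.0), ("meh dislike", -2.0)]
known = {"like", "admire", "dislike"}
print(mean_term_scores(sample, known))  # {'luv': 2.5}
```

A real run would also need a test for whether a term's association with positive or negative tweets is statistically meaningful, which this sketch leaves out.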

Future Improvements

In its current version, this tool only analyzes English-language tweets. The scoring system could also be improved in several ways: scoring positive or negative phrases rather than individual words, using a multi-dimensional view of sentiment rather than a linear +/- score, or using sentiment terms and phrases specific to a brand or domain (e.g. movies) that differ from those of other domains.

Technology

This web app was built in Python and uses the D3 JavaScript library to visualize the results as a pie chart and word clouds. This is an open source project available on GitHub.

Thank you for your time, and I hope you enjoy using this tool.

Contributors

dmathewwws
