
sandy4321 / twitter-sentiment-analysis-14


This project is forked from ottoman91/twitter-sentiment-analysis


A project that carries out sentiment analysis of tweets from Twitter. Completed as an assignment for the "Data Manipulation at Scale: Systems and Algorithms" MOOC.

License: GNU General Public License v2.0

Python 100.00%

twitter-sentiment-analysis-14's Introduction

Twitter-Sentiment-Analysis

This project was completed as part of the "Data Manipulation at Scale: Systems and Algorithms" MOOC on Coursera. The following were the objectives of this project:

  1. Accessing the Twitter Application Programming Interface (API) using Python to download a live stream of tweets.

  2. Estimating the public's perception (the sentiment) of a particular term or phrase in a tweet.

  3. Analyzing the relationship between location and mood based on a sample of Twitter data.

## Scripts in this Project

The following is a list of the scripts in this project, along with a brief explanation of what they do.

  • AFINN-111.txt. AFINN is a list of English words rated for valence with an integer between minus five (negative) and plus five (positive). The words were manually labeled by Finn Årup Nielsen in 2009-2011. The file is tab-separated. We use AFINN-111.txt to compute the initial sentiment of tweets.
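
As a minimal sketch of how a tab-separated word list like this can be loaded, the snippet below builds a dictionary from term to score. The function name `load_afinn` and the sample lines are illustrative assumptions, not code taken from this repository.

```python
def load_afinn(lines):
    """Build a {term: score} dict from tab-separated 'term<TAB>score' lines."""
    scores = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the last tab so multi-word terms stay in one piece.
        term, score = line.rsplit("\t", 1)
        scores[term] = int(score)
    return scores


# Illustrative in-memory lines in the same layout as the file. With the real
# file, pass open("AFINN-111.txt", encoding="utf-8") instead of this list.
sample = ["abandon\t-2\n", "can't stand\t-3\n", "happy\t3\n"]
scores = load_afinn(sample)
print(scores["happy"])
```

Keeping the scores in a plain dictionary makes the per-word lookups in the later scripts a single `scores.get(word, 0)` call.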

  • twitterstream.py. This Python script downloads tweets through the Twitter API. To use this script, do the following:

    a. Enter the api_key, api_secret, access_token_key and access_token_secret values for the Twitter application that you have created from your account.

    b. Write the Twitter stream to a text file by running the following: python twitterstream.py > output.txt
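
The later scripts all read output.txt. A minimal sketch of such a reader follows, assuming (this README does not state it) that the stream is written as one JSON object per line. `read_tweets` is a hypothetical helper name.

```python
import json


def read_tweets(lines):
    """Yield one dict for every line that parses as a JSON object."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if isinstance(record, dict):
            yield record


# Illustrative lines; with a real capture, pass open("output.txt") instead.
sample = ['{"text": "good morning"}\n', "\n", "not json\n", '{"delete": {}}\n']
tweets = list(read_tweets(sample))
print(len(tweets))
```

Skipping unparseable lines instead of raising keeps one cut-off record at the end of a capture from aborting the whole run.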

  • tweet_sentiment.py. This Python script reads through all of the downloaded tweets and assigns a sentiment score to every tweet, based on the occurrence of words from the AFINN-111 file in that tweet. Run the script in the following way: python tweet_sentiment.py AFINN-111.txt output.txt
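
One way to read "assigns a sentiment score" is to sum the AFINN scores of the words in the tweet, which is what the sketch below does. The tokenizer, the function name `tweet_score` and the tiny score table are assumptions made for illustration.

```python
import re


def tweet_score(text, scores):
    """Sum the scores of the known words in a tweet; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(scores.get(word, 0) for word in words)


# Illustrative score table, not the real AFINN-111 contents.
scores = {"good": 3, "bad": -3, "love": 3}
print(tweet_score("I love this, so good!", scores))  # 6
```

This word-by-word lookup never matches multi-word AFINN entries, so a fuller version would need to check phrases as well.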

  • term_sentiment.py. This script estimates the sentiment of words in the tweets that are not included in the AFINN-111 file. I used the following logic to develop a sentiment metric:

  1. For every tweet, compute the positive and the negative scores of the words present whose sentiments are listed in the AFINN text file. Also record the number of positive and negative words present in the tweet.
  2. For every tweet, find the words whose sentiments are not listed. To each of these words, assign the positive and negative sentiment scores, as well as the number of positive and negative words that it is in close proximity to.
  3. After running through all of the tweets, for every word whose sentiment needs to be calculated, multiply its total positive sentiment value by the total number of positive words it is in close proximity to, and divide this by the total number of tweets in which the word is present. Let's call this number a. Then multiply its total negative sentiment value by the total number of negative words it is in close proximity to, and divide this by the total number of tweets in which the word is present. Let's call this number b. The sentiment of every new term is equal to a - b.

  Run the script in the following way: python term_sentiment.py AFINN-111.txt output.txt
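
The three steps above can be sketched as follows. Two points are assumptions on my part: "close proximity" is taken to mean "in the same tweet", and the negative sentiment value is accumulated as a magnitude so that b is non-negative. The function name `term_sentiments` is hypothetical.

```python
from collections import defaultdict


def term_sentiments(tweets, scores):
    """Estimate a sentiment for every word that has no entry in `scores`."""
    # Per unknown term:
    # [pos score total, pos word count, neg score total, neg word count, tweets]
    totals = defaultdict(lambda: [0, 0, 0, 0, 0])
    for text in tweets:
        words = text.lower().split()
        known = [scores[w] for w in words if w in scores]
        pos = [s for s in known if s > 0]
        neg = [-s for s in known if s < 0]  # magnitudes (assumption)
        for term in set(words) - set(scores):
            entry = totals[term]
            entry[0] += sum(pos)
            entry[1] += len(pos)
            entry[2] += sum(neg)
            entry[3] += len(neg)
            entry[4] += 1
    # a = pos total * pos count / tweets, b = neg total * neg count / tweets
    return {
        term: (ps * pc - ns * nc) / n
        for term, (ps, pc, ns, nc, n) in totals.items()
    }


# Illustrative score table and tweets.
scores = {"good": 3, "bad": -3}
result = term_sentiments(["good coffee", "good good coffee", "bad traffic"], scores)
print(result["coffee"], result["traffic"])  # 13.5 -3.0
```

In the example, "coffee" appears in two tweets with a total positive score of 9 next to 3 positive words, so a = 9 * 3 / 2 = 13.5 and b = 0.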

  • frequency.py. This script computes the term frequency of the live-stream Twitter data that we have downloaded. Run the script in the following way: python frequency.py output.txt
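
A minimal sketch of a term-frequency computation, assuming the frequency of a term is its number of occurrences divided by the number of occurrences of all terms across all tweets. That definition and the name `term_frequencies` are assumptions, since the description above does not spell them out.

```python
from collections import Counter


def term_frequencies(texts):
    """Return {term: occurrences of term / occurrences of all terms}."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


# Illustrative tweet texts.
freqs = term_frequencies(["good coffee", "good day"])
print(freqs["good"])  # 0.5
```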

  • happiest_state.py. This script uses the user field in every tweet to determine the state from which the tweet originated. Using the AFINN-111.txt file to determine the sentiment of every tweet, it displays the initials of the state that has the highest average positive sentiment. Run the script in the following way: python happiest_state.py AFINN-111.txt output.txt
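
The sketch below shows one possible way to go from the user field to a per-state average. It assumes the state is read from a free-text `location` value that ends in a two-letter abbreviation, such as "Austin, TX". That parsing rule, the truncated `STATES` set and the helper names are assumptions made for illustration, not the script's actual logic.

```python
from collections import defaultdict

# Truncated on purpose; a full version would list every state abbreviation.
STATES = {"CA", "NY", "TX", "WA"}


def state_of(tweet):
    """Return a state abbreviation taken from the user's location, or None."""
    location = (tweet.get("user") or {}).get("location") or ""
    tail = location.rsplit(",", 1)[-1].strip().upper()
    return tail if tail in STATES else None


def happiest_state(tweets, scores):
    """Return the state with the highest average tweet score, or None."""
    totals = defaultdict(lambda: [0, 0])  # state -> [score sum, tweet count]
    for tweet in tweets:
        state = state_of(tweet)
        if state is None:
            continue
        words = tweet.get("text", "").lower().split()
        totals[state][0] += sum(scores.get(w, 0) for w in words)
        totals[state][1] += 1
    if not totals:
        return None
    return max(totals, key=lambda s: totals[s][0] / totals[s][1])


# Illustrative score table and tweets.
scores = {"good": 3, "bad": -3}
tweets = [
    {"text": "good day", "user": {"location": "Austin, TX"}},
    {"text": "bad traffic", "user": {"location": "Seattle, WA"}},
    {"text": "good good", "user": {"location": "somewhere"}},
]
print(happiest_state(tweets, scores))  # TX
```

Tweets whose location cannot be mapped to a state are ignored, so the averages only reflect users who filled in a recognizable location.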

  • top_ten.py. This script prints the top ten hashtags from the downloaded Twitter data. Run the script in the following way: python top_ten.py output.txt
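
Counting hashtags can be sketched with `collections.Counter`, assuming each tweet carries its hashtags under `entities.hashtags` as objects with a `text` key, as in the Twitter v1.1 tweet JSON. The function name `top_hashtags` and the choice to lowercase the tags are assumptions.

```python
from collections import Counter


def top_hashtags(tweets, n=10):
    """Return the n most common hashtags as (tag, count) pairs."""
    counts = Counter(
        tag["text"].lower()
        for tweet in tweets
        for tag in (tweet.get("entities") or {}).get("hashtags", [])
    )
    return counts.most_common(n)


# Illustrative tweets; the last one has no entities at all.
tweets = [
    {"entities": {"hashtags": [{"text": "Python"}, {"text": "data"}]}},
    {"entities": {"hashtags": [{"text": "python"}]}},
    {"text": "no tags"},
]
print(top_hashtags(tweets))  # [('python', 2), ('data', 1)]
```

Lowercasing merges "#Python" and "#python" into one entry; drop the `.lower()` call to count them separately.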

twitter-sentiment-analysis-14's People

Contributors

ottoman91
