Mining Social Media

Finding great stories in Internet Data

About

Mining Social Media will show you the kind of data that can be mined on the social web, the insights that can be gained from it, and the limitations of its scope. You’ll learn how to find out what kind of data is available on popular social media juggernauts like Facebook and Twitter and how to recognize the value of what is measured.

Practical exercises interweave with conceptual lessons that cover ways to use Python to extract data from social media sources, analyze it, and make sense of it visually. You’ll learn how to write a script that taps into an API, how to scrape data from websites, and even how to analyze data from an automated Twitter bot.

This repository holds the code and data for the exercises detailed in the book. The book is set to be published at the end of 2019.

Getting started

Computer setup

1. Make sure you have the OS-level dependencies

  • Python 3
  • more to come

2. Clone this repo

git clone https://github.com/lamthuyvo/social-media-data-book.git
cd social-media-data-book

3. Install the required Python libraries

Optional but recommended: make a virtual environment using venv.

[more details about the computer setup to come]

Data files

While most code files are hosted in this repository, some data files were too large to be included here. Below are instructions on how to access them:

  • askscience_submissions.csv: This file is required for the data exercises in chapters 8 and 9. If you're working with a downloaded version of this repository, first create a folder named data inside the chapter08_09 folder, then download askscience_submissions.csv and place it inside that data folder. You can download the file here. The data was provided by data archivist Jason Baumgartner and represents a small sliver of the data he makes available to academics and researchers at Pushshift.io.

  • iranian_tweets_csv_hashed.csv: This file is required for the data exercises in chapter 10. If you're working with a downloaded version of this repository, first create a folder named data inside the chapter10 folder, then download iranian_tweets_csv_hashed.csv and place it inside that data folder. You can download the file here or directly from Twitter. You can find more information about this data on Twitter's elections integrity page.
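
The folder setup described above can also be scripted. The following is a minimal sketch, not part of the book's code: the function name prepare_data_folders and the EXPECTED_DATA_FILES mapping are hypothetical, and only the folder and file names come from the instructions above. It creates each data folder if needed and reports which data files still have to be downloaded by hand.

```python
from pathlib import Path

# Chapter folder -> data file name, as listed in the instructions above.
# The mapping itself is an illustrative helper, not part of the repository.
EXPECTED_DATA_FILES = {
    "chapter08_09": "askscience_submissions.csv",
    "chapter10": "iranian_tweets_csv_hashed.csv",
}


def prepare_data_folders(repo_root):
    """Create each chapter's data folder and return the data files still missing.

    repo_root is assumed to be the top-level folder of the cloned or
    downloaded repository.
    """
    missing = []
    for chapter, filename in EXPECTED_DATA_FILES.items():
        data_dir = Path(repo_root) / chapter / "data"
        # parents=True also creates the chapter folder if it is absent,
        # so run this from the real repository root to avoid stray folders.
        data_dir.mkdir(parents=True, exist_ok=True)
        target = data_dir / filename
        if not target.is_file():
            missing.append(target)
    return missing
```

Calling prepare_data_folders(".") from the repository root returns a list of paths; an empty list means both data files are already in place. The script does not download anything, so the files still need to be fetched from the links above.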

Content breakdown

[More to come]

Contact

Please feel free to contact me on GitHub or via [email protected]
