steve-wilson / get-tweets

Single Python script to get tweet JSON objects from a list of tweet IDs

License: GNU General Public License v3.0

Python 100.00%

get-tweets

Simple Python (v3.7+) script to get tweet JSON objects from a file containing a list of tweet IDs, one per line.

Input: a single column plain text file, or a csv/tsv file where one of the columns contains the tweet IDs.

Output: A file containing the full JSON object for each tweet, one JSON object per line.
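Since the output holds one JSON object per line, it can be read back with the standard json module. The following is a minimal sketch, not part of this repo; the function name read_tweets is hypothetical.

```python
import json


def read_tweets(path):
    """Yield one tweet dict per non-empty line of a JSON-lines file.

    `read_tweets` is a hypothetical helper, not part of get_tweets_by_id.py.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)
```

For example, list(read_tweets("my_tweets.json")) would load every tweet into memory, while iterating over the generator processes one tweet at a time.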

Purpose: For cases where you want to receive a Twitter dataset from someone, but they are only allowed to share the IDs with you due to the Twitter API terms of service.

Setup

Install tweepy

Run pip install tweepy

Get your API keys

Learn about the Twitter API, including how to apply for your own keys, in the Twitter developer documentation.

Put them in a file of your choosing, but the format should match the example_keys.txt file included in this repo.

Note: Don't push the file containing your keys to a public repository! Keys should be kept private.

Usage

usage: get_tweets_by_id.py [-h] [--sep SEP] [--col COL] [--quiet]
                           tweet_id_file output_json_file keys_file

positional arguments:
  tweet_id_file      Path to file containing Tweet Ids, one per line.
  output_json_file   Where to create output file.
  keys_file          Path to keys file. (see README or example_keys.txt)

optional arguments:
  -h, --help         show this help message and exit
  --sep SEP, -s SEP  Column separator in tweet_id_file. (default:tab)
  --col COL, -c COL  0-based index of column of tweet_id_file where Tweet Id
                     can be found. (default:0)
  --quiet, -q        Don't print progress messages while collecting tweets.
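To illustrate what --sep and --col mean, here is a rough sketch of how an ID could be picked out of one input line. It assumes a plain string split on the separator, which may differ from what the script actually does; the function name extract_id is hypothetical.

```python
def extract_id(line, sep="\t", col=0):
    """Return the tweet ID found in column `col` (0-based) of `line`.

    Hypothetical helper: assumes a plain split on `sep`, with no
    awareness of quoted fields.
    """
    fields = line.rstrip("\r\n").split(sep)
    return fields[col].strip()
```

With the defaults, a single-column file works unchanged because a line without tabs splits into exactly one field. For a comma-separated file with the ID in the second column, the equivalent call would use sep="," and col=1.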

Example

Suppose your tweet IDs are in a file called tweet_ids.txt, with one ID per line, and you have added your keys to a file called my_keys.txt in the same format as the example_keys.txt file. Then you can run:

python get_tweets_by_id.py tweet_ids.txt my_tweets.json my_keys.txt

and a file called my_tweets.json will be created containing the JSON object for each available tweet in your tweet_ids.txt file.

FAQ

I have N tweet IDs, why am I only getting <N tweets?

You can only retrieve tweets that are still publicly available on Twitter. This excludes deleted tweets, tweets from users who have been banned, and tweets from users who have switched their accounts to private. All of these are fairly common, so you are unlikely to get every tweet, and the share of unavailable tweets grows over time.
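To see which IDs were not retrieved, the input and output files can be compared. This sketch assumes a single-column ID file and assumes each output object carries the tweet ID as a string in an "id_str" field, as Twitter API v1.1 tweet objects do. The function name missing_ids is hypothetical.

```python
import json


def missing_ids(id_path, json_path):
    """Return requested IDs that have no matching object in the output file.

    Hypothetical helper. Assumes one ID per line in `id_path` and an
    "id_str" field on every JSON object in `json_path`.
    """
    with open(id_path, encoding="utf-8") as handle:
        requested = [line.strip() for line in handle if line.strip()]

    retrieved = set()
    with open(json_path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                retrieved.add(json.loads(line)["id_str"])

    # Preserve the order of the original ID file.
    return [tweet_id for tweet_id in requested if tweet_id not in retrieved]
```

Calling missing_ids("tweet_ids.txt", "my_tweets.json") would list the IDs whose tweets were unavailable at collection time.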

Will the csv mode work if I'm using quoted fields that contain commas?

No, this is not currently supported. It only works if the tweet ID column comes before any column that may contain quoted commas. The easiest solution may be to copy the single column of tweet IDs into a new file and use that as your input. You can also do smarter CSV handling with Python's csv module.
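One way to produce that single-column file is with Python's csv module, which understands quoted fields. This is a sketch rather than part of the repo; the function name extract_id_column and its parameters are hypothetical.

```python
import csv


def extract_id_column(csv_path, out_path, col=0, delimiter=",", skip_header=False):
    """Write column `col` (0-based) of a CSV file to `out_path`, one ID per line.

    Hypothetical helper. Returns the number of IDs written.
    """
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        if skip_header:
            next(reader, None)  # drop the header row if there is one
        for row in reader:
            # Skip short or empty rows instead of failing on them.
            if len(row) > col and row[col].strip():
                dst.write(row[col].strip() + "\n")
                count += 1
    return count
```

The resulting file can then be passed to get_tweets_by_id.py as tweet_id_file with the default --sep and --col values.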

get-tweets's People

Contributors

steve-wilson
