crowdtangle_domain_pull's Introduction

Instructions

  1. Clone this GitHub repository.

  2. Download and install Python 3 (see https://www.python.org/downloads/).

  3. Install the requirements using pip:

    pip3 install -r requirements.txt

  4. Create a CrowdTangle API key (see: https://authority.site/a-beginners-guide-to-the-crowdtangle-api-565/).

  5. Add the API key to config.py, e.g. by:

    mv example_config.py config.py

    and then editing config.py to add your API key.

  6. Run the script using the instructions below, e.g.:

    python3 query_crowdtangle.py -f iffy_domains.txt

To use the script

Usage: query_crowdtangle.py [options]

Options:
  -h, --help            show this help message and exit
  -s START_DATE, --start_date=START_DATE
                        Optional. Start date for querying, in the form %Y-%m-%d.
                        Defaults to Today - 1 week
  -e END_DATE, --end_date=END_DATE
                        Optional. End date for querying, in the form %Y-%m-%d.
                        Defaults to Today
  -q QUERY, --query=QUERY
                        Optional. Query string, boolean style e.g. "biden AND (vote OR
                        ballot)". Defaults to the empty string. 
  -d DOMAINS, --domains=DOMAINS
                        Optional. Comma separated list of domains (e.g.
                        nytimes.com,breitbart.com). Need either -d or -f flag to run.
  -f DOMAIN_FILE, --domain_file=DOMAIN_FILE
                        Optional. File with list of domains, one domain per line. Need either -f or -d to run. 
  -o OUTPUT_FILE, --output_file=OUTPUT_FILE
                        Optional. Name of the output file to output results (defaults to
                        output/posts_<query>_<datetime>.tsv)
  -p, --include_page_info
                        Optional. Include page related info for query. Defaults to False. 
  -r PAGE_FILE, --page_file=PAGE_FILE
                        Optional. Name of the page file to output page results. Defaults
                        to output/pages_<query>_<datetime>.tsv. 
  -c COUNT, --count=COUNT
                        Optional. Number of posts to return. Defaults to 20. 
  -l LIMIT, --limit=LIMIT
                        Optional. Number of URLs per domain to return (for variety). Defaults to no limit.  
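The defaults described above for -s, -e, -d and -f can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the script's actual code: the function names `resolve_date_range` and `parse_domains`, and the `today` parameter (added so the example is repeatable), are assumptions made for this sketch.

```python
from datetime import date, datetime, timedelta

DATE_FORMAT = "%Y-%m-%d"  # the format documented for -s and -e


def resolve_date_range(start_date=None, end_date=None, today=None):
    """Return (start, end) dates, applying the documented defaults.

    Hypothetical helper: the end date defaults to today, the start date
    defaults to today minus one week.
    """
    today = today or date.today()
    end = datetime.strptime(end_date, DATE_FORMAT).date() if end_date else today
    start = (
        datetime.strptime(start_date, DATE_FORMAT).date()
        if start_date
        else today - timedelta(weeks=1)
    )
    return start, end


def parse_domains(domains=None, domain_lines=None):
    """Merge a -d style comma separated string and -f style lines.

    Hypothetical helper: blank entries are dropped, and an error is raised
    when neither source yields a domain, mirroring "Need either -d or -f".
    """
    result = []
    if domains:
        result.extend(d.strip() for d in domains.split(","))
    if domain_lines:
        result.extend(line.strip() for line in domain_lines)
    result = [d for d in result if d]
    if not result:
        raise ValueError("Need either -d or -f to run")
    return result
```

For a domain file, `domain_lines` can simply be the open file object, e.g. `parse_domains(domain_lines=open("iffy_domains.txt"))`, since iterating over a file yields one line at a time.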

Example:

python3 query_crowdtangle.py -f iffy_domains.txt -c 100 -l 5 -q "biden AND vote"

==> Returns a TSV in the "output" folder of the top 100 posts by engagement, with unique URLs, from domains in the file "iffy_domains.txt" that contain both "biden" and "vote". At most 5 URLs per domain are returned, for variety.
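To make the interaction of -c and -l concrete, here is a minimal sketch of one way such a selection could work. It is an assumption about the behavior described above, not the script's actual implementation: the function name `select_top_posts` and the post fields `url`, `domain` and `engagement` are hypothetical.

```python
from collections import Counter


def select_top_posts(posts, count=20, limit=None):
    """Pick up to `count` posts by descending engagement.

    Hypothetical sketch: each URL is kept only once, and when `limit` is
    set, at most `limit` URLs are kept per domain.
    """
    selected = []
    seen_urls = set()
    per_domain = Counter()
    # Visit posts from highest to lowest engagement.
    for post in sorted(posts, key=lambda p: p["engagement"], reverse=True):
        if len(selected) >= count:
            break
        url, domain = post["url"], post["domain"]
        if url in seen_urls:
            continue  # keep URLs unique
        if limit is not None and per_domain[domain] >= limit:
            continue  # this domain already has its share of URLs
        seen_urls.add(url)
        per_domain[domain] += 1
        selected.append(post)
    return selected
```

With this approach, a domain that dominates the engagement ranking cannot crowd out the rest of the list, which is the "variety" the -l option is meant to provide.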

Contributors

jennyallen
