jayvdb / strudel.scraper

This project forked from cmustrudel/strudel.scraper

Python interfaces to GitHub, Bitbucket and Gitlab API

License: GNU General Public License v3.0

Makefile 0.87% Python 99.13%

strudel.scraper's Introduction

Python interface for code hosting platform APIs

It is intended to facilitate research on open source projects. At this point, it is basically functional but is missing:

  • tests
  • documentation
  • good architecture

Feel free to contribute any of those.

Installation

pip install --user --upgrade strudel.scraper

Usage

import stscraper as scraper
import pandas as pd

gh_api = scraper.GitHubAPI()
# so far only GitHub, Bitbucket and GitLab are supported
# bb_api = scraper.BitbucketAPI()
# gl_api = scraper.GitLabAPI()

# repo_issues is a generator that can be used
# to instantiate a pandas dataframe
issues = pd.DataFrame(gh_api.repo_issues('cmustrudel/strudel.scraper'))
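
Because repo_issues is a generator, records can be consumed lazily, which helps when only a sample is needed and the request budget is limited. A minimal sketch of that idea follows. It uses a hypothetical stand-in generator: fake_repo_issues and its record fields are assumptions for illustration, not actual stscraper output.

```python
from itertools import islice


def fake_repo_issues(repo_slug):
    """Hypothetical stand-in for gh_api.repo_issues(); yields dict records."""
    number = 1
    while True:  # pretend the repository has an unbounded number of issues
        yield {"repo": repo_slug, "number": number, "state": "open"}
        number += 1


def take(records, limit):
    """Return at most `limit` records without exhausting the generator."""
    return list(islice(records, limit))


# only three records are ever pulled from the generator
sample = take(fake_repo_issues("cmustrudel/strudel.scraper"), 3)
```

The resulting list of dicts can be passed to pd.DataFrame in the same way as the generator itself.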

Settings

The GitHub and GitLab APIs rate-limit unauthenticated requests (although GitLab's limit is much more generous). There are several ways to set your API keys, listed below in order of priority.

Important note: API objects are reused in subsequent calls. The same keys used to instantiate the first API object will be used by ALL other instances.
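
The practical consequence of this note can be illustrated with a shared-instance pattern. The sketch below uses a hypothetical SharedAPI class. It is not stscraper's code and makes no claim about how the library implements reuse; it only shows why tokens passed to a later instance may have no effect.

```python
class SharedAPI:
    """Hypothetical class that reuses the first instance ever created."""

    _instance = None  # class-level cache shared by all callers

    def __new__(cls, tokens=None):
        if cls._instance is None:
            instance = super().__new__(cls)
            instance.tokens = tokens  # stored only on first creation
            cls._instance = instance
        return cls._instance

    def __init__(self, tokens=None):
        # intentionally empty: tokens are set once, in __new__
        pass


first = SharedAPI(tokens="token-a")
second = SharedAPI(tokens="token-b")  # this tokens argument is ignored
```

Under this assumption, both names refer to the same object, so the safest approach is to configure tokens before the first API object is created.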

Class instantiation:

import stscraper

gh_api = stscraper.GitHubAPI(tokens="comma-separated list of tokens")
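
If the tokens are held in a Python list, the comma-separated string can be built before instantiation. A minimal sketch follows. join_tokens is a hypothetical helper, the token values are placeholders, and the sketch assumes a string without whitespace around the commas is acceptable.

```python
def join_tokens(tokens):
    """Build a comma-separated string from an iterable of token strings."""
    cleaned = (token.strip() for token in tokens)
    # drop empty entries so the result has no stray commas
    return ",".join(token for token in cleaned if token)


token_string = join_tokens(["token-a ", "", " token-b"])
# token_string can then be passed as: stscraper.GitHubAPI(tokens=token_string)
```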

At runtime:

import stscraper
import stutils

# IMPORTANT: do this before creation of the first API object!
stutils.CONFIG['GITHUB_API_TOKENS'] = 'comma-separated list of tokens'
stutils.CONFIG['GITLAB_API_TOKENS'] = 'comma-separated list of tokens'

# any API instance created after this will use the provided tokens
gh_api = stscraper.GitHubAPI()

Settings file:

project root
 \
  |- my_module
  |   \- my_file.py
  |- settings.py

# settings.py

GITHUB_API_TOKENS = 'comma-separated list of tokens'
GITLAB_API_TOKENS = 'comma-separated list of tokens'

# my_file.py
import stscraper

# keys from settings.py will be reused automatically
gh_api = stscraper.GitHubAPI()

Environment variable:

# somewhere in ~/.bashrc
export GITHUB_API_TOKENS='comma-separated list of tokens'
export GITLAB_API_TOKENS='comma-separated list of tokens'

# somewhere in the code
import stscraper

# keys from environment variables will be reused automatically
gh_api = stscraper.GitHubAPI()

Hub config:

If you have hub installed and none of the methods above provides tokens, its configuration will be reused for the GitHub API.
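
The priority order described in this section can be sketched as a first-match lookup. This is an illustration under stated assumptions, not the library's implementation: resolve_tokens is a hypothetical function, the settings file is modelled as a plain dict, and the hub fallback is left out.

```python
import os


def resolve_tokens(explicit=None, runtime_config=None, settings=None, environ=None):
    """Return the first non-empty token string, checking sources in priority order."""
    key = "GITHUB_API_TOKENS"
    environ = os.environ if environ is None else environ
    candidates = (
        explicit,                          # 1. class instantiation argument
        (runtime_config or {}).get(key),   # 2. value set at runtime
        (settings or {}).get(key),         # 3. settings file (modelled as a dict)
        environ.get(key),                  # 4. environment variable
    )
    for value in candidates:
        if value:
            return value
    return None  # nothing configured; a hub fallback would go here


# the explicit argument wins over the environment variable
chosen = resolve_tokens(explicit="token-a", environ={"GITHUB_API_TOKENS": "token-d"})
```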

Contributors

user2589
