mta-exam-scraper

Scrape exam information from the MTA website.

Demo

asciicast

Requirements

  • python >= 2.7 && python < 3.0
  • python-dev (if you are running on linux)
  • pip
  • scrapy == 0.24.4

Installing the Python prerequisites on Linux

sudo apt-get install python python-dev python-pip

Installing the Python prerequisites on Mac (via brew)

brew install python

Setup

sudo pip install scrapy==0.24.4
git clone https://github.com/ranl/mta-exam-scraper.git

Usage

cd mta-exam-scraper

# print all the megamot ids & names
scrapy crawl exam_spider -t jsonlines -o - -a only_list_megama=1 --loglevel=ERROR

# scrape all the exams from MTA and print the items as JSON lines to STDOUT
scrapy crawl exam_spider -t jsonlines -o - --loglevel=ERROR

# crawl all the exams in MTA and print debug logs
scrapy crawl exam_spider

# crawl only the megama with ID 50000118 (Computer Science)
scrapy crawl exam_spider -a megama=50000118

# crawl all the exams in MTA and write each item as a one-line JSON object to /tmp/data_from_crawler.json
scrapy crawl exam_spider -o /tmp/data_from_crawler.json -t jsonlines

# print more help about the scrapy crawl command
scrapy crawl exam_spider --help
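
Reading the jsonlines output

The jsonlines format used above stores one JSON object per line, so the output can be loaded with Python's standard json module alone. The following is a minimal sketch, not part of the project: the load_items helper is hypothetical, and the field names in the sample are assumptions, since the real keys depend on the Item definition in the spider.

```python
import json


def load_items(lines):
    """Parse an iterable of JSON Lines strings into a list of dicts."""
    items = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        items.append(json.loads(line))
    return items


# Hypothetical sample records; the real field names come from the spider's Item.
sample = [
    '{"megama": "50000118", "name": "Computer Science"}',
    '',
    '{"megama": "1", "name": "Example"}',
]

items = load_items(sample)
print("loaded %d items" % len(items))
```

To read the file written by the crawl command, pass an open file handle instead of the sample list, for example load_items(open("/tmp/data_from_crawler.json")). The sketch avoids Python 3 only syntax so that it also runs on the Python 2.7 interpreter the project requires.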
