
gravityfire-tw / scraper


This project forked from mnmldave/scraper


Simple web scraping for Google Chrome.

Home Page: http://mnmldave.github.com/scraper

License: BSD 3-Clause "New" or "Revised" License

Ruby 0.77% HTML 2.78% JavaScript 93.44% CSS 3.01%


Scraper

A Google Chrome extension for getting data out of web pages and into spreadsheets.

Usage

Highlight a part of the page that is similar to what you want to scrape. Right-click and select the "Scrape selected..." item. The scraper window will appear, showing you the initial results. You can export the table by pressing the "Export to Google Docs..." button, or use the left-hand pane to further refine or customize your scraping.

The "Selector" section lets you change which page elements are scraped. You can specify the query either as a jQuery selector or as an XPath expression.

You may also customize the columns of the table in the "Columns" section. Columns must be specified in XPath. You can optionally give each column a name.

Selecting the "Exclude empty results" filter will prevent any matches that contain no column values from appearing in the table.
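The way the selector, the columns, and the "Exclude empty results" filter combine can be sketched outside the browser. The following Python is only an illustration and not the extension's code. It assumes that each column XPath is evaluated relative to each element matched by the selector, and it uses the limited XPath subset in Python's standard library. The page fragment, the `scrape` function, and the column names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for this sketch.
PAGE = """
<table>
  <tr><td>Ada</td><td><a href="/ada">profile</a></td></tr>
  <tr><td>Grace</td><td><a href="/grace">profile</a></td></tr>
  <tr><td></td><td></td></tr>
</table>
"""


def scrape(root, selector, columns, exclude_empty=True):
    """Return one dict per element matched by `selector`.

    `columns` maps a column name to an XPath evaluated relative to each
    match (an assumption about how the extension behaves).
    """
    rows = []
    for match in root.findall(selector):
        row = {}
        for name, path in columns.items():
            node = match.find(path)
            # A missing node or empty text both count as an empty value.
            row[name] = (node.text or "").strip() if node is not None else ""
        # Mirror the "Exclude empty results" filter: skip matches in
        # which no column produced a value.
        if exclude_empty and not any(row.values()):
            continue
        rows.append(row)
    return rows


root = ET.fromstring(PAGE)
columns = {"Name": "td[1]", "Link text": "td[2]/a"}
rows = scrape(root, ".//tr", columns)
for row in rows:
    print(row)
```

With the filter on, the third table row is dropped because neither column yields a value. Passing `exclude_empty=False` keeps it as a row of empty strings.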

After making any customizations, you must press the "Scrape" button to update the table of results.

Download

Download the extension from http://chrome.google.com/extensions/detail/mbigbapnjcgaffohmbkdlecaccepngjd.

Get the sources from https://github.com/mnmldave/scraper.

Building

You don't need to 'build' this extension per se. To test it out, navigate to chrome://extensions in Google Chrome and expand "Developer Mode". Then click the "Load unpacked extension..." button and point it to the src directory.

Learn more about plugin development from the Google Chrome Extensions page.

A Rakefile is included for compiling the Google Chrome extension into a zip file. It also minifies the JavaScript and CSS.
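For readers without Ruby, the zipping step alone can be approximated as below. This is a rough sketch and not a port of the Rakefile: it does no minification, the function name `package_extension` and the output file name are hypothetical, and it assumes the src directory holds the extension's manifest.json at its top level so that the manifest ends up at the root of the archive.

```python
import zipfile
from pathlib import Path


def package_extension(src_dir, zip_path):
    """Zip every file under `src_dir`, storing paths relative to it."""
    src = Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Relative names keep manifest.json at the archive root.
                archive.write(path, path.relative_to(src).as_posix())
    return zip_path


# Example call (hypothetical output name), run from the repository root:
# package_extension("src", "scraper.zip")
```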

License

Scraper is open-sourced under a BSD license which you can find in LICENSE.txt.

Credits

Many of the icons used in this extension are from the generous Yusuke Kamiyamane.


Copyright (c) 2010 David Heaton
