anthrax3 / node-google-search-scraper

This project forked from thibauts/node-google-search-scraper

Google search scraper with captcha solving support

License: MIT License

JavaScript 100.00%

google-search-scraper

This module extracts Google search results in a simple yet flexible way and handles captcha solving transparently, either through an external service or through your own hand-made solver.

Out of the box you can target a specific Google search host, specify a language, and limit the number of search results returned. These defaults can be extended with custom URL parameters through the params option.

A word of warning: this code is intended for educational and research use only. Use it responsibly.

Installation

$ npm install google-search-scraper

Examples

Grab the first 10 results for 'nodejs'

var scraper = require('google-search-scraper');

var options = {
  query: 'nodejs',
  limit: 10
};

scraper.search(options, function(err, url) {
  // This is called for each result
  if(err) throw err;
  console.log(url);
});
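Because the callback fires once per result, gathering the URLs into a single list takes a little bookkeeping. The following is a minimal sketch, not part of the module: collectResults and fakeSearch are hypothetical names, and the sketch assumes the callback is invoked at most limit times and that there is no separate completion signal (this README does not describe one). If fewer results than limit arrive, done is never called.

```javascript
// Collect up to options.limit URLs from a per-result callback API.
// `search` is any function shaped like search(options, function(err, url) {...}).
function collectResults(search, options, done) {
  var urls = [];
  var finished = false;

  search(options, function(err, url) {
    if (finished) return;           // ignore anything after we have reported
    if (err) {
      finished = true;
      return done(err, urls);       // hand back what was gathered so far
    }
    urls.push(url);
    if (urls.length >= options.limit) {
      finished = true;
      done(null, urls);
    }
  });
}

// Stand-in for scraper.search so the sketch runs without network access.
function fakeSearch(options, callback) {
  for (var i = 1; i <= options.limit; i++) {
    callback(null, 'https://example.com/result-' + i);
  }
}

collectResults(fakeSearch, { query: 'nodejs', limit: 3 }, function(err, urls) {
  if (err) throw err;
  console.log(urls.length + ' results collected');
});
```

With the real module, scraper.search would be passed in place of fakeSearch.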

Various options combined

var scraper = require('google-search-scraper');

var options = {
  query: 'grenouille',
  host: 'www.google.fr',
  lang: 'fr',
  age: 'd1', // last 24 hours ([hdwmy]\d? as in google URL)
  limit: 10,
  params: {} // params will be copied as-is in the search URL query string
};

scraper.search(options, function(err, url) {
  // This is called for each result
  if(err) throw err;
  console.log(url);
});
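To make the role of each option more concrete, here is a rough sketch of how such options could be mapped onto a search URL. It is an illustration only, not the module's actual code: buildSearchUrl is a hypothetical helper, the parameter names q, hl and tbs=qdr:<age> are assumed from Google's public URL format, the default host is an assumption, and safe is just an example of a custom parameter.

```javascript
var querystring = require('querystring');

// Hypothetical helper: one way the options could be turned into a URL.
// The names q, hl and tbs are assumptions, not taken from the module's source.
function buildSearchUrl(options) {
  var query = { q: options.query };
  if (options.lang) query.hl = options.lang;
  if (options.age) query.tbs = 'qdr:' + options.age;

  // Custom params are copied as-is and, in this sketch, override the defaults above.
  var params = options.params || {};
  Object.keys(params).forEach(function(key) {
    query[key] = params[key];
  });

  var host = options.host || 'www.google.com'; // assumed default host
  return 'https://' + host + '/search?' + querystring.stringify(query);
}

console.log(buildSearchUrl({
  query: 'grenouille',
  host: 'www.google.fr',
  lang: 'fr',
  age: 'd1',
  params: { safe: 'off' }
}));
```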

Extract all results on edu sites for "information theory" and solve captchas along the way

var scraper = require('google-search-scraper');
var DeathByCaptcha = require('deathbycaptcha');

var dbc = new DeathByCaptcha('username', 'password');

var options = {
  query: 'site:edu "information theory"',
  age: 'y', // less than a year old
  solver: dbc
};

scraper.search(options, function(err, url) {
  // This is called for each result
  if(err) throw err;
  console.log(url);
});

You can plug in your own solver by implementing a solve method with the following signature:

var customSolver = {
  solve: function(imageData, callback) {
    // Do something with the image data, like displaying it to the user
    var err = null;        // set to an Error if solving failed
    var solutionText = ''; // the captcha text you obtained
    // id is used by DBC to allow reporting solving errors and can be safely ignored here
    var id = null;
    callback(err, id, solutionText);
  }
};
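As one possible concrete solver, the sketch below saves the captcha image to a temporary file and asks a human to type the answer. Everything beyond the solve(imageData, callback) signature is an assumption: createManualSolver and ask are hypothetical names, imageData is assumed to be a Buffer (or anything fs.writeFileSync accepts), and the .jpg extension is a guess at the image format.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Build a solver that saves the captcha image and asks a human for the text.
// `ask(question, cb)` is a hypothetical prompt function supplied by the caller;
// cb receives the typed answer as a string.
function createManualSolver(ask) {
  return {
    solve: function(imageData, callback) {
      // The .jpg extension is an assumption about the image format.
      var file = path.join(os.tmpdir(), 'captcha-' + Date.now() + '.jpg');
      try {
        fs.writeFileSync(file, imageData);
      } catch (err) {
        return callback(err);
      }
      ask('Open ' + file + ' and type the text: ', function(answer) {
        // No solving service is involved, so there is no id to report.
        callback(null, null, String(answer).trim());
      });
    }
  };
}

// Possible usage with Node's readline module:
// var readline = require('readline');
// var rl = readline.createInterface({ input: process.stdin, output: process.stdout });
// var options = { query: 'nodejs', solver: createManualSolver(rl.question.bind(rl)) };
```

Passing the prompt function in as an argument keeps the solver independent of any particular user interface, so the same object could be driven from a terminal, a web form, or a test stub.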

Contributors

rodrigograca31, thibauts
