Comments (3)

opsdisk commented on July 20, 2024

I tried adding some logic to detect an HTTP 503 error and do some backoff throttling, but the underlying request is made in the googlesearch module. You may be able to spread the load across a couple of VPS servers, or I could add an option to round-robin through a list of HTTP proxies, but again, the underlying request is made in the googlesearch module (https://github.com/MarioVilas/googlesearch/blob/master/googlesearch/__init__.py#L124).
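
For anyone who wants to experiment with the backoff idea from outside the googlesearch module, here is a rough sketch of retrying a call on HTTP 503 with exponential backoff and jitter. It is not pagodo's code. `call_with_backoff`, `fetch`, and all the default values are hypothetical names and numbers chosen for illustration, and it assumes the 503 reaches the caller as a `urllib.error.HTTPError`, which may not hold for every version of the module.

```python
import random
import time
from urllib.error import HTTPError


def call_with_backoff(fetch, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fetch(); on HTTP 503, wait and retry with exponential backoff.

    fetch is a hypothetical zero-argument callable that performs the request
    and fully consumes its results before returning them.
    """
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except HTTPError as err:
            # Only throttle on 503; re-raise other errors, or when out of retries.
            if err.code != 503 or attempt == max_retries:
                raise
            # The delay doubles on each attempt and is capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 50% random jitter so retries do not line up in time.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter is only there so the wait can be swapped out when trying the function without real delays. Slowing down on a 503 reduces load on the server, but it does not by itself guarantee the requests will be accepted again.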

from pagodo.

cr4zyd3v commented on July 20, 2024

I tried to bypass Google's bot detection by using random country domains such as google.cz, google.pl, and google.com.br, randomized proxies, random user agents, and a random sleep time between requests, but I still got banned. Some dorks seem to trigger Google's detection, like inurl:".php" or "site:xxx.com", but if you try simple requests like "?id=foo" (without a file type), Google does not seem to treat them as bot traffic. I did not test this much, so I can't confirm it. I heard that the v3n0m project has a captcha solver, but I guess it would be hard to implement in pagodo :d

opsdisk commented on July 20, 2024

Just pushed some updates to master. I've had success running this lately with the new default values, though it may take around 4 days to complete. I'm going to close this for now, unless you have some new data I can work with.
