
Comments (24)

sayem314 avatar sayem314 commented on June 15, 2024 1

@andress134 can you be more specific about what you are trying to achieve? I don't think your question is related to this issue, so for further discussion please open a new issue with more details. Your code was also unreadable, so I had to edit it a little.

Based on your code example, here is how you use a proxy.

const fs = require('fs'),
  got = require('hooman'),
  path = require('path'),
  HttpsProxyAgent = require('https-proxy-agent');

const target = process.argv[2],
  time = process.argv[3],
  req_per_ip = process.argv[4];

let proxies = fs
  .readFileSync(process.argv[5], 'utf-8')
  .replace(/\r/gi, '')
  .split('\n')
  .filter(Boolean);

function send_req() {
  // Keep the raw proxy string; the agent object is never stored in `proxies`.
  const proxy = proxies[Math.floor(Math.random() * proxies.length)];
  const agent = new HttpsProxyAgent('http://' + proxy);

  return new Promise((resolve, reject) => {
    got(target, {
      agent: {
        https: agent,
      },
      cloudflareRetry: 10,
    })
      .then((response) => {
        console.log(response.body);
        resolve(response);
      })
      .catch((error) => {
        // Remove the failing proxy; skip if it was already removed,
        // because splice(-1, 1) would delete the last entry instead.
        const index = proxies.indexOf(proxy);
        if (index !== -1) proxies.splice(index, 1);
        console.log(error.message);
        return reject(error.message);
      });
  });
}

Proxy docs: https://github.com/sindresorhus/got#proxies
Proxy module: https://www.npmjs.com/package/https-proxy-agent
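
A minimal sketch of the proxy bookkeeping in the example above, assuming the proxy file holds one `host:port` entry per line. The helper names `parseProxyList`, `pickProxy` and `dropProxy` are hypothetical and not part of hooman or got. The point it illustrates is that the raw string, not the agent object built from it, is the value to look up when a failing proxy is removed.

```javascript
// Hypothetical helpers; none of these names come from hooman or got.

// Split a proxy file into clean "host:port" entries, ignoring blank lines.
function parseProxyList(text) {
  return text
    .replace(/\r/g, '')
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean);
}

// Pick a random entry; `random` is injectable so the choice can be tested.
function pickProxy(proxies, random = Math.random) {
  return proxies[Math.floor(random() * proxies.length)];
}

// Remove a failing proxy in place. indexOf returns -1 when the value is
// missing, and splice(-1, 1) would delete the last entry, so guard it.
function dropProxy(proxies, proxy) {
  const index = proxies.indexOf(proxy);
  if (index !== -1) proxies.splice(index, 1);
  return proxies;
}
```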

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024 1

@andress134 the mentioned sites work fine in my tests.

image

Please don't continue this discussion in this issue; create a new issue and I'll be happy to assist you.

The site returns a Cloudflare challenge for me in the browser, and I have verified that hooman bypasses it successfully.

from hooman.

linaspasv avatar linaspasv commented on June 15, 2024 1

@sayem314 thank you for your help. Looking forward to trying this with .stream(). :-)

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024

Set responseType to 'buffer'.

Docs:
https://github.com/sindresorhus/got#responsetype

Example:

const { body } = await got(url, { responseType: 'buffer' });
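
When the body arrives as a Buffer, it can be worth checking that it really is an image before writing it to disk, since a blocked request may return an HTML challenge page instead. The following is a minimal sketch using only Node's Buffer API. `looksLikeJpeg` is a hypothetical helper, not something hooman or got provides, and it only checks the JPEG start-of-image bytes (FF D8 FF).

```javascript
// Hypothetical helper, not part of hooman or got.
// JPEG data starts with the bytes FF D8 FF (start-of-image marker).
function looksLikeJpeg(body) {
  return (
    Buffer.isBuffer(body) &&
    body.length >= 3 &&
    body[0] === 0xff &&
    body[1] === 0xd8 &&
    body[2] === 0xff
  );
}
```

A caller could log the first few hundred characters of `body.toString()` when the check fails, which makes an unexpected HTML page easy to spot.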

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024

You can also pipe the stream to a file.

const { createWriteStream } = require("fs");
const got = require("hooman");

(async () => {
  const image = createWriteStream("image.jpg");
  got
    .stream("https://c.pxhere.com/images/11/49/74e4a31de6abe70227fa1cb22d37-1612083.jpg!d")
    .pipe(image);
})();
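
A bare `.pipe()` does not forward errors from the source stream, so a failed request can leave a partial file behind. The following is a rough sketch of one way to handle that with Node's built-in `stream.pipeline`. `saveStream` is a hypothetical helper that accepts any readable stream (for example the one returned by `got.stream(url)`); it is not part of hooman.

```javascript
const { createWriteStream, promises: fsp } = require('fs');
const { pipeline } = require('stream');
const { promisify } = require('util');

const pipelineAsync = promisify(pipeline);

// Hypothetical helper: write any readable stream to `destPath`.
// On failure the partial file is removed (best effort) and the
// original error is rethrown to the caller.
async function saveStream(source, destPath) {
  try {
    await pipelineAsync(source, createWriteStream(destPath));
  } catch (err) {
    await fsp.unlink(destPath).catch(() => {});
    throw err;
  }
}
```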

from hooman.

linaspasv avatar linaspasv commented on June 15, 2024

Hm, still getting 415 (Unsupported Media Type) :/

const got = require("hooman");

let url = "https://c.pxhere.com/images/11/49/74e4a31de6abe70227fa1cb22d37-1612083.jpg!d"

got(url, { responseType: 'buffer' }).then(response => {
    console.log(response.body)
})

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024

@linaspasv can you please try the stream example I posted afterwards? Also, this library is just a wrapper around got that bypasses the Cloudflare js-challenge, so request-related issues are best tested with the got library first and reported there.

from hooman.

linaspasv avatar linaspasv commented on June 15, 2024

With the stream example I get 503 (Service Unavailable) error.

from hooman.

linaspasv avatar linaspasv commented on June 15, 2024

The image I am trying to access is behind Cloudflare's anti-DDoS protection page. I have tried your script with a regular HTML page and it works perfectly, but it fails for a direct image download. I am not sure whether the issue is with this library or with got.

image

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024

I have tested it locally and it works fine for me. In fact, I have gone ahead and added this to the tests, and it passes there as well. f9aa0e5

Edit: Travis > https://travis-ci.org/github/sayem314/hooman/jobs/685101120

from hooman.

linaspasv avatar linaspasv commented on June 15, 2024

Do you get challenged by cloudflare? When I run the same code (see below) on an image that has no cloudflare protection it works perfectly. pxhere.com does not give a challenge page for residential IP addresses but when you run this on some servers you get challenged. :-)

const got = require('hooman');
const fs = require('fs');

(async () => {

    //let resource = 'https://c.pxhere.com/images/11/49/74e4a31de6abe70227fa1cb22d37-1612083.jpg!d';
    let resource = 'https://explorecams.com/storage/photos/GdmaI9FIbe_1600.jpg';

    got.stream(resource)
        .on('error', err => console.log(err))
        .pipe(
            fs.createWriteStream('image.jpg')
        )
})();

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024

I tested from a residential IP, where their main domain did not throw a js-challenge, so I went ahead and tested on my own domain with the Cloudflare challenge enabled, where js-challenges are always thrown.

Test code:

const test = require("tape");
const scrape = require("hooman");
const { writeFileSync, statSync } = require("fs");

const jsChallengePage = "https://cf-js-challenge.sayem.eu.org";

// Test image download
test("sample image download", async t => {
  console.time("image download");
  const { body } = await scrape(jsChallengePage + "/images/background.jpg", {
    responseType: "buffer"
  });
  console.timeEnd("image download");

  // Write to file
  t.ok(Buffer.isBuffer(body));
  writeFileSync("image.jpg", body);

  // Check image size
  const { size } = statSync("image.jpg");
  t.equal(size, 31001);
});

Note that I removed the other tests to get a fair result.

Test result with console log for easy debugging:
image

  • Btw, the Travis tests run from a datacenter IP.

Here is updated test code and results:
9776a07

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024

A possible fix for you. I'm not sure what's causing your issue, but give this a try:

const got = require('hooman');
const fs = require('fs');

(async () => {
    await got('https://explorecams.com') // init cookie

    let resource = 'https://explorecams.com/storage/photos/GdmaI9FIbe_1600.jpg';
    got.stream(resource)
        .on('error', err => console.log(err))
        .pipe(
            fs.createWriteStream('image.jpg')
        )
})();

from hooman.

linaspasv avatar linaspasv commented on June 15, 2024

No luck. Also, I have tried to run the same without hooman (see the source code below) and I end up with the same Response code 503 (Service Temporary Unavailable) error.
image

I have also tried to just curl and I get the challenge page code... so it seems your plugin is not triggered to solve the challenge page when I run this particular URL.

image

const got = require('got');
const fs = require('fs');

(async () => {
    let resource = 'https://c.pxhere.com/images/11/49/74e4a31de6abe70227fa1cb22d37-1612083.jpg!d';

    got.stream(resource)
        .on('error', err => console.log(err))
        .pipe(
            fs.createWriteStream('image.jpg')
        )
})();

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024

Can you send me the HTML of the challenge page?

from hooman.

linaspasv avatar linaspasv commented on June 15, 2024

Okay, so it seems my challenge page ends up in .on('error') and your plugin somehow does not pick it up. The challenge page for the image is the same as the one for a regular HTML page, which works perfectly with your library!

const got = require('hooman');
const fs = require('fs');

(async () => {
    let resource = 'https://c.pxhere.com/images/11/49/74e4a31de6abe70227fa1cb22d37-1612083.jpg!d';

    got.stream(resource)
        .on('error', err => console.log(err.response.body))
        .pipe(
            fs.createWriteStream('image.jpg')
        )
})();

I get the following output now.
cf-challenge.txt

Also, attaching received headers for that page.
image
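
A small sketch of how the failing response can be summarised inside the `error` handler, assuming the error object carries a `response` with `statusCode`, `headers` and `body`, as in the snippet above. `describeHttpError` is a hypothetical helper and not part of hooman or got.

```javascript
// Hypothetical helper: summarise an HTTP error for logging.
// Assumes err.response exposes statusCode, headers and body;
// falls back to the error message when there is no response.
function describeHttpError(err) {
  const response = err && err.response;
  if (!response) return { message: String(err && err.message) };

  const body = Buffer.isBuffer(response.body)
    ? response.body.toString('utf8')
    : String(response.body || '');

  return {
    statusCode: response.statusCode,
    server: response.headers && response.headers.server,
    contentType: response.headers && response.headers['content-type'],
    bodyPreview: body.slice(0, 200),
  };
}
```

It could be wired in as `.on('error', err => console.log(describeHttpError(err)))`, which shows the status, the `server` header and the start of the body in one place.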

from hooman.

linaspasv avatar linaspasv commented on June 15, 2024

This might be why your afterResponse hook is not being triggered; I am seeing the following results.

image

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024

I see, .stream() is unsupported, unfortunately. But did you try responseType: 'buffer' as shown in hooman's test.js? Btw your HTML is okay; hooman should be able to solve it without issue.

https://github.com/sayem314/hooman/blob/master/test.js#L41-L48

from hooman.

linaspasv avatar linaspasv commented on June 15, 2024

First of all, to make your library work with 'buffer', the buffer needs to be converted to a string inside the afterResponse hook first.

if (
  // If site is not hosted on cloudflare skip
  response.statusCode === 503 &&
  response.headers.server === "cloudflare" &&
  response.body.includes("jschl-answer")
) {
  let body = response.body instanceof Buffer
    ? response.body.toString()
    : response.body

  const data = await solve(response.url, body);

While this part is resolved I still get 415 (Unsupported Media Type) error when this line runs -

return instance({ ...response.request.options, ...data });
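
The body normalisation described in this comment can be sketched as a standalone predicate. This is an illustration based only on the conditions quoted above (status 503, a `server: cloudflare` header and a `jschl-answer` marker in the body), not hooman's actual source. `bodyToString` and `isJsChallenge` are hypothetical names.

```javascript
// Hypothetical helpers, written against the conditions quoted in this thread.

// Return the body as a string whether it arrived as a Buffer or as text.
function bodyToString(body) {
  if (Buffer.isBuffer(body)) return body.toString('utf8');
  return typeof body === 'string' ? body : '';
}

// True when the response looks like a Cloudflare js-challenge page.
function isJsChallenge(response) {
  return (
    response.statusCode === 503 &&
    response.headers.server === 'cloudflare' &&
    bodyToString(response.body).includes('jschl-answer')
  );
}
```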

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024

Converting is not necessary in the hook, since it should match only when the response is an HTML page. Something must be wrong on your end :( I have tested it on multiple datacenter IPs and VPNs, and it works for me every time.

As you can see, the Travis CI tests, which run from a shared datacenter IP, are passing as well, and my domain uses a custom filter to throw the Cloudflare challenge regardless of how clean your IP is.

from hooman.

andress134 avatar andress134 commented on June 15, 2024

// Fixed

from hooman.

andress134 avatar andress134 commented on June 15, 2024

// fixed

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024

Closing this issue as I was unable to reproduce it. BTW I was also able to get .stream() to work; I will update the instructions in the readme.

from hooman.

sayem314 avatar sayem314 commented on June 15, 2024

@linaspasv docs updated for stream https://github.com/sayem314/hooman#pipe-stream

from hooman.
