sayem314 / hooman
http interceptor to hoomanize cloudflare requests
Home Page: https://www.npmjs.com/package/hooman
License: MIT License
Hi, I'm trying to get HTML from these links:
https://igds.info/
https://ilgeniodellostreaming.wf/?s=kung+fu+panda
(old site: https://ilgeniodellostreaming.llc/?s=kung+fu+panda)
Unfortunately, these sites use different Cloudflare protections, and no program I have tried can get the HTML from them.
NB:
These sites can only be accessed from an Italian IP, but the error pages I received (HTML without the bypass) are attached.
ilgeniodellostreaming.txt
igds.txt
Thank you in advance and have a nice day : )
Nothing is being printed when I invoke this code.
node: v10.14.0
hooman: v1.2.5
OS: Windows 10
const hooman = require('hooman');
(async () => {
try {
const response = await hooman.get('https://kissmanga.com/Manga/Grand-Blue');
console.log(response.body);
console.log('I succeed');
//=> '<!doctype html> ...'
} catch (error) {
console.log(error.response.body);
console.log('I failed');
//=> 'Internal server error ...'
}
})();
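One possible reason a snippet like this prints nothing useful: when a request fails before any HTTP response arrives (DNS failure, refused connection, timeout), the error may carry no `response` property, so `error.response.body` itself throws inside the `catch` block. The following is a minimal sketch of a more defensive logger. `describeError` is a hypothetical helper name, not part of hooman or got.

```javascript
// Hypothetical helper: summarize a failed request without assuming
// that the error carries an HTTP response.
function describeError(error) {
  if (error && error.response && error.response.body !== undefined) {
    return `HTTP ${error.response.statusCode}: ${error.response.body}`;
  }
  // Network-level failures (DNS, refused connection, timeout) usually
  // only have a code and a message.
  const code = error && error.code ? `${error.code}: ` : '';
  const message = error && error.message ? error.message : String(error);
  return `${code}${message}`;
}

// Possible usage inside the catch block from the report:
//   } catch (error) {
//     console.log(describeError(error));
//     console.log('I failed');
//   }
```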
Hello, I'm trying to bypass UAM plus Bot Fight Mode and can't. Is hooman unable to do it?
Tested URL: https://audiograb.net/
//edit: the URL sometimes uses UAM + captcha (not for all IPs; when I use a VPN it only shows UAM)
How can I use hooman to bypass UAM and captcha at the same time? Is that possible?
return new Promise((resolve, reject) => {
hooman.get(url, {
agent: {
https: proxy,
},
cloudflareRetry: 10,
})
.then(response => {
resolve(response);
})
.catch((error) => {
console.log(error.response.body);
let obj_v = proxies.indexOf(proxy);
proxies.splice(obj_v, 1);
console.log(error.message);
return reject(error.message);
});
});
}
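The snippet above wraps a promise in `new Promise` and removes the failing proxy with `splice(indexOf(proxy), 1)`, which would remove the last list entry if the proxy is not found (`indexOf` returns -1). A minimal sketch of the same idea with `async`/`await` follows. `getWithProxyRotation` and `fetchWithProxy` are hypothetical names; in the report, `fetchWithProxy` would wrap the `hooman.get(url, { agent: { https: proxy }, cloudflareRetry: 10 })` call.

```javascript
// Sketch: try each proxy in turn and drop the ones that fail.
// `fetchWithProxy(url, proxy)` is a hypothetical async function supplied
// by the caller.
async function getWithProxyRotation(url, proxies, fetchWithProxy) {
  let lastError = new Error('no proxies available');
  // Iterate over a copy so removing entries from `proxies` is safe.
  for (const proxy of [...proxies]) {
    try {
      return await fetchWithProxy(url, proxy);
    } catch (error) {
      lastError = error;
      const index = proxies.indexOf(proxy);
      // Guard against -1: splice(-1, 1) would remove the last element.
      if (index !== -1) proxies.splice(index, 1);
    }
  }
  // Every proxy failed (or the list was empty).
  throw lastError;
}
```

Because the function returns the promise chain directly, a rejection reaches the caller without a manual `resolve`/`reject` pair.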
Hi there,
I'm currently trying to find a library able to scrape pages behind a cloudflare protection. I have a simple example set up in my code, but every time I run the code, it returns with:
Please turn JavaScript on and reload the page.
Here is my example code:
(async () => {
  try {
    const response = await hooman.get(
      "https://www.slamjam.com/en_BE/man/footwear/sneakers/low/nike-special-project/dunk-low-sp-sneakers/J188431.html"
    );
    console.log(response.body);
    //=> '<!doctype html> ...'
  } catch (error) {
    console.log(error.response.body);
    //=> 'Internal server error ...'
  }
})();
I can't seem to figure out how to fix this, I thought hooman would take care of the javascript challenge? Any help in the good direction is much appreciated!
Hello, Cloudflare has rolled out a new captcha challenge, and hooman is patched (no longer bypasses it) for the moment. Will hooman be updated? Will this project continue?
hooman fails to return the body of the new captcha challenge page, so the request ends up in the catch block.
Captcha page source: https://gist.github.com/christophernarciso/df3a3a8a0602b8426c42d0f64d8f276a
It should pass through the new captcha page with no issue and return the body of the page.
// Node environment testing
(async function main() {
try {
const url = 'https://osbot.org/forum/topic/157064-excellent-vorkath/';
const source = await hooman.get(url);
console.log(source.body);
//=> '<!doctype html> ...'
} catch (error) {
console.log(error.response.body);
//=> 'Internal server error ...'
}
})();
Hey,
I used codemanki/cloudscraper, but the author decided to archive the project, so I am looking for an alternative.
I track a lot of websites, and cloudscraper usually works fine, except for one website. Since the project is archived, the outlook for a fix is not good, so I tested your solution.
When I try to load https://vitals.com, I get this error:
{
"name": "HTTPError",
"timings": {
"start": 1589285452689,
"socket": 1589285453194,
"lookup": 1589285453195,
"connect": 1589285453195,
"secureConnect": 1589285453195,
"upload": 1589285453197,
"response": 1589285453359,
"end": 1589285453395,
"phases": {
"wait": 505,
"dns": 1,
"tcp": 0,
"tls": 0,
"request": 2,
"firstByte": 162,
"download": 36,
"total": 706
}
}
}
In the stack is:
HTTPError: Response code 403 (Forbidden)
    at PromisableRequest.request.once (/root/cloudflare-bot/node_modules/got/dist/source/as-promise/index.js:124:28)
    at process._tickCallback (internal/process/next_tick.js:68:7)
Could you tell me whether your library should work for this example and I am doing something wrong, or whether it does not support this site?
Hello,
Do you think it would be possible to expose a more flexible interface, so that the got version is not locked by this package?
This is what I have in mind:
const got = require('hooman')(require('got'))
I am trying the following and get 415 (Unsupported Media Type) error.
const got = require("hooman")
got('https://c.pxhere.com/images/11/49/74e4a31de6abe70227fa1cb22d37-1612083.jpg!d')
.then(response => {
console.log(response);
})
.catch(error => {
console.error(error);
});
Okay, the problem is the following: my script works perfectly and bypasses the protection, but it consumes a lot of balance from my 2captcha account. In 1 minute it cost me $3.
How can I make it consume less? It seems to charge for every request.
My old script, based on cloudscraper, spent only 0.02 to solve the captcha once, but hooman seems to consume a lot on each request.
I'm sorry for my bad English.
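The report does not show whether the result of one captcha solve is reused for later requests. If it is not, one generic way to limit cost is to run the paid step at most once per host and share the result between callers. The sketch below shows only that caching pattern. `createPerHostCache` and `solve` are hypothetical names, the 30-minute lifetime is an arbitrary assumption, and nothing here is hooman's or 2captcha's API.

```javascript
const { URL } = require('url');

// Sketch: run an expensive step (for example a paid captcha solve) at most
// once per hostname within `ttlMs`, and share the result between callers.
// `solve(hostname)` is a hypothetical async function supplied by the caller.
function createPerHostCache(solve, ttlMs = 30 * 60 * 1000) {
  const cache = new Map(); // hostname -> { promise, expires }
  return function getClearance(requestUrl) {
    const { hostname } = new URL(requestUrl);
    const hit = cache.get(hostname);
    // Reuse a pending or finished solve while it is still fresh.
    if (hit && hit.expires > Date.now()) return hit.promise;
    const promise = Promise.resolve(solve(hostname)).catch((error) => {
      cache.delete(hostname); // do not cache failures
      throw error;
    });
    cache.set(hostname, { promise, expires: Date.now() + ttlMs });
    return promise;
  };
}
```

Concurrent requests to the same host share one in-flight promise, so a burst of requests triggers a single solve instead of one per request.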
this is my code
I get error
(node:30584) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag --unhandled-rejections=strict
(see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 611)
// FIXED
I have noticed that hooman's hook that sets the referer for a request sets it to the URL you are requesting. This is okay for sites that don't rigorously check these things, but, for example, I got a 403 response for a POST request that also had a unique token in a query parameter. After looking at the response, I saw that it was because the referer was set to the current request URL.
This is the only instance where it has occurred for me, but it is likely causing some other people's 403 responses, and I wanted to flag it in case you choose to change the behaviour or want to make other people aware.
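A sketch of one alternative, assuming the target site accepts its own origin as the referer: derive the header from the URL's origin, so the path and any query tokens are not echoed back. `refererFor` is a hypothetical helper. Passing `headers` is a standard got option, but whether hooman's referer hook would leave a user-supplied value untouched is not confirmed here.

```javascript
const { URL } = require('url');

// Hypothetical helper: use the site's origin as the referer so that the
// path and query string (including one-time tokens) are left out.
function refererFor(requestUrl) {
  return `${new URL(requestUrl).origin}/`;
}

// Possible usage (unverified against hooman's referer hook):
//   hooman.post(url, { headers: { referer: refererFor(url) } });
```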
Getting error code: 1020 on certain CF protected sites
Test two websites:
https://www.apotea.se/ => works
https://www.shoepalace.com/ => error code 1020
Be able to require web page (http://www.javlibrary.com/)
get an error
const Humanoid = require("@subns/humanoid-js");
let humanoid = new Humanoid();
humanoid.get("http://www.javlibrary.com/")
.then(res => {
console.log(res.body) // ...
})
.catch(err => {
console.error(err)
})
Hello, I don't know what is wrong. I can't bypass hCaptcha; I wait 1-3 minutes and nothing happens.
My example code:
const hooman = require('hooman');
const url = process.argv[2];
// Use the promise returned by hooman.get directly; wrapping it in a
// new Promise that never calls resolve would hang forever.
hooman.get(url, {
  captchaKey: 'key',
})
  .then((response) => {
    console.log(response.body);
  })
  .catch((error) => {
    console.log(error.message);
  });
I need help improving my code to bypass UAM, if that is possible.
I have tried this:
hooman.get({
url: url,
agent: {
https: proxy,
},
cloudflareRetry: 5,
})
.catch((error) => {
console.log(response.body);
but I get an error.
This is my code
//fixed: added captcha and removed the proxy (with the proxy I just get a captcha error)
Hello, really cool stuff here, but since Cloudflare updated their method, this can't work anymore.
Is there any chance you will update it in the near future? Thanks a lot, and have a good day.
I want to use hooman with Python. Unfortunately, I have little experience with JS, and at the moment I don't think it is possible to pass headers (user agent, cookie, ...) to hooman. It would be nice if that were possible in the future.
/fixed
I have updated to the new version and it works.
I can't use it at all.
After some digging, it seems this is related to got, and got says it's related to something else.
Looking at the code, I think it's because got (and this module) is not compatible with the newest Node version.
I am trying to scrape my malt.fr profile, and this appeared to be a good client to bypass its Cloudflare security.
Running hooman.get('https://www.malt.fr') throws the error: connect ECONNREFUSED 127.0.0.1:443
I expect to get a code 200 for this request.
await hooman.get('https://www.malt.fr')
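`ECONNREFUSED 127.0.0.1:443` means the connection was attempted against the local machine. One possible explanation (a guess, not a confirmed diagnosis) is that the hostname resolves to a loopback address on that machine, for example through a hosts-file entry or a filtering DNS server. A small sketch for checking this with Node's built-in `dns` module follows. `isLoopback` and `resolvesToLoopback` are hypothetical helper names.

```javascript
const dns = require('dns');
const net = require('net');

// True when `address` is an IPv4 or IPv6 loopback address.
function isLoopback(address) {
  if (net.isIPv4(address)) return address.startsWith('127.');
  if (net.isIPv6(address)) return address === '::1';
  return false;
}

// Resolve `hostname` through the OS resolver (dns.lookup, which Node's
// HTTP stack uses by default) and report whether it points back at
// this machine.
async function resolvesToLoopback(hostname) {
  const { address } = await dns.promises.lookup(hostname);
  return isLoopback(address);
}

// Example: resolvesToLoopback('www.malt.fr').then(console.log);
```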
With a fresh install of NPM and Node plus latest code here, nothing works. I believe CloudFlare just released a new set of challenge scripts which are much more thorough and similar to Google's reCAPTCHA v3 - e.g. better at detecting real browsers. Now all this module does is dump out the "Checking your browser before accessing ..." page.
jsdom and user-agents bloat this package to 6.2 MB minified.
I suggest making the user agent user supplied and replacing jsdom with cheerio or something like this.
This site has a custom IM_UNDER_ATTACK page:
https://ogusers.com/member.php?action=login
It's not always enabled, but it should be enabled now.
Is there any way to support this?
thank sir.
/fixed
Would be nice to see this in typescript
Hello mate!
I saw this line:
Line 38 in 11d0a13
Isn't it better to use URL? I'm having a problem where it adds the pathname twice in a row, which makes my request fail: /my/pathname/mypathname instead of /my/pathname. I'm not sure this is the right way to do it or if it's a different puzzle...
const { URL } = require('url');
const { origin: baseUrl } = new URL(response.url);
Best regards!
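The doubled path described above can be reproduced without hooman. The sketch below assumes the challenge path is root-relative and uses made-up example URLs; the actual code at the referenced line is not shown in the report. It contrasts string concatenation against the full response URL with resolution against the origin.

```javascript
const { URL } = require('url');

// Assumed scenario: the response URL already contains a path, and the
// follow-up request targets a root-relative path on the same site.
const responseUrl = 'https://example.com/my/pathname';
const challengePath = '/my/pathname';

// Naive concatenation with the full response URL repeats the path.
const naive = responseUrl + challengePath;

// Resolving against the origin keeps a single copy of the path.
const { origin: baseUrl } = new URL(responseUrl);
const resolved = new URL(challengePath, baseUrl).href;

console.log(naive);    // https://example.com/my/pathname/my/pathname
console.log(resolved); // https://example.com/my/pathname
```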
I installed all modules and set it up on a CentOS7 server, but after running node hooman.js - it does nothing. Any suggestions/help?
Here is my code:
const fs = require('fs');
const emojis = require('./emojis.json');
const hooman = require('hooman');

// Download `uri` to `filename`; `callback` receives an error, if any.
const download = function (uri, filename, callback) {
  const file = fs.createWriteStream(filename);
  const source = hooman.stream(uri);
  // pipe() returns the destination stream, so request errors have to be
  // caught on the source stream as well.
  source.on('error', callback);
  file.on('error', callback);
  file.on('finish', function () { file.close(callback); });
  source.pipe(file);
};

for (let i = 0; i < emojis.length; i++) {
  const image = emojis[i].image;
  // Everything after "emoji/" is used as the local file name.
  const nom = image.slice(image.indexOf('emoji/') + 6);
  download(image, './emojis/' + nom, function (err) {
    if (err) console.log(err);
  });
}
I am trying to bypass UAM, but hooman just gets the HTML content and spams my VPS.
Proof: https://prnt.sc/sqd3u9
Tested URL: https://botflare.xyz
It seems that hooman is outdated: it can't get around the new Cloudflare challenge. I tested it on several websites, and none of them could be bypassed, except sometimes when the protection is not active.
Okay, so the new Cloudflare challenge only appears when it detects malicious requests, multiple attempts from the same IP address, a server that is already under attack, etc.
The new challenge checks the 'browser' to see whether you are a real user or a bot. After that you have to get the token for UAM (the cookie), and once you have obtained the Cloudflare cookie, it displays a free captcha that must be completed.
hooman can't handle both at once. I've already tried this on a few URLs; the list is below.
https://cyberwarblog.xyz - bypass failed
https://fatality.win/ - bypass failed
https://botflare.xyz/ - bypass failed
Beware, Cloudflare doesn't always display the captcha request, so sometimes the bypass works, but 90% of the time it doesn't.
One option would be to use puppeteer-extra to get around the new challenge.
An example can be seen here: https://github.com/JimmyLaurent/cloudflare-scraper
const hooman = require('hooman');
(async () => {
try {
const response = await hooman.get('https://sayem.eu.org');
console.log(response.body);
//=> '<!doctype html> ...'
} catch (error) {
console.log(error.response.body);
//=> 'Internal server error ...'
}
})();
but also tried with captcha
print cloudflare page
print html page
(async () => {
try {
const response = await hooman.get('http://www.javlibrary.com/ja');
console.log(response.body);
//=> '<!doctype html> ...'
} catch (error) {
console.log(error.response.body);
//=> 'Internal server error ...'
}
})();
Hello!
I'm getting 403 on all requests when running my code on a server machine. Tested on Windows server and also Ubuntu 18. Running on mac or windows 10 will get good responses.
const got = require("hooman");
got("https://www.grosbasket.com/")
.then(response => {
console.log(response.body);
})
.catch(error => {
console.error(error);
});
What could make the difference between these scenarios?
Thank you!
With the new version my script stopped working. Do we need to update, or is it a bug?
It just spams the console with HTML, and I get the error Response code 403 (Forbidden).
Here is my script:
https://pastebin.com/raw/WAc5LLjS
Hello, we tested, and the proxy function doesn't work with HTTP/2. Also, the user-agent feature uses just one user agent: I'm testing on my own website, and all requests arrive with the POST method and the same user agent.
What about TLS v1.2 / v1.3 and encodings (gzip, br, etc.)?
Just a suggestion: could you also add support for anti-captcha?
hooman can bypass UAM and captcha now; can't it bypass the "banned" protection?
//edit
I see you have updated. Now the proxy function works, but it spams HTML etc., then stops sending requests and stops bypassing.
deleted
Response code 403 (Forbidden)
deleted
Response code 403 (Forbidden)
deleted
Response code 403 (Forbidden)
deleted
Response code 403 (Forbidden)
deleted
Response code 403 (Forbidden)
CF updated their challenge. Will there be an update?
Hi, I want to recommend using the vm library instead, because vm2 does not work properly in an Electron application that uses webpack. There are an awful lot of errors that are not easy to fix. Thanks
Hi @sayem314 ,
thanks for your support and contribution.
I'm trying to access this site from a (central) US IP:
https://streamingcommunity.cafe/
Unfortunately, the GET request fails with an HTTP 403 error (at the moment, a browser does solve the JS challenge).
Can you help me?
Thanks in advance and have a nice day!
Hi,
I'm trying to get html from this site:
https://www.netfreex.club/
but I receive an error. It looks like the challenge can't be solved.
I can see the challenge cf_chl_jschl_tk in the HTML, but this site:
https://ilgeniodellostreaming.tw/?s=casa+di+carta (accessible only from an Italian IP)
uses the same challenge, and there it works.