antispider's People

Contributors

asyncins

antispider's Issues

Erratum: page 99

On page 99 of the book, print('This request is fial') misspells the English word for failure: it should be fail.

pyteer.py on page 16 of the book fails at runtime

Running the code from page 16 of the book produces the following error:
pyppeteer.errors.BrowserError: Browser closed unexpectedly:
[0310/181827.222752:ERROR:zygote_host_impl_linux.cc(89)] Running as root without --no-sandbox is not supported. See https://crbug.com/638180.
System: Ubuntu 18.04.4 LTS

Fix: change the relevant line from
browser = await launch()
to
browser = await launch({'args': ['--no-sandbox']})
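For context, here is a minimal sketch of how that launch call might sit inside a complete script. The helper names build_launch_options and fetch_title are hypothetical and not from the book, and the script assumes pyppeteer and its Chromium download are installed. Chromium refuses to start as root with the sandbox enabled, so the flag is added only in that case.

```python
def build_launch_options(is_root: bool) -> dict:
    """Return pyppeteer launch options (hypothetical helper).

    The Chromium sandbox is disabled only when running as root, which is
    the situation the error message above complains about.
    """
    args = ["--no-sandbox"] if is_root else []
    return {"args": args}


async def fetch_title(url: str, is_root: bool) -> str:
    """Open url in headless Chromium and return the page title."""
    # Imported lazily so the option helper works without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch(build_launch_options(is_root))
    try:
        page = await browser.newPage()
        await page.goto(url)
        return await page.title()
    finally:
        # Always close the browser, even if navigation fails.
        await browser.close()
```

It could be run with asyncio.run(fetch_title(some_url, is_root=True)), where some_url is whatever page the book's example visits. Disabling the sandbox weakens Chromium's isolation, so running as a non-root user is the safer alternative where that is possible.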

Is there complete HTML code for section 9.3.2, implementing a slider CAPTCHA?

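I typed the code below by following your book. I wanted to upload it as a txt file, but the system reported an error. The HTML code from earlier in chapter 9 did not work either; I had to modify it myself to get it working properly. This one is too complex for me to fix.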

<title>实现滑动验证码</title>
<style>
.tracks{
    /* slider track style */
    width:390px;
    height:40px;
    background: #d0c4fe;
    overflow: hidden;
    border: 1px solid #c5c5c5;
    border-radius: 4px;
    text-align: center;
}
.hover{
    /* slider block style */
    left: 0px;
    position: absolute;
    margin-left: 16px;
    width: 50px;
    height: 38px;
    background: #ad99ff;
    text-align: center;
    line-height: 38px;
}
.hover:hover{
    background: #fff;
}
.slidertips{
    /* hint message style */
    height: 38px;
    line-height: 38px;
    color:#fff;
    visibility: hidden;
}
</style>
<script>
$(function(){
    var tracks=document.getElementById('tracks'),
        sliderblock=document.getElementById('sliderblock'),
        slidertips=document.getElementById('slidertips');
})
// slider block width
var sliderblockWidth=$('#sliderblock').width();
// slider track length
var tracksWidth=$('#tracks').width();
var mousemove=false; // mousedown state
sliderblock.addEventListener('mousedown',function(e){
    // listen for mousedown and record the slider's starting position
    mousemove=true;
    startCoordinateX=e.clientX // slider starting position
})
var distanceCoordianteX=0; // slider starting position
tracks.addEventListener('mousemove',function(e){
    // listen for mouse movement
    if(mousemove){ // only track movement after the slider has been clicked
        distanceCoordianteX=e.clientX-startCoordinateX; // current slider position
        if(distanceCoordianteX>tracksWidth-sliderblockWidth){
            // limit the displacement so the slider cannot leave the track on the right
            distanceCoordianteX=tracksWidth-sliderblockWidth;
        }else if(distanceCoordianteX<0){
            // limit the displacement so the slider cannot leave the track on the left
            distanceCoordianteX=0;
        }
        // position the slider according to the distance moved
        sliderblock.style.left=distanceCoordianteX+'px';
    }
})
sliderblock.addEventListener('mouseup',function(e){
    // releasing the mouse counts as finishing the slide: record the current position and call the verification function
    var endCoordinateX=e.clientX;
    verifySliderRetuls(endCoordinateX);
})
function verifySliderRetuls(endCoordinateX){ // verify the slide result
    mousemove=false; // the mouse has been released, so stop the slider from following it
    // allow an error of 3 pixels
    if(Math.abs(endCoordinateX-startCoordinateX-tracksWidth)
>>
Verification passed!
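The posted code is cut off in the middle of the tolerance check, so the rest of the condition is unknown. As a rough illustration only, the clamping and "within 3 pixels" logic described in its comments can be sketched in Python. The function names are hypothetical, and treating success as "the slider reached the right end of the track" is an assumption, not something the truncated code confirms.

```python
def clamp_offset(distance, track_width, slider_width):
    """Keep the slider offset inside the track, as the mousemove handler does."""
    max_offset = track_width - slider_width
    return max(0, min(distance, max_offset))


def is_slide_complete(start_x, end_x, track_width, slider_width, tolerance=3):
    """Return True if the slider ended within `tolerance` px of the right end.

    Assumption: "complete" means the block reached the end of the track.
    The original condition is truncated, so it may differ from this.
    """
    offset = clamp_offset(end_x - start_x, track_width, slider_width)
    return abs(offset - (track_width - slider_width)) <= tolerance
```

With the 390px track and 50px block from the stylesheet above, the furthest the block can travel is 340px, so a drag of 338px would count as complete under this assumption and a drag of 300px would not.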

About XPath and CSS selectors

Chapter 4, information verification and anti-crawling, Postman example

Cookie-based anti-crawling

`import requests
from lxml import etree

url = 'http://www.porters.vip/verify/cookie/content.html'
resp = requests.get(url)
if resp.status_code == 200:
    html = etree.HTML(resp.text)
    res = html.cssselect('.page-header h1')  # ①
    print(res)
else:
    print('This request is fail.')`

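At ①, a CSS selector is used, and a cookie must be set for any content to be returned. I did not add a cookie, yet after switching to XPath (html.xpath('//h1/text()')) I was able to scrape the title. Why? Do XPath and CSS selectors work differently when it comes to redirects?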
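One thing worth checking: both selector types run on HTML that has already been downloaded, so neither plays any part in redirects. The two expressions in the question are also not equivalent. '.page-header h1' only matches an h1 inside an element with class page-header, while '//h1' matches any h1 on whatever page came back. The sketch below illustrates this with two made-up HTML snippets and the standard library's ElementTree rather than lxml. What the site actually returns without a cookie is an assumption here, not something verified.

```python
import xml.etree.ElementTree as ET

# Hypothetical responses: the real pages served by the site may differ.
WITH_COOKIE = (
    "<html><body><div class='page-header'>"
    "<h1>Article title</h1></div></body></html>"
)
WITHOUT_COOKIE = "<html><body><h1>Landing page</h1></body></html>"


def select_titles(markup):
    """Return (narrow, broad) h1 texts for one parsed document."""
    root = ET.fromstring(markup)
    # Rough equivalent of the CSS selector '.page-header h1'
    narrow = [h.text for h in root.findall(".//div[@class='page-header']//h1")]
    # Equivalent of the XPath '//h1': any h1 anywhere in the document
    broad = [h.text for h in root.findall(".//h1")]
    return narrow, broad


print(select_titles(WITH_COOKIE))
print(select_titles(WITHOUT_COOKIE))
```

If the cookie-less response is a different page that still contains an h1, the broad expression will return that page's heading while the narrow one returns nothing. Printing resp.url and resp.history from the requests response would show whether a redirect actually took place.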

Splash stays stuck at Initializing...

After starting Splash with docker and opening http://localhost:8050, I enter the link https://www.baidu.com in the render box and click the button, and it stays at Initializing... indefinitely.

2020-11-05 12:11:24+0000 [-] Log opened.
2020-11-05 12:11:24.204062 [-] Splash version: 3.3.1
2020-11-05 12:11:24.204543 [-] Qt 5.9.1, PyQt 5.9.2, WebKit 602.1, sip 4.19.4, Twisted 18.9.0, Lua 5.2
2020-11-05 12:11:24.204648 [-] Python 3.5.2 (default, Nov 12 2018, 13:43:14) [GCC 5.4.0 20160609]
2020-11-05 12:11:24.204834 [-] Open files limit: 1048576
2020-11-05 12:11:24.204923 [-] Can't bump open files limit
2020-11-05 12:11:24.308447 [-] Xvfb is started: ['Xvfb', ':639602014', '-screen', '0', '1024x768x24', '-nolisten', 'tcp']
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
2020-11-05 12:11:24.392224 [-] proxy profiles support is enabled, proxy profiles path: /etc/splash/proxy-profiles
2020-11-05 12:11:24.392458 [-] memory cache: enabled, private mode: enabled, js cross-domain access: disabled
2020-11-05 12:11:24.525890 [-] verbosity=1, slots=20, argument_cache_max_entries=500, max-timeout=90.0
2020-11-05 12:11:24.527004 [-] Web UI: enabled, Lua: enabled (sandbox: enabled)
2020-11-05 12:11:24.527598 [-] Site starting on 8050
2020-11-05 12:11:24.527729 [-] Starting factory <twisted.web.server.Site object at 0x7f53dff16cc0>
2020-11-05 12:11:24.528053 [-] Server listening on http://0.0.0.0:8050
2020-11-05 12:11:46.112275 [-] "172.17.0.1" - - [05/Nov/2020:12:11:45 +0000] "GET / HTTP/1.1" 200 7679 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:82.0) Gecko/20100101 Firefox/82.0"
2020-11-05 12:11:46.152861 [-] "172.17.0.1" - - [05/Nov/2020:12:11:45 +0000] "GET /_ui/style.css HTTP/1.1" 200 2591 "http://localhost:8050/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:82.0) Gecko/20100101 Firefox/82.0"
2020-11-05 12:11:51.300247 [-] "172.17.0.1" - - [05/Nov/2020:12:11:50 +0000] "GET /info?wait=0.5&images=1&expand=1&timeout=90.0&url=http%3A%2F%2Fgoogle.com&lua_source= HTTP/1.1" 200 5320 "http://localhost:8050/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:82.0) Gecko/20100101 Firefox/82.0"
2020-11-05 12:11:54.908156 [-] "172.17.0.1" - - [05/Nov/2020:12:11:54 +0000] "GET /favicon.ico HTTP/1.1" 404 153 "http://localhost:8050/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:82.0) Gecko/20100101 Firefox/82.0"
2020-11-05 12:12:05.672961 [-] "172.17.0.1" - - [05/Nov/2020:12:12:05 +0000] "GET /info?wait=0.5&images=1&expand=1&timeout=90.0&url=http%3A%2F%2Fwww.baidu.com&lua_source= HTTP/1.1" 200 5329 "http://localhost:8050/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:82.0) Gecko/20100101 Firefox/82.0"
2020-11-05 12:13:05.673362 [-] Timing out client: IPv4Address(type='TCP', host='172.17.0.1', port=51986)

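I tried both versions 3.3.1 and 3.5 and hit the same problem. I also tried CentOS 7.8 and Ubuntu 20.04/18 as the operating system, with the same result. I'm at my wit's end; could someone please give me some pointers?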
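One way to narrow this down might be to bypass the web UI and call Splash's render.html HTTP endpoint directly. If that also hangs, the problem is more likely in the container's rendering or network access than in the UI page. The sketch below assumes Splash is reachable at http://localhost:8050 as in the log. The helper names and the 30-second timeout are illustrative choices.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_render_url(target, splash="http://localhost:8050", wait=0.5, timeout=30.0):
    """Build a render.html request URL for a Splash instance."""
    query = urlencode({"url": target, "wait": wait, "timeout": timeout})
    return f"{splash}/render.html?{query}"


def fetch_rendered_html(target, **kwargs):
    """Ask Splash to render `target` and return the resulting HTML.

    Raises an exception on HTTP errors or if Splash does not answer in time.
    """
    with urlopen(build_render_url(target, **kwargs), timeout=60) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling fetch_rendered_html('https://www.baidu.com') from the host should either return HTML or fail with an error that is easier to search for than a stalled UI.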
