
1414044032 / sina_spider




sina_spider's Introduction

Sina_Spider

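A Sina Weibo crawler based on Python + Selenium. After a simulated login it saves the cookie, so the logged-in state is preserved. Enter a keyword to crawl the popular Weibo posts related to that keyword.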

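Environment and tools:

Python 3.6 + selenium + the Firefox driver (geckodriver).
Firefox driver download: https://pan.baidu.com/s/1WGo7kVGsfRlE2XFvQRPHJA https://github.com/mozilla/geckodriver/releases
Make sure the driver version matches your browser version.
After downloading the driver, you can place it in the C:\Python36\Scripts directory. Otherwise you need to configure the environment variable by adding the driver's directory to Path.
Firefox itself must be installed: download it from the official website.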
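Before launching the crawler it can help to confirm that the driver is actually discoverable on Path. The following is a minimal sketch using only the standard library; the helper name `find_driver` is hypothetical and not part of this repository, and the executable name `geckodriver` is an assumption based on the download link above.

```python
import shutil


def find_driver(name="geckodriver"):
    """Return the full path of the driver executable if it is on PATH, else None."""
    # shutil.which searches the directories listed in the PATH environment
    # variable (and, on Windows, also tries the extensions in PATHEXT).
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("Driver not found: put it in C:\\Python36\\Scripts or add its directory to Path.")
    else:
        print("Driver found at:", path)
```

If the script reports that the driver was not found, Selenium will most likely be unable to start Firefox either.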

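In main, change the account and password to your own. Watch the browser window that opens during login to see whether a CAPTCHA appears. In testing, logging in with an email address usually does not trigger a CAPTCHA; logging in with a phone number does, and so does logging in from an unfamiliar location. If a CAPTCHA appears, add time.sleep(20) before driver.find_element_by_css_selector("div.info_list:nth-child(6) > a:nth-child(1)").click() to pause the driver so that you can type the CAPTCHA by hand (within 20 seconds). After that the cookie can be obtained normally. The obtained cookie is saved as a txt file in the same directory, so later runs do not need to repeat the simulated login.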
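The save-and-reuse step described above could be sketched roughly as follows. This is not the repository's actual code: the file name `cookies.txt`, the JSON storage format, and the helper names `save_cookies` and `load_cookies` are all assumptions made for illustration. The only Selenium call relied on is `driver.get_cookies()`, which returns a list of dictionaries.

```python
import json
from pathlib import Path

COOKIE_FILE = Path("cookies.txt")  # hypothetical file name, same directory as the script


def save_cookies(driver, path=COOKIE_FILE):
    """Write the browser session's cookies to a text file as JSON.

    Returns the number of cookies written.
    """
    cookies = driver.get_cookies()  # Selenium WebDriver: list of cookie dicts
    Path(path).write_text(json.dumps(cookies), encoding="utf-8")
    return len(cookies)


def load_cookies(path=COOKIE_FILE):
    """Return the saved cookie list, or None if no usable file exists."""
    p = Path(path)
    if not p.exists():
        return None
    try:
        return json.loads(p.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        # A corrupt or empty file is treated as "no saved login".
        return None
```

A caller would first try `load_cookies()` and fall back to the simulated login, followed by `save_cookies(driver)`, only when it returns None. Saved cookies eventually expire, so the fallback path is still needed.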

sina_spider's People

Contributors

1414044032


sina_spider's Issues

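Question

Very impressive!
The cookie obtained after a successful simulated login with selenium seems to fail when it is placed in a requests.get() request. How can this be solved?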
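One possible cause, offered as an assumption rather than a confirmed diagnosis for this repository, is that Selenium returns cookies as a list of dictionaries, which cannot be handed to requests unchanged, and that the request is sent with a different User-Agent than the browser used to log in. The sketch below builds plain request headers from the Selenium cookie list; the helper name `build_headers` is hypothetical.

```python
def build_headers(selenium_cookies, user_agent):
    """Build HTTP request headers that carry the browser's cookies.

    selenium_cookies: list of dicts as returned by driver.get_cookies(),
                      each with at least "name" and "value" keys.
    user_agent:       the User-Agent string of the browser that logged in.
    """
    # Join the cookies into a single "name=value; name2=value2" header value.
    cookie_header = "; ".join(
        "{}={}".format(c["name"], c["value"]) for c in selenium_cookies
    )
    return {"Cookie": cookie_header, "User-Agent": user_agent}
```

The result can then be passed as `requests.get(url, headers=build_headers(driver.get_cookies(), ua))`, where `ua` could be read from the live browser with `driver.execute_script("return navigator.userAgent")`. Whether this is enough depends on the site's other checks.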
