samons / pyautogui-pdf-downloader

This project forked from zhouhoo/pyautogui-pdf-downloader

Uses pyautogui to automatically download the PDF datasets in the stock trading software Dazhihui (大智慧).

License: MIT License

Python 100.00%

pyautogui-pdf-downloader's Introduction

pyautogui-pdf-downloader

Uses pyautogui to automatically download the PDF datasets in the stock trading software Dazhihui (大智慧).

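Web crawlers are common, but desktop client software usually transfers its data over TCP/IP sockets. Scraping data by going after the client software or its data interface is therefore more troublesome, because it involves reverse-engineering the protocol, among other things.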

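As part of a project, I first needed to scrape data from the stock trading software Dazhihui. Dazhihui does not provide a web-based data request interface, so the only option is to download the client software. To prevent its data from being scraped, it implements its own downloader. The downloader transfers data over sockets and limits the rate of download requests: if downloads go too fast, requests are queued, and the server may even refuse to respond. After a download finishes, the client automatically invokes the system's PDF reader to open the file. All of these default behaviors interfere with automated downloading.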
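One way to stay under a request-rate limit like this is to enforce a minimum gap between downloads. The following is a minimal sketch, not code from this project: the `Pacer` class, its parameters, and the 5-second interval in the usage note are all hypothetical, since the actual limit enforced by the Dazhihui server is not known.

```python
import random
import time


class Pacer:
    """Block until at least `min_interval` seconds have passed since the last call.

    Hypothetical helper: the interval and jitter are assumptions, not
    values taken from the Dazhihui client or server.
    """

    def __init__(self, min_interval, jitter=0.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.jitter = jitter
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no download has started yet

    def wait(self):
        """Sleep if the previous download started too recently."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._clock()
        # Earliest moment the next download may start, with optional
        # random jitter so the requests are not perfectly periodic.
        self._next_allowed = (
            now + self.min_interval + random.uniform(0.0, self.jitter)
        )
```

Calling `pacer.wait()` before each download, with for example `pacer = Pacer(5.0, jitter=2.0)`, keeps successive downloads at least five seconds apart. The right interval would have to be found by experiment.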

Given this analysis, I decided to use the pyautogui library to simulate human operation and download the data automatically.

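The workflow for downloading company announcement data from Dazhihui is: 1. turn the page; 2. click an entry in the list; 3. click the download button; 4. close the PDF reader; 5. close the download page; 6. scroll the mouse; 7. continue browsing the list; 8. repeat steps 1-7.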
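The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the repository's actual script: the coordinates in `POINTS`, the function names, the pause length, and the scroll amount are all hypothetical. The `gui` argument is expected to provide `click(x, y)` and `scroll(clicks)`, which the `pyautogui` module itself does.

```python
import time

# Hypothetical screen coordinates; real values depend on the screen
# resolution and on where the Dazhihui window sits.
POINTS = {
    "next_page": (1200, 950),
    "list_item": (400, 300),
    "download": (900, 200),
    "close_pdf_reader": (1890, 10),
    "close_download_page": (1500, 120),
}


def run_cycle(gui, points, pause, scroll_clicks, sleep):
    """Perform steps 1-6 of the workflow once."""
    gui.click(*points["next_page"])            # 1. turn the page
    gui.click(*points["list_item"])            # 2. click a list entry
    gui.click(*points["download"])             # 3. click the download button
    sleep(pause)                               # wait for the PDF reader to open
    gui.click(*points["close_pdf_reader"])     # 4. close the PDF reader
    gui.click(*points["close_download_page"])  # 5. close the download page
    gui.scroll(scroll_clicks)                  # 6. scroll the list


def download_all(gui, points, cycles, pause=2.0, scroll_clicks=-3,
                 sleep=time.sleep):
    """Steps 7-8: keep going through the list by repeating the cycle."""
    for _ in range(cycles):
        run_cycle(gui, points, pause, scroll_clicks, sleep)
```

With PyAutoGUI installed and the Dazhihui window in the expected position, this would be started with `import pyautogui` followed by `download_all(pyautogui, POINTS, cycles=10)`. Passing the GUI object in as a parameter also lets the sequence be checked with a stand-in object before it touches the real mouse.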

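To get up to speed with pyautogui and finish the download task quickly, this first version implements only the basics: the screen coordinates of the key positions were found beforehand with a separate script and then hard-coded. Later, the key points could be located automatically, for example by taking screenshots and analyzing them.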
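A coordinate-finding helper of the kind mentioned here might look like the sketch below. It is not the author's script: `capture_points`, `locate_or_fallback`, and the image file name in the usage note are hypothetical. The sketch relies on two real PyAutoGUI calls, `position()` and `locateCenterOnScreen()`, and the `gui` argument is expected to be the `pyautogui` module or an object with the same methods.

```python
import time


def capture_points(gui, names, delay=3.0, sleep=time.sleep, report=print):
    """Record the mouse position for each named target.

    For every name, the user gets `delay` seconds to hover the mouse
    over the target before its position is read.
    """
    points = {}
    for name in names:
        report(f"Hover over '{name}' within {delay} seconds...")
        sleep(delay)
        x, y = gui.position()  # current mouse coordinates
        points[name] = (x, y)
        report(f"{name}: ({x}, {y})")
    return points


def locate_or_fallback(gui, image_path, fallback):
    """Find a button by screenshot matching, else use a hard-coded point.

    Depending on the PyAutoGUI version, a failed match either returns
    None or raises an exception, so both cases are handled. Catching
    every exception is a simplification for this sketch.
    """
    try:
        center = gui.locateCenterOnScreen(image_path)
    except Exception:
        center = None
    return tuple(center) if center is not None else fallback
```

For example, `capture_points(pyautogui, ["list_item", "download"])` would produce a dictionary that can be pasted into the download script, and `locate_or_fallback(pyautogui, "download_button.png", (900, 200))` would try image matching first and fall back to the hard-coded point.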

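This small task gave me another approach to draw on for future data scraping. Used well, this tool can automate repetitive, regular software operations, which frees up your hands and improves productivity.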

pyautogui-pdf-downloader's People

Contributors

zhouhoo
