luzonghaoa / mini-spider

This project is forked from zhangyunhao116/mini-spider

A simple, practical web crawler tool: create your own crawler in just four steps!

License: Other

Python 99.36% Shell 0.64%

mini-spider's Introduction

Mini-Spider

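Mini-Spider is a practical web crawler tool. Its purpose is to get you the resources you want quickly, without having to worry about crawler construction, data storage, network conditions, language implementation, and so on. With just a few simple commands, you can create a crawler and complete your task!

With mini-spider, you need only two steps to create your own crawler! (Most of the time.)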

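Features

  • Automatically extracts resources from web pages and classifies them algorithmically (including full URLs and the contents of all HTML tags)
  • Automatically generates extractors based on the resources
  • Custom extractors and Host data
  • Automatically adds extracted content to the corresponding database
  • Automatic categorized downloads with resumable transfers
  • Database import and export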
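
To make the first feature concrete, here is a minimal sketch of what "extract resources from a page and classify them" can look like using only the Python standard library. It is not mini-spider's actual implementation: the names `LinkCollector`, `classify_by_extension`, and `extract_resources` are hypothetical, and grouping by file extension is an assumed stand-in for whatever classification algorithm the project really uses.

```python
from collections import defaultdict
from html.parser import HTMLParser
from os.path import splitext
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect absolute URLs from href/src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page URL.
                self.urls.append(urljoin(self.base_url, value))


def classify_by_extension(urls):
    """Group URLs by the lowercased file extension of their path ('' if none)."""
    groups = defaultdict(list)
    for url in urls:
        ext = splitext(urlparse(url).path)[1].lower()
        groups[ext].append(url)
    return dict(groups)


def extract_resources(html, base_url):
    """Parse an HTML string and return its links grouped by extension."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return classify_by_extension(collector.urls)


if __name__ == "__main__":
    page = '<a href="/post/1.html">post</a><img src="img/a.JPG">'
    for ext, found in extract_resources(page, "http://example.com/").items():
        print(ext or "(none)", found)
```

Grouping by extension is the simplest possible classifier; it is enough to separate images from pages before deciding what to download.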

Simply put, you need only a few commands to crawl the resources you want!

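Installation

Before installing, note:

  • Depends only on Python 3.x; not compatible with Python 2.x.

  • This project requires no third-party dependencies.

Download the whole project, change into its directory, and run the following in a terminal: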

$ python3 setup.py install

Alternatively, install it with pip:

$ pip3 install mini-spider

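How to use

Example

This example creates a crawler with three commands, then completes the whole task with two more.

Example goal: extract all images posted by the author here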

Current version

Ver 0.0.4: basic functionality testing stage.

mini-spider's People

Contributors

zhangyunhao116, zyunh
