
spider's Introduction

spider

just a spider

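0×01 Description: I wrote this small tool to support information security assessments by collecting sensitive addresses quickly (the original goal was to crawl API addresses). It has two simple features: directory scanning and URL crawling.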

0×02 Usage:

    python spider.py -u url -s api -o output.txt -t thread_number    # crawler mode
    python spider.py -u url -s dir -f dict.txt -o output.txt         # directory-scan mode
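The flags above could be parsed roughly as follows. This is only a sketch in Python 3 using `argparse`: the flag letters come from the usage lines, while the `dest` names, the default values, and the `build_parser` helper are assumptions made for illustration and may not match spider.py.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in the usage lines (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="spider.py")
    parser.add_argument("-u", dest="url", required=True, help="target URL")
    parser.add_argument("-s", dest="mode", required=True, choices=["api", "dir"],
                        help="api = crawl URLs, dir = directory scan")
    # The defaults below are assumptions, not values taken from the tool.
    parser.add_argument("-o", dest="output", default="output.txt", help="output file")
    parser.add_argument("-t", dest="threads", type=int, default=10,
                        help="thread count (crawler mode)")
    parser.add_argument("-f", dest="dict_file", help="wordlist (directory-scan mode)")
    return parser


args = build_parser().parse_args(
    ["-u", "http://www.example.com", "-s", "api", "-o", "output.txt", "-t", "8"]
)
print(args.mode, args.threads)
```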

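0×03 Notes on selected functions:

Deduplication: prevents the same page from being crawled twice because of a trailing slash or an anchor (http://www.example.com, http://www.example.com/, http://www.example.com/index.html#xxoo)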
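A minimal sketch of this kind of normalization, assuming Python 3 and the standard `urllib.parse` module. The function name `normalize` is hypothetical, and the tool's own implementation is not reproduced here, so treat this as one possible approach rather than the author's code.

```python
from urllib.parse import urldefrag, urlsplit, urlunsplit


def normalize(url):
    """Return a canonical form of url so equivalent addresses compare equal."""
    # Drop the "#anchor" part; it never changes which page is fetched.
    url, _fragment = urldefrag(url)
    parts = urlsplit(url)
    # Treat "http://host" and "http://host/" (and "/a" vs "/a/") as the same page.
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))


print(normalize("http://www.example.com/"))                 # http://www.example.com
print(normalize("http://www.example.com/index.html#xxoo"))  # http://www.example.com/index.html
```

Storing only the normalized form in the set of visited addresses is what prevents the repeat visits.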

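Crawl rules: the first rule cannot pick up addresses inside page comments (), and the second cannot pick up relative paths or addresses such as php?id=. The two rules are therefore combined, image and video addresses are excluded, and the results are deduplicated at the end.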
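The tool's actual patterns are not reproduced in this text, so the sketch below uses stand-ins with the same complementary weaknesses: an attribute parser (which skips comments) plus a plain-text regex for absolute URLs (which misses relative paths). All names, the extension list, and the regex are assumptions for illustration.

```python
import re
from html.parser import HTMLParser

# Assumed list of media extensions to skip.
MEDIA_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".bmp", ".ico",
                    ".mp4", ".avi", ".flv", ".mp3")
# Stand-in for the second rule: absolute URLs anywhere in the raw text.
ABSOLUTE_URL = re.compile(r"https?://[^\s\"'<>]+")


class AttributeLinkParser(HTMLParser):
    """Stand-in for the first rule: href/src values of real tags (not comments)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_links(html):
    """Combine both rules, drop media files, and deduplicate while keeping order."""
    parser = AttributeLinkParser()
    parser.feed(html)
    candidates = parser.links + ABSOLUTE_URL.findall(html)
    seen, result = set(), []
    for link in candidates:
        # Compare the extension on the path only, ignoring query and anchor.
        path = link.split("?", 1)[0].split("#", 1)[0].lower()
        if path.endswith(MEDIA_EXTENSIONS) or link in seen:
            continue
        seen.add(link)
        result.append(link)
    return result
```

Relative results such as `/api/user.php?id=1` still need to be resolved against the page URL before they can be requested.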

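Address completion: resolves relative addresses to absolute ones, keeps the crawl in scope (subdomains may be crawled; all other addresses are excluded), and verifies that each address is reachable.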
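These three steps can be sketched with the Python 3 standard library. The function names, the `root_domain` parameter (for example `example.com`, supplied by the caller), the 5-second timeout, and the "status below 400" test are all assumptions; the original code may check reachability differently.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin, urlsplit


def resolve(base_url, link):
    """Turn a relative link found on base_url into an absolute URL."""
    return urljoin(base_url, link)


def in_scope(url, root_domain):
    """Accept root_domain itself and any of its subdomains, nothing else."""
    host = (urlsplit(url).hostname or "").lower()
    return host == root_domain or host.endswith("." + root_domain)


def is_reachable(url, timeout=5):
    """Return True if the URL answers with a non-error status (performs a real request)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # HTTP errors, DNS failures, timeouts, and malformed URLs all count as unreachable.
        return False


url = resolve("http://www.example.com/a/index.html", "../api/v1")
print(url, in_scope(url, "example.com"))  # http://www.example.com/api/v1 True
```

Matching on `"." + root_domain` rather than a bare suffix keeps hosts such as `badexample.com` out of scope.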

Address pool
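The text gives no details about the pool, so the following is only one plausible shape for it: a thread-safe structure that remembers every address it has seen and hands out the ones still waiting to be crawled. The class name `AddressPool` and its methods are hypothetical.

```python
import threading
from collections import deque


class AddressPool:
    """Hypothetical pool: a set of seen addresses plus a queue of pending ones."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = set()
        self._pending = deque()

    def add(self, url):
        """Queue url unless it was seen before; return True if it was new."""
        with self._lock:
            if url in self._seen:
                return False
            self._seen.add(url)
            self._pending.append(url)
            return True

    def take(self):
        """Return the next pending address, or None when the queue is empty."""
        with self._lock:
            return self._pending.popleft() if self._pending else None


pool = AddressPool()
pool.add("http://www.example.com/api")
pool.add("http://www.example.com/api")  # ignored, already seen
print(pool.take())                      # http://www.example.com/api
```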

Crawl function: it uses multiple threads but is still fairly slow. The output is the list of addresses that have been crawled.
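A multi-threaded crawl loop might look like the sketch below. It uses `concurrent.futures.ThreadPoolExecutor`, which is a choice made for this example; the text only says that threads are used. `fetch_links` is a hypothetical callable that downloads one page and returns the in-scope links found on it, and `max_pages` is an assumed safety limit.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl(start_url, fetch_links, max_workers=8, max_pages=100):
    """Breadth-first crawl; each level of the frontier is fetched in parallel."""
    seen = {start_url}
    frontier = [start_url]
    crawled = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        while frontier and len(crawled) < max_pages:
            limit = max_pages - len(crawled)
            batch, frontier = frontier[:limit], frontier[limit:]
            # map() runs fetch_links concurrently and keeps the batch order.
            results = list(executor.map(fetch_links, batch))
            crawled.extend(batch)
            for links in results:
                for link in links:
                    if link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return crawled


# Usage with an in-memory stand-in for the network:
site = {"a": ["b", "c"], "b": ["a", "d"], "c": [], "d": []}
print(crawl("a", lambda url: site.get(url, [])))  # ['a', 'b', 'c', 'd']
```

Because each level waits for its slowest page, one unresponsive host can stall a whole batch, which may be part of why a threaded crawler still feels slow.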

Directory scanning and output to file
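As a rough illustration of directory scanning, the sketch below joins each wordlist entry to the base URL, keeps the candidates that a probe accepts, and writes them one per line. The function names are hypothetical, and `probe` is an assumed callable (for example, an HTTP request that returns True for an existing path); how spider.py decides that a path exists is not stated in the text.

```python
def scan_directories(base_url, words, probe):
    """Return the candidate URLs built from words for which probe(url) is true."""
    base = base_url if base_url.endswith("/") else base_url + "/"
    found = []
    for word in words:
        # Wordlist lines usually carry a newline and sometimes a leading slash.
        word = word.strip().lstrip("/")
        if not word:
            continue
        candidate = base + word
        if probe(candidate):
            found.append(candidate)
    return found


def write_results(urls, output_path):
    """Write one URL per line to output_path."""
    with open(output_path, "w", encoding="utf-8") as handle:
        for url in urls:
            handle.write(url + "\n")
```

In practice `words` would be the open file given with `-f`, and `output_path` the value given with `-o`.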

spider's People

Contributors

silience
