This project forked from hakluke/hakrawler
License: MIT License

hakrawler


What is it?

hakrawler is a Go web crawler designed for easy, quick discovery of endpoints and assets within a web application. It can be used to discover:

  • Forms
  • Endpoints
  • Subdomains
  • Related domains
  • JavaScript files

The tool is designed to be easily chained with other tools, such as subdomain enumeration tools and vulnerability scanners, for example:

assetfinder target.com | hakrawler | some-xss-scanner

Features

  • Unlimited, fast web crawling for endpoint discovery
  • Fuzzy matching for domain discovery
  • robots.txt parsing
  • sitemap.xml parsing
  • Plain output for easy parsing into other tools
  • Accept domains from stdin for easier tool chaining
  • SQLMap-friendly output format
  • Link gathering from JavaScript files
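The "link gathering from JavaScript files" feature is credited below to LinkFinder-style regexes. As a rough illustration of the idea (not hakrawler's actual pattern, which is far more thorough), a minimal Go sketch might pull quoted paths and URLs out of a script body like this:

```go
package main

import (
	"fmt"
	"regexp"
)

// A much-simplified, LinkFinder-style pattern: quoted strings that look
// like absolute paths or http(s) URLs. Hypothetical, for illustration only.
var linkRe = regexp.MustCompile(`["'](https?://[^"']+|/[A-Za-z0-9_./-]+)["']`)

// extractLinks returns every path- or URL-like quoted string found in js.
func extractLinks(js string) []string {
	var links []string
	for _, m := range linkRe.FindAllStringSubmatch(js, -1) {
		links = append(links, m[1])
	}
	return links
}

func main() {
	js := `fetch("/api/v1/users"); var cdn = "https://cdn.example.com/app.js";`
	for _, l := range extractLinks(js) {
		fmt.Println(l)
	}
}
```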

Upcoming features

Contributors

  • hakluke wrote the tool
  • cablej cleaned up the code
  • Corben Leo added in functionality to pull links from JavaScript files
  • delic made the code much cleaner
  • hoenn made the code even cleanerer
  • ameenmaali made a bunch of code improvements and bug fixes
  • daehee added the -nocolor flag
  • robre added the -insecure flag

Thanks

  • codingo and prodigysml/sml555, my favourite people to hack with. A constant source of ideas and inspiration. They also provided beta testing and a sounding board for this tool in development.
  • tomnomnom who wrote waybackurls, which powers the wayback part of this tool
  • s0md3v who wrote photon, which I took ideas from to create this tool
  • The folks from gocolly, the library which powers the crawler engine
  • oxffaa, who wrote a very efficient sitemap.xml parser which is used in this tool
  • The contributors of LinkFinder where some awesome regex was stolen to parse links from JavaScript files.

Installation

  1. Install Golang
  2. Run the command below
go get github.com/hakluke/hakrawler
  3. Run hakrawler from your Go bin directory. For Linux systems it will likely be:
~/go/bin/hakrawler

Note that if you need to do this, you probably want to add your Go bin directory to your $PATH to make things easier!

Usage

Note: multiple domains can be crawled by piping them into hakrawler from stdin. If only a single domain is being crawled, it can be added by using the -url flag.

$ hakrawler -h
Usage of hakrawler:
  -all
    	Include everything in output - this is the default, so this option is superfluous (default true)
  -auth string
    	The value of this will be included as an Authorization header
  -cookie string
    	The value of this will be included as a Cookie header
  -depth int
    	Maximum depth to crawl, the default is 1. Anything above 1 will include URLs from robots, sitemap, waybackurls and the initial crawler as a seed. Higher numbers take longer but yield more results. (default 1)
  -forms
    	Include form actions in output
  -js
    	Include links to utilised JavaScript files
  -linkfinder
    	Run linkfinder on javascript files.
  -outdir string
    	Directory to save discovered raw HTTP requests
  -plain
    	Don't use colours or print the banners to allow for easier parsing
  -robots
    	Include robots.txt entries in output
  -scope string
    	Scope to include:
    	strict = specified domain only
    	www  = specified domain and "www" subdomain
    	subs = specified domain and subdomains
    	yolo = everything (default "subs")
  -sitemap
    	Include sitemap.xml entries in output
  -subs
    	Include subdomains in output
  -url string
    	The url that you wish to crawl, e.g. google.com or https://example.com. Schema defaults to http
  -urls
    	Include URLs in output
  -usewayback
    	Query wayback machine for URLs and add them as seeds for the crawler
  -v	Display version and exit
  -wayback
    	Include wayback machine entries in output
  -insecure
    	Ignore SSL verification
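The -scope modes above (strict/www/subs/yolo) amount to a hostname check against the target domain. A hedged Go sketch of that check (a hypothetical helper, not hakrawler's actual implementation) might be:

```go
package main

import (
	"fmt"
	"strings"
)

// inScope reports whether host falls within the given scope mode for
// domain, following the -scope semantics described in the help text.
func inScope(scope, domain, host string) bool {
	switch scope {
	case "strict": // specified domain only
		return host == domain
	case "www": // specified domain and "www" subdomain
		return host == domain || host == "www."+domain
	case "subs": // specified domain and subdomains (the default)
		return host == domain || strings.HasSuffix(host, "."+domain)
	case "yolo": // everything
		return true
	}
	return false
}

func main() {
	fmt.Println(inScope("subs", "example.com", "api.example.com"))   // true
	fmt.Println(inScope("strict", "example.com", "www.example.com")) // false
}
```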

Basic Example

Command: hakrawler -url bugcrowd.com -depth 1

(sample output screenshot)

