meg's Introduction

meg

meg is a tool for fetching lots of URLs but still being 'nice' to servers.

It can be used to fetch many paths for many hosts; fetching one path for all hosts before moving on to the next path and repeating.

You get lots of results quickly, but none of the individual hosts gets flooded with traffic.

Install

meg is written in Go and has no run-time dependencies. If you have Go 1.9 or later installed and configured you can install meg with go install:

▶ go install github.com/tomnomnom/meg@latest

Or download a binary and put it somewhere in your $PATH (e.g. in /usr/bin/).

Install Errors

If you see an error like this it means your version of Go is too old:

# github.com/tomnomnom/rawhttp
/root/go/src/github.com/tomnomnom/rawhttp/request.go:102: u.Hostname undefined (type *url.URL has no field or method Hostname)
/root/go/src/github.com/tomnomnom/rawhttp/request.go:103: u.Port undefined (type *url.URL has no field or method Port)
/root/go/src/github.com/tomnomnom/rawhttp/request.go:259: undefined: x509.SystemCertPool

You should either update your version of Go, or use a binary release for your platform.

Basic Usage

Given a file full of paths:

/robots.txt
/.well-known/security.txt
/package.json

And a file full of hosts (with a protocol):

http://example.com
https://example.com
http://example.net

meg will request each path for every host:

▶ meg --verbose paths hosts
out/example.com/45ed6f717d44385c5e9c539b0ad8dc71771780e0 http://example.com/robots.txt (404 Not Found)
out/example.com/61ac5fbb9d3dd054006ae82630b045ba730d8618 https://example.com/robots.txt (404 Not Found)
out/example.net/1432c16b671043271eab84111242b1fe2a28eb98 http://example.net/robots.txt (404 Not Found)
out/example.net/61deaa4fa10a6f601adb74519a900f1f0eca38b7 http://example.net/.well-known/security.txt (404 Not Found)
out/example.com/20bc94a296f17ce7a4e2daa2946d0dc12128b3f1 http://example.com/.well-known/security.txt (404 Not Found)
...

And save the output in a directory called ./out:

▶ head -n 20 ./out/example.com/45ed6f717d44385c5e9c539b0ad8dc71771780e0
http://example.com/robots.txt

> GET /robots.txt HTTP/1.1
> Host: example.com

< HTTP/1.1 404 Not Found
< Expires: Sat, 06 Jan 2018 01:05:38 GMT
< Server: ECS (lga/13A2)
< Accept-Ranges: bytes
< Cache-Control: max-age=604800
< Content-Type: text/html
< Content-Length: 1270
< Date: Sat, 30 Dec 2017 01:05:38 GMT
< Last-Modified: Sun, 24 Dec 2017 06:53:36 GMT
< X-Cache: 404-HIT

<!doctype html>
<html>
<head>

Without any arguments, meg will read paths from a file called ./paths, and hosts from a file called ./hosts. It prints nothing to the terminal:

▶ meg
▶

But it will save an index file to ./out/index:

▶ head -n 2 ./out/index
out/example.com/538565d7ab544bc3bec5b2f0296783aaec25e756 http://example.com/package.json (404 Not Found)
out/example.com/20bc94a296f17ce7a4e2daa2946d0dc12128b3f1 http://example.com/.well-known/security.txt (404 Not Found)

You can use the index file to find where the response is stored, but it's often easier to find what you're looking for with grep:

▶ grep -Hnri '< Server:' out/
out/example.com/61ac5fbb9d3dd054006ae82630b045ba730d8618:14:< Server: ECS (lga/13A2)
out/example.com/bd8d9f4c470ffa0e6ec8cfa8ba1c51d62289b6dd:16:< Server: ECS (lga/13A3)

If you want to request just one path, you can specify it directly as an argument:

▶ meg /admin.php

Detailed Usage

meg's help output tries to actually be helpful:

▶ meg --help
Request many paths for many hosts

Usage:
  meg [options] [path|pathsFile] [hostsFile] [outputDir]

Options:
  -c, --concurrency <val>    Set the concurrency level (default: 20)
  -d, --delay <val>          Milliseconds between requests to the same host (default: 5000)
  -H, --header <header>      Send a custom HTTP header
  -r, --rawhttp              Use the rawhttp library for requests (experimental)
  -s, --savestatus <status>  Save only responses with specific status code
  -v, --verbose              Verbose mode
  -X, --method <method>      HTTP method (default: GET)

Defaults:
  pathsFile: ./paths
  hostsFile: ./hosts
  outputDir: ./out

Paths file format:
  /robots.txt
  /package.json
  /security.txt

Hosts file format:
  http://example.com
  https://example.edu
  https://example.net

Examples:
  meg /robots.txt
  meg -s 200 -X HEAD
  meg -c 30 /
  meg paths.txt hosts.txt output

Concurrency

By default meg will attempt to make 20 concurrent requests. You can change that with the -c or --concurrency option:

▶ meg --concurrency 5

It's not very friendly to keep the concurrency level higher than the number of hosts - you may end up sending lots of requests to one host at once.
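
For example, you could cap the concurrency at the number of hosts (a one-liner of my own, assuming the default ./hosts file):

▶ meg --concurrency $(wc -l < hosts)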

Delay

By default meg will wait 5000 milliseconds between requests to the same host. You can override that with the -d or --delay option:

▶ meg --delay 10000

Warning: before reducing the delay, ensure that you have permission to make large volumes of requests to the hosts you're targeting.

Adding Headers

You can set additional headers on the requests with the -H or --header option:

▶ meg --header "Origin: https://evil.com"
▶ grep -h '^>' out/example.com/*
> GET /.well-known/security.txt HTTP/1.1
> Origin: https://evil.com
> Host: example.com
...

Raw HTTP (Experimental)

If you want to send requests that aren't valid - for example with invalid URL encoding - the Go HTTP client will fail:

▶ meg /%%0a0afoo:bar
request failed: parse https://example.org/%%0a0afoo:bar: invalid URL escape "%%0"

You can use the -r or --rawhttp flag to enable use of the rawhttp library, which does little to no validation on the request:

▶ meg --verbose --rawhttp /%%0a0afoo:bar
out/example.com/eac3a4978bfb95992e270c311582e6da4568d83d https://example.com/%%0a0afoo:bar (HTTP/1.1 404 Not Found)

The rawhttp library and its use is experimental. Amongst other things it doesn't yet support chunked transfer encoding, so you may notice chunk lengths interspersed with your output if you use it.

Saving Only Certain Status Codes

If you only want to save results that returned a certain status code, you can use the -s or --savestatus option:

▶ meg --savestatus 200 /robots.txt

Specifying The Method

You can specify which HTTP method to use with the -X or --method option:

▶ meg --method TRACE
▶ grep -nri 'TRACE' out/
out/example.com/61ac5fbb9d3dd054006ae82630b045ba730d8618:3:> TRACE /robots.txt HTTP/1.1
out/example.com/bd8d9f4c470ffa0e6ec8cfa8ba1c51d62289b6dd:3:> TRACE /.well-known/security.txt HTTP/1.1
...

meg's People

Contributors

cmbuckley, edoverflow, jack-dds, leesoh, realytcracker, tomnomnom

meg's Issues

Support for storing response header only

Hi @tomnomnom,

Right now meg stores the response body for every request, but sometimes we only need the response headers for inspection, and there is no option to store only the headers or exclude the body. I'd suggest an optional flag for this, as it could improve speed when we're only interested in the response headers.
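
Until such a flag exists, one workaround is to extract just the response headers from the saved files, since meg prefixes them with '< ' (the same trick the README uses with '^>' for request lines):

▶ grep -h '^<' out/example.com/*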

Feature Request/Question: Accept URLs from Stdin

Hey! I'm not sure if I'm being an idiot, but it seems like there isn't any way to accept URLs from stdin?

I was about to write my own tool when I discovered meg; would it be possible to do something like:
crobat -s dyson.com | httprobe | meg paths - out

Is there support for this with bash-fu or is this something I can PR for?
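
One piece of bash-fu that might already work is process substitution, assuming meg is happy to open the resulting /dev/fd path as its hosts file:

▶ meg paths <(crobat -s dyson.com | httprobe) out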

Thanks!

Keep up the great work :)

Modem Unresponsive

Hello @tomnomnom ,
Whenever I scan a list of 100+ hosts with meg, my wifi hangs and becomes unresponsive; I have to restart it multiple times per scan.

Lines in the hosts file must specify the protocol

At the moment you must specify hosts like this:

http://example.com
https://example.org
http://example.net

It's pretty common for tools to accept raw domains as input (and therefore pretty common for people to have them stored in that format), but at the moment meg does not allow that.

There are a couple of options for handling bare domains:

  1. Blindly add https:// and/or http://
  2. Do a request for both HTTP and HTTPS and only add those that respond

The former is definitely easier, but the latter avoids the potential for lots of timeouts (timeouts tie up a worker goroutine for their entire duration) at the expense of increased complexity and a couple of extra requests to each host.
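
In the meantime, option 1 is easy to approximate outside meg, e.g. with awk (assuming one bare domain per line in a file called ./domains):

▶ awk '{print "http://" $0; print "https://" $0}' domains > hosts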

(feature request) Present failed connections when finished

First of all thanks a lot for so many useful tools!
I was testing an application and I was sure that meg was missing some requests since testing manually with curl I was getting 200.
Reducing the number of threads solved the problem, but it made me wonder how many requests meg has missed in the past without my knowing.
I think that if meg reported the total number of failed connections it would help identify this issue quickly.

Response code wrong when redirecting

I see that in #18, meg gained the ability to follow redirects, which is an excellent addition. Unfortunately, in the index file, these redirects are being saved with 200 response codes.

e.g. This is what I am currently getting

out/example.com/{hash} https://example.com/this-url-redirects (200 OK)
out/example.com/{hash} https://example.com/redirected/ (200 OK)

What I think it should show is this:

out/example.com/{hash} https://example.com/this-url-redirects (301 Moved Permanently)
out/example.com/{hash} https://example.com/redirected/ (200 OK)

meg can't follow redirects

At the moment meg will just save the response for HTTP redirects. It would be nice to have an option to follow them and save the resulting responses too.

curl has -L / --location for following redirects so that seems like a reasonable candidate.

This will be much easier for the Go HTTP client than the rawhttp client as it's the default behaviour of the former.
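
A minimal sketch of toggling that default in net/http, assuming a hypothetical followLocation flag (returning http.ErrUseLastResponse hands back the redirect response itself instead of following it):

package main

import "net/http"

// newClient follows redirects (the net/http default) only when
// followLocation is true.
func newClient(followLocation bool) *http.Client {
    client := &http.Client{}
    if !followLocation {
        client.CheckRedirect = func(req *http.Request, via []*http.Request) error {
            // Return the redirect response itself rather than following it.
            return http.ErrUseLastResponse
        }
    }
    return client
}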

Random user agent

Hi,

Would be nice to have an option to add a random user agent. The one sent currently is easy to identify and block (robots.txt, .htaccess).

Thanks!

Display Human-readable size of the http response

Could you show the length of the response in the output?
Some sites don't care about best practices and respond with a 200 status code for their custom 404 pages ("sorry not found"), making it hard to detect positive findings.
With the size in the output we could easily filter out those findings...
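
Until then, a rough workaround is to grep the saved Content-Length headers (where the server sends one), or wc -c the files themselves, and filter on that:

▶ grep -Hri '< Content-Length:' out/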

Feature Feedback: Would you be interested in being able to use files of full urls?

Hi, I created a fork (https://github.com/3lpsy/megurl) that ingests a file containing full URLs as opposed to a hosts file + a paths file. The fork completely destroys the ability to do the hosts + paths approach, but if I made a PR that maintained backwards compatibility while also adding the ability to simply pass a pregenerated list of full URLs, is that something you'd be interested in? I imagine it'd look like this:

meg paths hosts outputdir
meg -urls-only urls.txt outputdir

If this is not something you're interested in, no worries.

There's no way to make non-compliant requests

One of the things I've looked at in the past is how web applications react to non-compliant requests; such as those with invalid URL encoding (e.g. %%0a0a), or those where the request path doesn't start with a slash (e.g. GET @example.com HTTP/1.1).

Go's http library does not provide a way to do that (the parser in the url package chokes on invalid URLs), which is great for 99% of use cases, but not this one.

I've started working on a package that addresses this issue (https://github.com/tomnomnom/rawhttp), so it would be nice to have an option to use it.

rawhttp still has a few issues (e.g. it doesn't yet support chunked encoding, which cloudflare seem awfully keen on), but it works well enough to be included - perhaps with an 'experimental' warning against it.

The main issue with sending non-compliant requests will likely be that the request type is currently:

type request struct {
    method  string
    url     *url.URL
    headers []string
}

But it's not possible to make a *url.URL type for a malformed URL.

The best way around this is probably to split the url property into prefix and suffix components:

type request struct {
    method  string
    prefix  string
    suffix  string
    headers []string
}

And then attach a few methods to request that do things like parse out the hostname that are currently provided by *url.URL. The prefix should pretty much always be parseable by the url package, so it might be a good idea to use that under the hood for getting the hostname etc.
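
As a sketch of that idea (my illustration, not the actual implementation), a Hostname method on the request type might parse just the prefix and delegate to the url package:

package main

import "net/url"

// request mirrors the proposed prefix/suffix split.
type request struct {
    method  string
    prefix  string
    suffix  string
    headers []string
}

// Hostname parses only the prefix, which should pretty much always be
// a valid URL even when the suffix is not.
func (r request) Hostname() string {
    u, err := url.Parse(r.prefix)
    if err != nil {
        return ""
    }
    return u.Hostname()
}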

No way to detect Wildcard

Hi Tom,

I think there should be some way to detect wildcard URLs and, based on that, stop sending requests to those endpoints.

This doesn't impact results, but it would save time and resources on endpoints that return 200 OK for everything.

Personally I'm not very familiar with Go :|

But this is what I tried for detection:

  1. Create a counter variable and an empty map
  2. Check whether the random string is in the URL and the server responds with 200 OK
  3. Increase the counter by 1 and add the hostname to the map, with the counter as its value
  4. If the count is greater than 3, check whether the map has the hostname with a value greater than 3 too
  5. Find the index of the hostname in the hosts slice and remove it

Code compared to original repo

It does work for detecting and removing the domain from the hosts slice.
But since the request workers run in the background and read from hosts only once at startup, the removal has no effect.
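
For what it's worth, here is a rough sketch of the same idea run as a pre-flight check, before the workers read the hosts list (my own illustration, not the code from the fork; the three-request threshold is an assumption):

package main

import (
    "fmt"
    "math/rand"
    "net/http"
)

// wildcardHosts probes each host with a few random paths that
// shouldn't exist, and flags hosts that answer 200 OK to all of them.
func wildcardHosts(hosts []string) map[string]bool {
    flagged := make(map[string]bool)
    for _, host := range hosts {
        hits := 0
        for i := 0; i < 3; i++ {
            path := fmt.Sprintf("/%x", rand.Int63()) // a path that shouldn't exist
            resp, err := http.Get(host + path)
            if err != nil {
                break
            }
            resp.Body.Close()
            if resp.StatusCode == http.StatusOK {
                hits++
            }
        }
        if hits == 3 {
            flagged[host] = true // host likely returns 200 OK for everything
        }
    }
    return flagged
}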

Even if you don't wish to implement this feature in meg, it would still be a great help if you could give your input on this and on my logic for trying to solve the issue.

Regards,
Bugbaba

path issue

I got the meg binary and moved it to /usr/bin, but now I'm getting './paths not found'.

What's the issue?

I have a question

In the hosts file, do I really need to put http or https? Can't I just put example.com? Subdomain tools such as subfinder, amass, and sublist3r don't prepend http:// or https:// to example.com. Sorry for my English; I hope you can understand me.

Rate limiting is very basic

It'd be a really good idea to rate limit per domain (or maybe per IP) to prevent hammering hosts when there aren't many prefixes.

There's not enough entropy in the output filenames

At the moment only the request URL is used as an input to the filename hash.

It should really be a hash of the entire output to avoid overwriting files from matching URLs where the output might have changed.
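
A sketch of that change (SHA-1 is assumed because the existing names in ./out are 40 hex characters, which matches; the function name is mine):

package main

import (
    "crypto/sha1"
    "fmt"
)

// outputFilename hashes the entire saved output (request plus
// response) rather than just the URL, so a changed response for the
// same URL lands in a different file instead of overwriting.
func outputFilename(output []byte) string {
    return fmt.Sprintf("%x", sha1.Sum(output))
}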

There's no user agent

Meg should have its own user agent so that anyone seeing requests in their logs can see what's making them.

Ideally it should be a Mozilla-alike user agent for sites that do UA detection etc.
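
In the meantime a user agent can be set per run with the existing --header option (the UA string here is just an example, and this assumes meg copies the header onto the request verbatim):

▶ meg --header "User-Agent: Mozilla/5.0 (compatible; meg; +https://github.com/tomnomnom/meg)"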

Argument processing takes up nearly half of main()

As the number of arguments and options increases, more and more of main() is taken up with bookkeeping instead of what the program is actually doing.

It'd be good to move the argument processing to a function that returns some kind of config struct.
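
A minimal sketch of that refactor using the standard flag package (field and flag names here are illustrative, not meg's actual parser):

package main

import "flag"

// config gathers the options that currently live as loose variables
// in main().
type config struct {
    concurrency int
    delay       int
    method      string
    verbose     bool
}

// processArgs parses the command line and returns a config struct,
// keeping the bookkeeping out of main().
func processArgs() config {
    c := config{}
    flag.IntVar(&c.concurrency, "c", 20, "concurrency level")
    flag.IntVar(&c.delay, "d", 5000, "delay between requests to the same host (ms)")
    flag.StringVar(&c.method, "X", "GET", "HTTP method")
    flag.BoolVar(&c.verbose, "v", false, "verbose mode")
    flag.Parse()
    return c
}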

The prefix/suffix terminology is non-standard

The use of 'prefix' and 'suffix', while technically accurate, is a bit confusing to people trying to understand what the tool really does.

It should be changed to host(s) and path(s) instead.

path error while running first time

Hi, I'm facing this error when running meg for the first time (sorry, I'm new to this):
root@abc:~# meg
failed to open paths file: file ./paths not found

I am using:
root@abc:~# go version
go version go1.12.7 linux/amd64

go env Output
root@abc:~# go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/root/work"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build257246342=/tmp/go-build -gno-record-gcc-switches"

Meg Slash issue

By default meg sends a trailing / after each word in the path wordlist. Is there any way to stop meg from appending the trailing / to the words in path.txt?

Multiple Headers

Hi, I wanted to test Host header injection with multiple headers, but I found that only one header can be sent.
I suggest adding the ability to send multiple headers.

Support for single host

Hi @tomnomnom,

Thank you for actively developing this project, I would like to request minor improvement for this project.

Currently meg supports the suffix and prefix format, where we feed both lists and it processes them accordingly. For quick tests it would be nice and helpful if we could also feed a single host alongside the existing options.

Example:

meg suffix http://example.com

Thank you.

Deterministic output file names

Would you be interested in a command line parameter that makes meg output deterministic file names? By that I mean that running meg on http://domain.tld/file.ext any number of times would always output the response in the same file.

My use case is that I run my recon in a git repo and I want to be able to diff the responses.

Basically instead of naming the file with a hash of the content, I'd use a hash of the path.
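
A sketch of that scheme (SHA-1 to match the existing 40-character names; the function name is mine):

package main

import (
    "crypto/sha1"
    "fmt"
)

// pathFilename hashes the request URL, so repeated runs against the
// same URL always write to the same file and diff cleanly in git.
func pathFilename(rawURL string) string {
    return fmt.Sprintf("%x", sha1.Sum([]byte(rawURL)))
}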

I'm going to implement this in a fork anyway; just asking to know if you'd be interested in a PR afterwards. :)

Timeout isn't controllable

The HTTP timeout is currently hard coded to 10 seconds.

The user should be able to decide what that value is.
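
A sketch of the change, assuming the value arrives via a hypothetical --timeout flag:

package main

import (
    "net/http"
    "time"
)

// newClientWithTimeout replaces the hard-coded 10 second timeout with
// a user-supplied value.
func newClientWithTimeout(seconds int) *http.Client {
    return &http.Client{Timeout: time.Duration(seconds) * time.Second}
}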

Add support to report progress for a running request

Add some kind of basic progress status, updating a percentage value based on hosts × paths remaining...

[14:40:12] Starting:
[14:40:14] 302 - http://xxxxxxx.com/a/ output/xxxxxxx.com/02179d82731b29aead42ca2035fbb29c69a3eacd
[14:40:15] 404 - http://xxxxxxx.com/b/ output/xxxxxxx.com/02179d82731b29aead42ca2035fbb29c69a3eace
[14:40:18] 200 - http://xxxxxxx.com/c/ output/xxxxxxx.com/02179d82731b29aead42ca2035fbb29c69a3eacf
10.00% - Last request to: http://xxxxxxx.com/c

Support multiple values for an option & negative match if possible

I am not quite sure what the appropriate pattern for this might be: either a comma-separated list:
-s 200,403
Or the option could be specified multiple times:
-s 200 -s 403

And, if possible, add a negative-match parameter:
-e 404,500  (save all except 404/500)

I don't know if this makes sense, but if the user explicitly uses the -s or -e parameters, those filters could be applied to the output (terminal/files).
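
A sketch of parsing the comma-separated form; the same set could serve as a save-list for -s or, inverted, an exclude-list for the proposed -e:

package main

import (
    "strconv"
    "strings"
)

// parseStatuses turns a string like "200,403" into a set of status codes.
func parseStatuses(s string) (map[int]bool, error) {
    statuses := make(map[int]bool)
    for _, part := range strings.Split(s, ",") {
        code, err := strconv.Atoi(strings.TrimSpace(part))
        if err != nil {
            return nil, err
        }
        statuses[code] = true
    }
    return statuses, nil
}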

It's not possible to send POST data

It's not currently possible to send data with POST, PUT, etc. requests.

It would be nice to use -d like curl, but it's already taken by the request delay.

Proxy Settings on Cygwin

Hi,

Thanks for an amazing tool!
I'm in a Cygwin environment behind a firewall and proxy.
The variables below don't seem to work for using meg with a proxy.

export HTTP_PROXY="http://myproxy.com:8080";
export HTTPS_PROXY="http://myproxy.com:8080";

I was assuming that meg would load the proxy settings from the OS variables above, but it doesn't.
Do you know how I can use meg in the situation described?
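
For reference, Go's default transport does honour HTTP_PROXY / HTTPS_PROXY via http.ProxyFromEnvironment, but a client built with a custom transport has to opt in explicitly; whether meg's client does is the open question here. A sketch of the opt-in:

package main

import "net/http"

// proxiedClient reads HTTP_PROXY / HTTPS_PROXY (and NO_PROXY) from
// the environment for every request.
func proxiedClient() *http.Client {
    return &http.Client{
        Transport: &http.Transport{Proxy: http.ProxyFromEnvironment},
    }
}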

Error while installing

Hello,

I have a problem like this

root@xxx:~/sf/megplus# go get github.com/tomnomnom/meg
# github.com/tomnomnom/rawhttp
/root/go/src/github.com/tomnomnom/rawhttp/request.go:102: u.Hostname undefined (type *url.URL has no field or method Hostname)
/root/go/src/github.com/tomnomnom/rawhttp/request.go:103: u.Port undefined (type *url.URL has no field or method Port)
/root/go/src/github.com/tomnomnom/rawhttp/request.go:259: undefined: x509.SystemCertPool

How do I solve this? Thanks.

Rawhttp is flakey

rawhttp still doesn't support chunked responses etc., so it'd be nice to have switchable HTTP engines.

Using the Go HTTP engine by default would be best, only switching to rawhttp if the request is 'weird'.
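
A sketch of one way to pick the engine, relying on the fact that the url package rejects the 'weird' requests (see the %%0a0a parse error above):

package main

import "net/url"

// useRawHTTP reports whether a request needs the rawhttp engine:
// if the standard library can't parse the URL, net/http can't send it.
func useRawHTTP(rawURL string) bool {
    _, err := url.Parse(rawURL)
    return err != nil
}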

Getting request failed: unsupported protocol scheme error.

Hi,

I am getting a 'request failed: unsupported protocol scheme' error for all hosts, even though they are alive and resolve when accessed through a browser.

Error :

request failed: Get sadsa.test.com*: unsupported protocol scheme ""

Can you please suggest.

Thanks

There's no ability to use multiple suffixes

It'd be really useful to be able to provide multiple suffixes and have them all fetched.

To avoid hammering any one site too much it should do something like:

for _, suffix := range suffixes {
    for _, prefix := range prefixes {
        fetch(prefix + suffix)
    }
    time.Sleep(...)
}

There should be a configurable delay between checking each suffix.
