saucelabs / forwarder

Forwarder is a production-ready, fast MITM proxy with PAC support. It's suitable for debugging, intercepting and manipulating HTTP traffic. It's used as a core component of Sauce Labs Sauce Connect Proxy.

Home Page: https://forwarder-proxy.io

License: Mozilla Public License 2.0

Makefile 0.94% Go 96.83% JavaScript 0.88% Dockerfile 0.10% Shell 0.44% Python 0.80%
http-proxy https-proxy linux macos mitm mitmproxy pac proxy rewriting socks5 upstream windows

forwarder's People

Contributors

alexh-sauce, alexplischke, amckenzie132, arunkbharathan, choraden, dependabot[bot], mmatczuk, thalesfsp, waggledans

forwarder's Issues

ExampleNew only works when not logging anything

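The example test's documentation suggests raising the logging level to get more information. Unfortunately, doing so breaks the test: the log lines end up in the captured output, which then no longer matches the expected output.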

--- FAIL: ExampleNew (4.02s)
got:
component=proxy output=console level=debug timestamp=2022-09-08T10:08:50+02:00 message=Target/end server started @ http://user1:[email protected]:59872
component=proxy output=console level=debug timestamp=2022-09-08T10:08:50+02:00 message=PAC template parsed: 
function FindProxyForURL(url, host) {
  if (
    dnsDomainIs(host, "intranet.domain.com") ||
    shExpMatch(host, "(*.abcdomain.com|abcdomain.com)")
  )
    return "DIRECT";

  return "PROXY 127.0.0.1:49837; DIRECT";
}

component=proxy output=console level=debug timestamp=2022-09-08T10:08:51+02:00 message=PAC server started @ http://user:[email protected]:59873
component=proxy output=console level=debug timestamp=2022-09-08T10:08:51+02:00 message=Basic auth setup for proxy @ http://u123:[email protected]:49130
component=proxy output=console level=debug timestamp=2022-09-08T10:08:51+02:00 message=Listening on 127.0.0.1:49130
component=proxy output=console level=debug timestamp=2022-09-08T10:08:52+02:00 message=Basic auth setup for proxy @ http://u456:[email protected]:49837
component=proxy output=console level=debug timestamp=2022-09-08T10:08:52+02:00 message=Listening on 127.0.0.1:49837
component=proxy output=console level=debug timestamp=2022-09-08T10:08:53+02:00 message=Client is using http://u123:[email protected]:49130 as proxy
component=goproxy output=console level=debug timestamp=2022-09-08T10:08:53+02:00 message=[001] INFO: Got request / 127.0.0.1:59872 GET http://127.0.0.1:59872/
component=proxy output=console level=debug timestamp=2022-09-08T10:08:53+02:00 message=GET 127.0.0.1:59875 -> http 127.0.0.1:59872 59872
component=goproxy output=console level=debug timestamp=2022-09-08T10:08:53+02:00 message=[001] INFO: Received response 200 OK
component=proxy output=console level=debug timestamp=2022-09-08T10:08:53+02:00 message=127.0.0.1:59875 <- 127.0.0.1:59872 200 OK (4 bytes)
component=goproxy output=console level=debug timestamp=2022-09-08T10:08:53+02:00 message=[001] INFO: Copying response to client 200 OK [200]
component=goproxy output=console level=debug timestamp=2022-09-08T10:08:53+02:00 message=[001] INFO: Copied 4 bytes to client error=<nil>
200
body
want:
200
body

FAIL

proxy: localhost proxying exposes proxy local services

When ProxyLocalhost is disabled, a user can still call services listening on localhost on the proxy machine.
This is exactly what the logs in #31 show: the local proxy does not hit the upstream; instead, it uses its own localhost to deliver the result.

Related to #31

Detach logger interface from implementation

Each relevant component (struct?) should allow specifying the logger it uses.
Providing the logger implementation is the responsibility of the main package.
At the moment we call logger.Get in arbitrary places, which creates troublesome coupling in the code.
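
A minimal sketch of what such a decoupling could look like. The Logger method set, the bufferLogger type and the NewProxy constructor are all hypothetical names chosen for illustration; they are not Forwarder's actual API.

```go
package main

import (
	"fmt"
	"strings"
)

// Logger is a hypothetical minimal interface; the method set is an assumption.
type Logger interface {
	Debugf(format string, args ...interface{})
	Errorf(format string, args ...interface{})
}

// bufferLogger is a toy implementation that records lines in memory.
type bufferLogger struct {
	lines []string
}

func (l *bufferLogger) Debugf(format string, args ...interface{}) {
	l.lines = append(l.lines, "DEBUG "+fmt.Sprintf(format, args...))
}

func (l *bufferLogger) Errorf(format string, args ...interface{}) {
	l.lines = append(l.lines, "ERROR "+fmt.Sprintf(format, args...))
}

// Proxy stands in for any component that needs to log.
type Proxy struct {
	log Logger
}

// NewProxy receives its logger from the caller instead of fetching a global one.
func NewProxy(log Logger) *Proxy {
	return &Proxy{log: log}
}

// Serve only logs here; a real component would start listening.
func (p *Proxy) Serve(addr string) {
	p.log.Debugf("listening on %s", addr)
}

func main() {
	// The main package decides which implementation to use.
	log := &bufferLogger{}
	NewProxy(log).Serve("127.0.0.1:8080")
	fmt.Println(strings.Join(log.lines, "\n"))
}
```

With this shape, tests can pass a recording or discarding logger, and only the main package needs to know about the concrete logging library.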

Proxy crashes with "invalid memory address or nil pointer dereference"

Version 0.1.19 crashes with the following panic:

2022/04/20 13:45:37 http: panic serving 127.0.0.1:64338: runtime error: invalid memory address or nil pointer dereference
goroutine 625 [running]:
net/http.(*conn).serve.func1()
        /usr/local/Cellar/go/1.18/libexec/src/net/http/server.go:1825 +0xbf
panic({0x4ace600, 0x5390eb0})
        /usr/local/Cellar/go/1.18/libexec/src/runtime/panic.go:844 +0x258
github.com/saucelabs/forwarder/pkg/proxy.New.func3(0x0, 0xa?)
        /Users/danslov/go/pkg/mod/github.com/saucelabs/[email protected]/pkg/proxy/proxy.go:708 +0x97
github.com/elazarl/goproxy.FuncRespHandler.Handle(0x4b5aca0?, 0xc000372690?, 0xc0008eedc0?)

Looks like it's related to elazarl/goproxy#318

Allow to configure http transport

The proxy's HTTP client transport is not configurable. In practice we want to be able to manage connection pool sizes, timeouts, etc.
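
As a rough illustration, the knobs in question map onto fields of the standard library's http.Transport. The TransportConfig struct and NewTransport function below are hypothetical names, and the values in main are arbitrary examples, not recommended defaults.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// TransportConfig is a hypothetical set of tuning knobs. The field names
// mirror http.Transport, but the struct itself is not part of Forwarder.
type TransportConfig struct {
	MaxIdleConns          int
	MaxIdleConnsPerHost   int
	IdleConnTimeout       time.Duration
	TLSHandshakeTimeout   time.Duration
	ResponseHeaderTimeout time.Duration
}

// NewTransport builds an http.Transport from the given configuration.
func NewTransport(c TransportConfig) *http.Transport {
	return &http.Transport{
		MaxIdleConns:          c.MaxIdleConns,
		MaxIdleConnsPerHost:   c.MaxIdleConnsPerHost,
		IdleConnTimeout:       c.IdleConnTimeout,
		TLSHandshakeTimeout:   c.TLSHandshakeTimeout,
		ResponseHeaderTimeout: c.ResponseHeaderTimeout,
	}
}

func main() {
	tr := NewTransport(TransportConfig{
		MaxIdleConns:          100,
		MaxIdleConnsPerHost:   10,
		IdleConnTimeout:       90 * time.Second,
		TLSHandshakeTimeout:   10 * time.Second,
		ResponseHeaderTimeout: 30 * time.Second,
	})
	fmt.Println(tr.MaxIdleConns, tr.MaxIdleConnsPerHost, tr.IdleConnTimeout)
}
```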

Forwarder shouldn't proxy when PAC returns DIRECT

  • Start Forwarder listening on 8081
  • Start another instance of Forwarder with a PAC file listening on 8080.
function FindProxyForURL (url, host) {
    if (shExpMatch(host, '*.example.com')) {
        return 'DIRECT';
    }
    return 'PROXY localhost:8081';
}

This PAC file expects that ALL requests to www.example.com will go direct.

Execute:

$ curl -x localhost:8080 http://www.example.com
$ curl -x localhost:8080 http://www.httpbin.org
$ curl -x localhost:8080 http://www.example.com

The first request to http://www.example.com does not go via the upstream proxy (:8081), as configured in the PAC file.
The next request, to http://www.httpbin.org, is proxied via the upstream proxy as expected.
The third request, to http://www.example.com again, is proxied via the upstream proxy even though Forwarder logs that the request is direct!

setupHandlers is odd and buggy

The setupProxyHandlers function that configures goproxy takes a strange approach: everything is configured on a per-request basis.

This looks like it creates a race condition.
The code below, called for every request, clearly changes the server state.

// setupUpstreamProxyConnection forwards connections to an upstream proxy.
func setupUpstreamProxyConnection(ctx *goproxy.ProxyCtx, uri *url.URL) {
	ctx.Proxy.Tr.Proxy = http.ProxyURL(uri)

Also, HandleConnectFunc returns a nil ConnectAction where it should accept the CONNECT; this code path does not seem to be tested.

Normally we should be using a ReqCondition, like:

// Typical usage:
//	proxy.OnRequest(UrlIs("example.com/foo"),UrlMatches(regexp.MustParse(`.*\.exampl.\com\./.*`)).Do(...)

Those can also be custom interface implementations that allow for more dynamic behavior.
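
One way to avoid mutating the shared transport on every request is to install a single function in http.Transport.Proxy that makes the decision per request. The sketch below uses only the standard library; the directFor helper and its suffix match are hypothetical stand-ins for evaluating a PAC file, and this is not a claim about how goproxy or Forwarder should be wired.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// proxyFunc has the signature expected by http.Transport.Proxy.
type proxyFunc func(*http.Request) (*url.URL, error)

// directFor returns a proxyFunc that sends hosts ending in suffix directly
// and everything else to upstream. The suffix match is a hypothetical
// stand-in for evaluating a PAC file.
func directFor(suffix string, upstream *url.URL) proxyFunc {
	return func(r *http.Request) (*url.URL, error) {
		if strings.HasSuffix(r.URL.Hostname(), suffix) {
			return nil, nil // a nil URL means "no proxy", i.e. DIRECT
		}
		return upstream, nil
	}
}

// mustParse parses a URL that is known to be valid.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}

// route reports where a request for rawURL would be sent.
func route(pf proxyFunc, rawURL string) string {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return "error: " + err.Error()
	}
	u, err := pf(req)
	if err != nil {
		return "error: " + err.Error()
	}
	if u == nil {
		return "DIRECT"
	}
	return "PROXY " + u.Host
}

func main() {
	pf := directFor(".example.com", mustParse("http://localhost:8081"))

	// The decision function is installed once; nothing is mutated per request.
	tr := &http.Transport{Proxy: pf}
	_ = tr

	for _, u := range []string{
		"http://www.example.com/",
		"http://www.httpbin.org/",
		"http://www.example.com/",
	} {
		fmt.Println(u, "->", route(pf, u))
	}
}
```

Because the function is pure with respect to the transport, concurrent requests cannot observe each other's routing decision.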

logging: add support for request ID

Requirements:

  • All log lines related to processing of a request should have request ID
  • Request ID should be read from X-Request-ID header
  • If the header is missing, a request ID should be generated
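
The requirements above could be sketched as a standard net/http middleware. The names withRequestID, requestID and seenID are hypothetical, and the 16-character hex ID format is an assumption; the issue does not specify one.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requestIDKey is the context key under which the request ID is stored.
type requestIDKey struct{}

// newRequestID returns 8 random bytes as hex; the format is an assumption.
func newRequestID() string {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// withRequestID reads X-Request-ID, generates an ID if the header is
// missing, and makes the ID available to downstream code via the context.
func withRequestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get("X-Request-ID")
		if id == "" {
			id = newRequestID()
		}
		ctx := context.WithValue(r.Context(), requestIDKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// requestID returns the ID stored by withRequestID, or "" if there is none.
func requestID(ctx context.Context) string {
	id, _ := ctx.Value(requestIDKey{}).(string)
	return id
}

// seenID sends one request through the middleware and returns the ID that
// the inner handler observed.
func seenID(incoming string) string {
	var got string
	h := withRequestID(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got = requestID(r.Context())
	}))
	req := httptest.NewRequest(http.MethodGet, "http://example.com/", nil)
	if incoming != "" {
		req.Header.Set("X-Request-ID", incoming)
	}
	h.ServeHTTP(httptest.NewRecorder(), req)
	return got
}

func main() {
	fmt.Println("from header:", seenID("abc-123"))
	fmt.Println("generated:  ", seenID(""))
}
```

Every log call that has access to the request context can then include requestID(ctx) as a field.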

Add forwarded headers

At the moment, no headers are added to indicate the request origin. Adding them should be controlled by a flag.

Make so called integration test regular unit test

The make test and make integration-test targets run the same set of tests.
The only difference is that in the "integration test" TestNew uses https://httpbin.org/status/200 instead of the local server:

			if os.Getenv("FORWARDER_TEST_MODE") == "integration" {
				targetServerURL = "https://httpbin.org/status/200"

				if tc.args.dnsURIs != nil {
					dnsURIs = tc.args.dnsURIs
				}
			}

This could be changed to a test where we create a synthetic DNS server or modify /etc/hosts.

validation: DNS addresses MUST NOT be host name

Consider this excerpt from the Go standard library resolver:

// dial makes a new connection to the provided server (which must be
// an IP address) with the provided network type, using either r.Dial
// (if both r and r.Dial are non-nil) or else Dialer.DialContext.
func (r *Resolver) dial(ctx context.Context, network, server string) (Conn, error) {
	// Calling Dial here is scary -- we have to be sure not to
	// dial a name that will require a DNS lookup, or Dial will
	// call back here to translate it. The DNS config parser has
	// already checked that all the cfg.servers are IP
	// addresses, which Dial will use without a DNS lookup.
	var c Conn
	var err error
	if r != nil && r.Dial != nil {
		c, err = r.Dial(ctx, network, server)
	} else {
		var d Dialer
		c, err = d.DialContext(ctx, network, server)
	}

Since we replace the dial function in Resolver, we must make sure that we do not end up doing a name resolution.
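
A possible validation, sketched with the standard net package: accept a DNS server address only if its host part parses as an IP literal. The function name validateDNSAddr and the "host:port" input format are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net"
)

// validateDNSAddr accepts "ip:port" (or "[ipv6]:port") and rejects anything
// whose host part is not an IP literal, so dialing it never needs a lookup.
func validateDNSAddr(addr string) error {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return fmt.Errorf("DNS server %q: %w", addr, err)
	}
	if net.ParseIP(host) == nil {
		return fmt.Errorf("DNS server %q: host %q is not an IP address", addr, host)
	}
	return nil
}

func main() {
	for _, addr := range []string{
		"8.8.8.8:53",
		"[2001:4860:4860::8888]:53",
		"dns.google:53",
	} {
		fmt.Printf("%-28s %v\n", addr, validateDNSAddr(addr))
	}
}
```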

Configuration URI parsing and validation

Currently forwarder accepts string-based configuration. Relevant configuration values are passed as URL-encoded strings. The strings are validated using the proxyURI validation, which produces poor error messages, as noted in #55.

The SC code actually works with url.URL, so it would be beneficial to:

  • Replace the URI strings in configuration with *url.URL
  • Add custom validation for url.URL that would report actionable error messages
  • Add a cobra value type that converts a string to url.URL at the field level and possibly calls the custom validations, so we'd have a customizable type that allows running custom validations from the package
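
A sketch of such a value type, written against the standard library only. It provides String, Set and Type, which is the method set that pflag's Value interface (used by cobra) expects. The URLValue name, the accepted schemes and the error wording are assumptions, not Forwarder's actual rules.

```go
package main

import (
	"fmt"
	"net/url"
)

// URLValue is a hypothetical flag value that parses and validates a URL.
type URLValue struct {
	URL *url.URL
}

// String returns the current value, or "" if none has been set.
func (v *URLValue) String() string {
	if v == nil || v.URL == nil {
		return ""
	}
	return v.URL.String()
}

// Set parses s and rejects it with an actionable message if it is unusable.
// The list of accepted schemes is an assumption made for this example.
func (v *URLValue) Set(s string) error {
	u, err := url.Parse(s)
	if err != nil {
		return fmt.Errorf("invalid URL %q: %w", s, err)
	}
	switch u.Scheme {
	case "http", "https", "socks5":
	default:
		return fmt.Errorf("invalid URL %q: unsupported scheme %q, want http, https or socks5", s, u.Scheme)
	}
	if u.Hostname() == "" {
		return fmt.Errorf("invalid URL %q: missing host", s)
	}
	v.URL = u
	return nil
}

// Type names the value in generated help text.
func (v *URLValue) Type() string {
	return "url"
}

func main() {
	for _, s := range []string{
		"http://user:pass@127.0.0.1:8080",
		"localhost:8080",
		"http://",
	} {
		var v URLValue
		if err := v.Set(s); err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", v.String())
	}
}
```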

Add custom headers

Add the ability to set custom headers. The CLI should use the --header / -H format; for reference, see the curl manual.
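
Parsing the basic "Name: value" form could look roughly like this. The parseHeader name is hypothetical, and the sketch deliberately ignores curl's other forms (such as removing a header or sending an empty one), which would need separate handling.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// parseHeader splits a "Name: value" argument into a canonical header name
// and a trimmed value. Only this basic form is handled.
func parseHeader(s string) (string, string, error) {
	name, value, ok := strings.Cut(s, ":")
	name = strings.TrimSpace(name)
	if !ok || name == "" {
		return "", "", fmt.Errorf("invalid header %q, expected \"Name: value\"", s)
	}
	return http.CanonicalHeaderKey(name), strings.TrimSpace(value), nil
}

func main() {
	h := http.Header{}
	for _, arg := range []string{"x-trace: abc", "Accept: text/html", "broken"} {
		name, value, err := parseHeader(arg)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		h.Add(name, value)
	}
	fmt.Println(h)
}
```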

Add metrics

Add observability by providing information on processed requests, error rate and so on.
Details need to be specified.

AutomaticallyRetryPort config option is harmful

AutomaticallyRetryPort is a feature that changes the desired port if the one specified by the user is occupied.
This is a dangerous feature that prevents the system from failing fast.
If a user wants always-up behavior and is willing to read the port from the log, they can set the port to 0.
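
For reference, this is how port 0 behaves with the standard net package: the operating system picks a free port, and the chosen port can be read back from the listener. The listenAnyPort helper is a hypothetical name.

```go
package main

import (
	"fmt"
	"net"
)

// listenAnyPort listens on host with port 0, letting the OS choose a free
// port, and returns the listener together with the port that was chosen.
func listenAnyPort(host string) (net.Listener, int, error) {
	ln, err := net.Listen("tcp", net.JoinHostPort(host, "0"))
	if err != nil {
		return nil, 0, err
	}
	return ln, ln.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	ln, port, err := listenAnyPort("127.0.0.1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer ln.Close()
	// The port differs from run to run, so it has to be logged.
	fmt.Println("listening on port", port)
}
```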

ExampleNew does not proxy anything

The promise

// client -> protected local proxy -> protected pac server - connection setup -> protected upstream proxy -> protected target.

The reality

component=proxy output=console level=trace timestamp=2022-09-08T14:17:09+02:00 message=Not proxifying request to localhost URL: http://127.0.0.1:62270/

It's easy to fix by adding ProxyLocalhost: true.

Server configure HTTP timeouts

A malicious user may drain the server's TCP connection pool by opening connections and never writing anything if no timeout is specified.

proxy.go:620:12: G114: Use of net/http serve function that has no support for setting timeouts (gosec)
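
The usual remedy for this class of warning is to construct an http.Server explicitly instead of calling the package-level serve helpers. A minimal sketch follows; newServer is a hypothetical name and the durations are arbitrary examples. ReadTimeout and WriteTimeout are left unset here on the assumption that a proxy may carry long-lived connections, which is a design choice that would need to be confirmed.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newServer returns an http.Server with timeouts that bound how long a
// client may take to send request headers or hold an idle connection.
// The values are examples only.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: 10 * time.Second,
		IdleTimeout:       2 * time.Minute,
	}
}

func main() {
	srv := newServer("127.0.0.1:8080", http.NotFoundHandler())
	fmt.Println(srv.Addr, srv.ReadHeaderTimeout, srv.IdleTimeout)
	// A real command would now call srv.ListenAndServe().
}
```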

Handle signals and allow for graceful stop

The command shall intercept SIGINT and SIGTERM.
Forwarder should expose means to gracefully stop the proxy.

  • Stop accepting new connections
  • Wait for ongoing connections to close
  • Specify grace period in the configuration file
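
The steps above can be sketched with signal.NotifyContext and http.Server.Shutdown from the standard library. The serve and runDemo functions are hypothetical names; runDemo additionally stops after a short time so that the example terminates on its own, which a real command would not do.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// serve runs srv on ln until ctx is cancelled, then stops accepting new
// connections and waits up to grace for ongoing requests to finish.
func serve(ctx context.Context, srv *http.Server, ln net.Listener, grace time.Duration) error {
	errc := make(chan error, 1)
	go func() { errc <- srv.Serve(ln) }()

	select {
	case err := <-errc:
		return err // the server failed before any stop request
	case <-ctx.Done():
	}

	shutdownCtx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		return err // grace period exceeded
	}
	if err := <-errc; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// runDemo serves on a loopback port until SIGINT/SIGTERM arrives or runFor
// elapses. The runFor limit exists only to make the demo terminate.
func runDemo(runFor, grace time.Duration) error {
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()
	ctx, cancel := context.WithTimeout(ctx, runFor)
	defer cancel()

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	srv := &http.Server{ReadHeaderTimeout: 5 * time.Second}
	return serve(ctx, srv, ln, grace)
}

func main() {
	if err := runDemo(200*time.Millisecond, 5*time.Second); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("stopped cleanly")
}
```

The grace argument is where a value read from the configuration file would be passed in.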

Proxy double run protection mechanism has a race condition

The Proxy.Run function has a mutex to protect the run state. Clearly, if I call go p.Run twice and both runs get past the check, we'd end up with two active servers, if they can be started.

	// Do nothing if already running.
	p.mutex.RLock()
	if p.State == Running {
		logger.Get().Traceln("Proxy is already running")

		return
	}
	p.mutex.RUnlock()

This functionality could be better achieved with sync.Once, or documented and delegated to the caller.
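
As an alternative to sync.Once, the check and the state change can be done under one exclusive lock, so that only one caller can pass. This is a sketch with hypothetical names (tryStart, countStarts), not the project's Proxy type. As a side note, the excerpt above also appears to return without releasing the read lock.

```go
package main

import (
	"fmt"
	"sync"
)

// Proxy is a stand-in type holding only the run state.
type Proxy struct {
	mu      sync.Mutex
	running bool
}

// tryStart checks and sets the running flag in one critical section.
// It returns true for exactly one caller until the flag is reset.
func (p *Proxy) tryStart() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.running {
		return false
	}
	p.running = true
	return true
}

// countStarts calls tryStart from n goroutines and reports how many of
// them were allowed to start.
func countStarts(n int) int {
	var (
		p       Proxy
		wg      sync.WaitGroup
		mu      sync.Mutex
		started int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if p.tryStart() {
				mu.Lock()
				started++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return started
}

func main() {
	fmt.Println("goroutines allowed to start:", countStarts(100))
}
```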

Logging in hot path

Debug logging of requests in the hot path should be guarded with a log level check or a dedicated flag.
Due to the inefficient logger implementation, the log message has to be created only to be discarded later.
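
A small sketch of such a guard. The Logger type, its Enabled method and the dumpRequest helper are hypothetical; the point is only that the expensive formatting runs when the level is enabled and is skipped otherwise.

```go
package main

import (
	"fmt"
	"strings"
)

// Level is a hypothetical log level; higher values are more verbose.
type Level int

const (
	LevelError Level = iota
	LevelInfo
	LevelDebug
)

// Logger is a toy logger that writes into an in-memory buffer.
type Logger struct {
	level Level
	out   strings.Builder
}

// Enabled reports whether messages at lv would be written.
func (l *Logger) Enabled(lv Level) bool {
	return lv <= l.level
}

// Debugf writes a debug line if debug logging is enabled.
func (l *Logger) Debugf(format string, args ...interface{}) {
	if l.Enabled(LevelDebug) {
		fmt.Fprintf(&l.out, format+"\n", args...)
	}
}

// dumpRequest stands in for an expensive formatting step; calls counts how
// often it actually runs.
func dumpRequest(calls *int) string {
	*calls++
	return "GET http://example.com/"
}

// logRequest guards the expensive call so it is skipped when debug logging
// is off, instead of building a message that is then discarded.
func logRequest(l *Logger, calls *int) {
	if l.Enabled(LevelDebug) {
		l.Debugf("request: %s", dumpRequest(calls))
	}
}

func main() {
	calls := 0
	logRequest(&Logger{level: LevelInfo}, &calls)
	fmt.Println("dumps at info level:", calls)
	logRequest(&Logger{level: LevelDebug}, &calls)
	fmt.Println("dumps after debug level:", calls)
}
```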
