nytimes / gziphandler
Go middleware to gzip HTTP responses
Home Page: https://godoc.org/github.com/NYTimes/gziphandler
License: Apache License 2.0
I see now that GzipResponseWriter can decide whether it wants to gzip the response based on response size. I suggest also filtering on content types. The types could be user-configured or hardcoded to sane defaults:
if _, ok := w.Header()[contentType]; !ok {
	// If not set, infer it from the uncompressed body.
	w.Header().Set(contentType, http.DetectContentType(b))
}
Then use the content-type information to decide between `w.startGzip()` and `w.startNoGzip()`.
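To make the suggestion concrete, here is a minimal sketch of such a content-type filter with a hypothetical default whitelist (the names defaultGzipTypes and shouldGzip are illustrative, not part of the library):

```go
package main

import (
	"fmt"
	"strings"
)

// defaultGzipTypes is a hypothetical whitelist of compressible types;
// a real implementation would let users configure this.
var defaultGzipTypes = []string{
	"text/html", "text/css", "text/plain",
	"application/json", "application/javascript",
}

// shouldGzip reports whether a response with the given Content-Type
// header value is worth compressing.
func shouldGzip(contentType string) bool {
	for _, t := range defaultGzipTypes {
		if strings.HasPrefix(contentType, t) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(shouldGzip("application/json; charset=utf-8")) // true
	fmt.Println(shouldGzip("image/png"))                       // false
}
```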
If I'm reading line 166 correctly:
if len(w.buf) >= w.minSize && handleContentType(w.contentTypes, w) && w.Header().Get(contentEncoding) == "" {
Then even if the buffer accumulates to over minSize, if the content-type check fails or Content-Encoding is set, the entire output stream will be buffered before being written out at Close.
The handleContentType and Get(contentEncoding) checks should be the very first tests on the first Write; if they fail, then from then on just flush directly to the real ResponseWriter and don't buffer anymore, since these settings can't change once writing has started.
ContentTypes(types []string) option whitelists by the full content type. In my case, the encoding and boundary don't matter to me. All I care about is the MIME type. I'd rather not have to declare all the encodings for each MIME type.
Is there an interest in accepting a PR which adds something like one of these:
ContentTypesMimeOnly(types []string) option
MimeTypes(types []string) option
This would use the standard method for parsing the mime type.
(@jprobinson, @adammck do you have thoughts on this?)
Allow the caller to specify the minimum response size (in bytes) for serving a gzipped response.
It would make my use of this library much easier if my code can see whether the library will consider a request for possible compression.
Would a pull request renaming acceptsGzip -> AcceptsGzip be welcome?
go version go1.12.9 windows/amd64
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\hmc\AppData\Local\go-build
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=C:\Users\hmc\go
set GOPROXY=
set GORACE=
set GOROOT=C:\go
set GOTMPDIR=
set GOTOOLDIR=C:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=C:\Users\hmc~1\AppData\Local\Temp\go-build408318050=/tmp/go-build -gno-record-gcc-switches
I started an HTTP server with the gzip handler, with the server's WriteTimeout param set to a certain value. Then I made a request whose handling time was longer than the WriteTimeout.
Expected: some logs or errors about the WriteTimeout error.
Actual: nothing. I only saw that the connection was closed and no response was sent.
This issue is similar to net/http: ResponseWriter.Write does not error after WriteTimeout nor is ErrorLog used. The difference is that gw.Close() is called in the gzip handler and it returns an error; maybe there is a way to surface the error or give it to users? In my scenario, gw.Close() returned an internal/poll.TimeoutError, which helped me find my server's issue. So I think surfacing the error from gw.Close() would help users.
In cases when a Write is performed and minSize has not yet been reached (in essence, when the check at [1] is false), the number of written bytes returned by the method is 0. This means that when using io.Copy, an ErrShortWrite is returned and the copy fails (see [2]). Perhaps this behavior should be documented, or Write should pretend len(b) bytes have been written in that case (as they eventually will be)?
[1] https://github.com/NYTimes/gziphandler/blob/master/gzip.go#L107
[2] https://golang.org/src/io/io.go#L400
Users / importers of gziphandler, please note that the github.com/NYTimes org (where this repo resides) will be renamed to all-lowercase github.com/nytimes. This branding change is for readability and to conform with the majority of other GitHub entities and open source projects.
When the organization name changes, users will still be able to download, import, and use any of our NYTimes libraries with the old casing (NYTimes). Therefore, there shouldn’t be any action taken on the user’s side as long as you have your dependencies managed through Go Modules or Dep.
However, once the GitHub name changes, we will also be updating the import paths in the actual code of those libraries. Once that code is committed, we will tag a major version release. This way, users know that this update is a breaking change and that they will have to update their own import paths too.
In Go Modules, you will need to change the import path from NYTimes to lowercase nytimes, and also ensure the import path includes the major version per Semantic Import Versioning rules.
The rename is proposed to happen during the week of March 4, 2019.
Hello there!
https://github.com/NYTimes/gziphandler/blob/master/gzip.go#L115
This causes StatusCode to be 200 on panics.
A defer would close the gzip handler before the http handler, in order to catch the panic.
We had the same issue here : https://github.com/BakedSoftware/go-parameters/pull/6/files
Hi! I notice that this project doesn't have any tagged releases. Would you mind adding some SemVer-compatible release tags? It would really, really help those of us using dep and similar tools.
Can we have the Go package name match casing with the version control URL casing? This subtle difference makes it harder to manage dependencies.
If backwards compatibility is a concern, it would actually be easier to rename the GitHub org, which users would reference just once, compared to having to rename every downstream import.
Yes
Right now there's no way to respond to a HEAD request with the correct gzip headers: the headers aren't added until Write, a writer is initialized inside Write, the gzip headers are only written upon Close, and you cannot have a body in a HEAD response.
This is the same issue discussed in golang/go#14975
At the moment, gziphandler only compresses responses if there is no Content-Encoding header already set in the response:
https://github.com/nytimes/gziphandler/blob/master/gzip.go#L120-L141
This is OK; however, the Content-Encoding specification contains an identity directive [1]:
identity
Indicates the identity function (i.e., no compression or modification).
This token, except if explicitly specified, is always deemed acceptable.
This directive essentially means a no-op encoding and should be treated the same as an empty Content-Encoding in gziphandler, i.e. a candidate for on-the-fly compression.
[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Encoding#Directives
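A minimal sketch of the proposed check (alreadyEncoded is a hypothetical helper, not the library's API):

```go
package main

import "fmt"

// alreadyEncoded reports whether a Content-Encoding value rules out
// on-the-fly compression; "identity" is a no-op encoding and is
// treated the same as an empty header.
func alreadyEncoded(ce string) bool {
	return ce != "" && ce != "identity"
}

func main() {
	fmt.Println(alreadyEncoded(""))         // false
	fmt.Println(alreadyEncoded("identity")) // false
	fmt.Println(alreadyEncoded("br"))       // true
}
```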
I see that the Content-Length header is read when a response passes through the middleware, and that it is later deleted before writing the gzipped response to prevent the EOF. Should a Content-Length header be set after the size of the compressed content is known? What are the reasons for omitting Content-Length on gzipped responses?
I'm wondering if static files from the http.ServeFile method are also supported. Especially:
It would also be 'nice to have' if the http.ServeFile method would check for the presence of xxx.gz and serve it if present and the Accept-Encoding header includes gzip. This is a trick that nginx does, and it greatly accelerates the serving of static CSS, JS, etc. files because they can be compressed just once during the build process. There's the added benefit too that they are smaller files to read, as well as needing fewer bytes on the wire.
thanks
I'd like to use this package with my custom web application wrapper, but I can't match the functionality provided by NewGzipLevelHandler without the acceptsGzip function exposed: https://github.com/NYTimes/gziphandler/blob/fb3533722e14198abe471546c9798fd556531451/gzip.go#L174
How would one use the gzip handler with non-200 responses? The following example seems like it should work, but logs http: multiple response.WriteHeader calls and the response isn't compressed.
package main

import (
	"fmt"
	"net/http"
	"os"

	"github.com/gorilla/mux"
	"github.com/nytimes/gziphandler"
)

func main() {
	r := mux.NewRouter()
	r.HandleFunc("/", homeHandler)
	r.NotFoundHandler = http.HandlerFunc(notFoundHandler)
	http.Handle("/", gziphandler.GzipHandler(r))
	http.ListenAndServe(":"+os.Getenv("PORT"), nil)
}

func homeHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello, World!\n")
}

func notFoundHandler(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusNotFound)
	fmt.Fprintf(w, "404 - Not Found!\n")
}
This might be related to #5, but I'm mentioning it because I thought #16 might have fixed it. This also seems related to gorilla/handlers#83, although that handler just logs the error and still returns the response successfully (albeit uncompressed).
Here's some debug information if this helps:
$GOPATH/src/github.com/nytimes/gziphandler master
❯ git rev-parse HEAD
44668d75e46f05932cf7c1c7a375d0765b324a0b
❯ go version
go version go1.7.1 darwin/amd64
Dear Lazyweb: It would be nice if our API looked less like
gz.NewGzipLevelAndMinSize(1, 1024)(myHandler)
and more like
gz.New(myHandler, gz.CompressionLevel(1), gz.MinSize(1024))
See Functional options for friendly APIs. I'm open to other solutions, too.
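A minimal sketch of what the functional-options pattern could look like here (all names and defaults are illustrative, not a proposed final API):

```go
package main

import "fmt"

// config holds gzip middleware settings; each option mutates it.
type config struct {
	level   int
	minSize int
}

type option func(*config)

// CompressionLevel sets the gzip compression level.
func CompressionLevel(n int) option { return func(c *config) { c.level = n } }

// MinSize sets the minimum response size to compress.
func MinSize(n int) option { return func(c *config) { c.minSize = n } }

// New applies the given options over hypothetical defaults.
func New(opts ...option) config {
	c := config{level: -1, minSize: 1024}
	for _, o := range opts {
		o(&c)
	}
	return c
}

func main() {
	c := New(CompressionLevel(1), MinSize(1024))
	fmt.Println(c.level, c.minSize) // 1 1024
}
```

The appeal of this pattern is that new knobs can be added later without breaking the constructor's signature.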
Currently OPTIONS requests get a "Content-Length: 0" when compression is enabled.
When I have a buffer of a known length and try to write to the gzip handler, I get more bytes written than I originally sent. I think the latest commit is causing that.
b := f.buf[f.off:]
l := len(b) // l is 104870
n, err := w.Write(b) // w is my gziphandler (writer)
// n is 105079
Hi!
We at @wongnai have forked gziphandler internally to add a swappable gzip implementation. In production, we swap compress/gzip with our fork of yasushi-saito/cloudflare-zlib, which results in 43% less CPU used in the middleware.
We haven't open sourced anything in this project yet, as it requires extensive modification to all projects to make it work. I'd like to check with upstream first whether this is something you'd be willing to merge before starting to work on open sourcing it (e.g. un-forking the Go module name).
The changes we made are:
- Moved gzipWriterPools, poolIndex, addLevelPool and their tests into another submodule.
- Introduced a GzipWriter interface:

type GzipWriter interface {
	Close() error
	Flush() error
	Write(p []byte) (int, error)
}

- gzip.Writer is wrapped in a struct that, when closed, also returns the writer to the pool.

As for the forked cloudflare-zlib and its integration with gziphandler, we may open source it later after the PR made here is merged. We removed its built-in zlib code and simply link against the installed library.
I would like to quickly flush from one of my handlers, but the response is too small and it ignores the flush.
As a workaround, I set minSize to 0. What can I do to force flushing?
Is there any particular reason for not exporting the acceptsGzip function?
I'm running a static file handler that serves pre-gzipped files, and it would be really useful to have access to the logic of this function for that use case.
More of a suggestion than an issue, apologies.
Great package btw! 👍
Hi,
I'm trying to make the HTTP/2 push mechanism work with gzip compression.
Do you know what needs to be done to make those two mechanisms work together?
Thanks
There are several open issues and no update to the source for years.
What is the state of the package?
Is there a maintained alternative?
The problem comes from the size limitation.
If the template package is used, it can call Write multiple times with variable lengths.
In any case, the response writer must be the same for the whole response within the same request, so the package as-is is not good.
I made some modifications to prevent using different response writers on successive calls.
But the size limitation becomes pretty pointless with template or other handlers calling the Write function multiple times.
The Go module currently depends on an unreleased version of Go. Can we bump the required Go version down to a released version?
When wrapping a handler that supports HTTP range requests (e.g. http.FileServer), gziphandler relays them as-is, thus violating the HTTP standard and breaking clients.
That means that, currently, gziphandler compresses ranges returned by the wrapped handler instead of returning ranges of the compressed output (of the complete wrapped content).
The HTTP standard basically says that Content-Length is the size of the Content-Encoding output, and that range requests specify ranges into the Content-Encoding encoded output as well.
Expected behavior: either (a) gziphandler filters out the Accept-Ranges: bytes headers in wrapped handler responses (and any Range: headers in passed requests), or (b) it handles range requests on its own and doesn't pass them down to the wrapped handler.
Note that implementing (b) would be kind of complicated, e.g. a range that is smaller than the configured minSize would have to trigger a compression up to the range end.
How to reproduce:
Create a handler like this:
gz_handler, err := gziphandler.NewGzipLevelAndMinSize(gzip.BestCompression, 100)
if err != nil {
	log.Fatal(err)
}
http.Handle("/static/", http.StripPrefix("/static/",
	gz_handler(http.FileServer(http.Dir("static")))))
Request a full compressed page:
$ curl --header 'Accept-Encoding: gzip' -v -o f http://localhost:8080/static/page.css
[..]
< HTTP/1.1 200 OK
< Accept-Ranges: bytes
< Content-Encoding: gzip
< Content-Type: text/css; charset=utf-8
< Vary: Accept-Encoding
< Content-Length: 373
[..]
$ file f
f: gzip compressed data, max compression, original size 804
Note how the Content-Length is the size of the compressed content, as expected.
Getting a range:
$ curl --header 'Accept-Encoding: gzip' --header 'Range: bytes=0-300' -v -o a http://localhost:8080/static/page.css
[..]
< HTTP/1.1 206 Partial Content
< Accept-Ranges: bytes
< Content-Encoding: gzip
< Content-Range: bytes 0-300/804
< Content-Type: text/css; charset=utf-8
< Content-Length: 201
[..]
Note the following issues: the response keeps Content-Encoding: gzip, yet the Content-Range total (804) refers to the uncompressed size while the 201-byte body is the compressed form of the requested uncompressed range. And the range isn't a prefix of the previous result:
$ cmp f a
f a differ: byte 11, line 1
Another range request:
$ curl --header 'Accept-Encoding: gzip' --header 'Range: bytes=301-372' -v -o b http://localhost:8080/static/page.css
[..]
< HTTP/1.1 206 Partial Content
< Accept-Ranges: bytes
< Content-Length: 72
< Content-Range: bytes 301-372/804
< Content-Type: text/css; charset=utf-8
[... no Content-Encoding header ...]
Similar issues as before, and this range isn't gzip-compressed at all.
Could the API feature a method for wrapping raw func(w http.ResponseWriter, r *http.Request)s? This could reduce boilerplate.
Hey, if I take the example in the readme and try to use it with a static directory server, I get a net::ERR_CONTENT_LENGTH_MISMATCH error in the browser. Here's the example with the test code I've added:
package main

import (
	"io"
	"net/http"
	"os"

	"github.com/nytimes/gziphandler"
)

func main() {
	withoutGz := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain")
		io.WriteString(w, "Hello, World")
	})
	withGz := gziphandler.GzipHandler(withoutGz)

	dirWithoutGz := http.FileServer(http.Dir("./"))
	dirWithGz := gziphandler.GzipHandler(dirWithoutGz)

	http.Handle("/", withGz)
	http.Handle("/test.jpg", dirWithGz)
	http.ListenAndServe(":"+os.Getenv("PORT"), nil)
}
If you use the dirWithoutGz handler then the image returns just fine, but the dirWithGz handler errors. Am I misunderstanding something?
Here's some debug information if this helps:
$GOPATH/src/github.com/nytimes/gziphandler master
❯ git rev-parse HEAD
44668d75e46f05932cf7c1c7a375d0765b324a0b
❯ go version
go version go1.7.1 darwin/amd64
The current method of inferring the mime type of the uncompressed data is broken when multiple calls are made to Write and the first block of data is small. The current method only considers the first call to Write and not subsequent calls. As the data is being buffered for the minSize test, it makes sense to detect the mime type across the buffer rather than the first fragment.
I noticed this in my fork which has diverged significantly, so my fix and test case won't apply cleanly, but they should provide someone a solid basis for fixing it in this repository, if someone thinks it worthwhile.
Test case was added in: tmthrgd/gziphandler@9855883
Fix was added in: tmthrgd/gziphandler@4324668
http.DetectContentType considers at most 512 bytes which is also the default minSize. So detecting the mime type over the minSize buffer provides much nicer behaviour here.
A test case that applies to this repository is below:
func TestInferContentType(t *testing.T) {
	wrapper, _ := NewGzipLevelAndMinSize(gzip.DefaultCompression, len("<!doctype html"))
	handler := wrapper(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "<!doc")
		io.WriteString(w, "type html>")
	}))

	req1, _ := http.NewRequest("GET", "/whatever", nil)
	req1.Header.Add("Accept-Encoding", "gzip")
	resp1 := httptest.NewRecorder()
	handler.ServeHTTP(resp1, req1)
	res1 := resp1.Result()

	const expect = "text/html; charset=utf-8"
	if ct := res1.Header.Get("Content-Type"); ct != expect {
		t.Error("Inferring Content-Type failed for buffered response")
		t.Logf("Expected: %s", expect)
		t.Logf("Got: %s", ct)
	}
}
Doesn't this still write the footer because of the gzip writer .Close()?
Line 202 in 2f8bb1d
len(w.buf) is always zero at this point because of the previous line: w.buf = nil
If the inner HTTP handler sets a Vary: Accept-Encoding header, then as the gzip middleware will always add the same header, the output will have two identical headers.
Here is a failing test case:
func TestEnsureVaryHeaderNoDuplicate(t *testing.T) {
	handler := GzipHandler(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Add(vary, acceptEncoding)
		w.Write([]byte("test"))
		w.(io.Closer).Close()
	}))

	req := httptest.NewRequest("GET", "/", nil)
	req.Header.Set(acceptEncoding, "gzip")
	w := httptest.NewRecorder()
	handler.ServeHTTP(w, req)

	assert.Equal(t, w.Header()[vary], []string{acceptEncoding})
}
I don't think the HTTP spec explicitly disallows the same key/value appearing twice in the headers, but it feels tidier to only have one instance of it.
Validation with https://middleware.vet#github.com/NYTimes/gziphandler shows that GzipResponseWriter is not implementing io.ReaderFrom for HTTP/1.1.
Is it possible to implement this only for HTTP/1.1 and not for HTTP/2, while still reusing response writers?
I am using go version go1.8 linux/amd64 and have been trying to send some data through gzip. The problem is that even if I send an error code using w.WriteHeader, this library simply ignores it and sends HTTP 200 no matter what. Is there any way you could fix this?
Hey there,
You wrote a fantastic library and I've been using it with pleasure so far.
I have run into an issue when using the GzipHandler and gorilla/websocket (and I saw reports of similar failures for code.google.com/p/go.net/websocket):
I run into this:
websocket: response does not implement http.Hijacker
It seems the response interface of your library does not implement http.Hijacker, which is needed to use websockets.
The fix appears relatively easy and several other libraries have implemented a fix. For example:
https://github.com/gin-gonic/gin/pull/105/files
Other useful thread:
gin-gonic/gin#51
I probably can work around my problem by not using the GzipHandler in bulk for all my routes but I thought you'd be interested in the problem and possibly copy the fixes above.