klauspost / compress

Optimized Go Compression Packages

License: Other

Go 75.52% Assembly 24.21% Batchfile 0.04% Shell 0.23%
compression decompression deflate go golang gzip snappy zip zstandard zstd


compress's Issues

multiple int overflows on 32 bit arches in s2_test.go

While running tests on Fedora rawhide (32), I get the following test failures:

./s2_test.go:39:4: constant 4294967285 overflows int
./s2_test.go:39:22: constant 4294967295 overflows int
./s2_test.go:40:47: constant 4294967290 overflows int
./s2_test.go:40:65: constant 4294967295 overflows int
./s2_test.go:40:83: constant 4294967295 overflows int
./s2_test.go:41:23: constant 4294967286 overflows int
./s2_test.go:42:23: constant 4294967287 overflows int
./s2_test.go:43:23: constant 4294967288 overflows int
./s2_test.go:44:23: constant 4294967289 overflows int
./s2_test.go:45:23: constant 4294967290 overflows int
./s2_test.go:45:23: too many errors

It looks similar to #133.

can't compile compress/fse on 32-bit architectures

Steps to reproduce:

Create a Go program:

package main

import _ "github.com/klauspost/compress/fse"

func main() {
}

Then build it for a 32-bit architecture:

$ GOARCH=386 go build main.go 
go: finding github.com/klauspost/compress/fse latest
go: downloading github.com/klauspost/compress v1.7.0
go: extracting github.com/klauspost/compress v1.7.0
# github.com/klauspost/compress/fse
../../pkg/mod/github.com/klauspost/[email protected]/fse/compress.go:22:13: constant 2147483648 overflows int
../../pkg/mod/github.com/klauspost/[email protected]/fse/fse.go:130:25: constant 2147483648 overflows int
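The failing constant is 1<<31 (2147483648), which does not fit in a 32-bit int. A minimal illustration of the overflow and the usual fix (giving the constant an explicit unsigned or 64-bit type); the variable names here are illustrative, not from the fse sources:

```go
package main

import "fmt"

// On a 32-bit GOARCH the untyped constant 1<<31 does not fit in int, so
// code like `var x int = 1 << 31` fails to compile with
// "constant 2147483648 overflows int". Forcing an unsigned or 64-bit type
// on the constant sidesteps the overflow on every architecture.
const c = 1 << 31 // untyped: harmless until converted to a too-small type

var (
	asUint32 uint32 = c // 2147483648 <= 2^32-1: compiles on 32-bit too
	asInt64  int64  = c // always fits in 64 bits regardless of GOARCH
)

func main() {
	fmt.Println(asUint32, asInt64)
}
```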

Building for arm: constant 2147483648 overflows int

When building a project that uses this as a dependency, I get this error:

$ CGO_ENABLED=1 CC=arm-linux-musleabihf-gcc GOOS=linux GOARCH=arm go build
# github.com/klauspost/compress/fse
../../../../klauspost/compress/fse/compress.go:22:13: constant 2147483648 overflows int
../../../../klauspost/compress/fse/fse.go:130:25: constant 2147483648 overflows int

Similarly in zstd/frameenc.go:

../../../../klauspost/compress/zstd/frameenc.go:108:11: constant 4294967295 overflows int

how to create an empty zlib reader?

I wrote a memory object pool to reuse zlib reader objects, but I cannot find an API to create an empty reader. If I pass an empty io.Reader to zlib.NewReader, it returns unexpected EOF. Right now I can only work around this with lazy initialization, or by first feeding the reader the output of a trivial zlib compression.

timeout in zstd/encoder_test.go:100 on ARM Cortex-A9

The whole zstd test suite takes over 120s on an ARM Cortex-A9:

ok  	github.com/klauspost/compress/zstd	122.090s

I experimented with setting different timeout values in zstd/encoder_test.go:100 and found that 35 seconds is enough for this particular hardware, but you might want to increase it further to have some headroom for slower hardware.

zstd: Use single buffer for encodes, but copy data

Using a single buffer instead of holding on to the previous one and switching between them is a simplification that also appears to be considerably faster.

This also makes longer windows much more feasible.

before/after:

file      out   level  insize       outsize      millis  mb/s
enwik9    zskp  1      1000000000   348027537    7499    127.16  (before)
enwik9    zskp  1      1000000000   343933099    5897    161.72  (after)

10gb.tar  zskp  1      10065157632  5001038195   58193   164.95  (before)
10gb.tar  zskp  1      10065157632  4888194207   45787   209.64  (after)

SIGILL: illegal instruction

Environment:

$ uname -a
Linux test01 2.6.32-504.16.2.el6.x86_64 #1 SMP Wed Apr 22 06:48:29 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

go1.4.2

SIGILL: illegal instruction
PC=0x5fd7f2

goroutine 24 [running]:
github.com/klauspost/crc32.ieeeSSE42(0xffffffff, 0xc208123000, 0x1000, 0x1000, 0xc208187a58, 0x0, 0x0, 0x1000, 0xc208123000, 0x1000, ...)
    /home/darren/testproj/Godeps/_workspace/src/github.com/klauspost/crc32/crc32_amd64.s:122 +0x52 fp=0xc2081c2a78 sp=0xc2081c2a70
github.com/klauspost/crc32.updateIEEE(0x0, 0xc208123000, 0x1000, 0x1000, 0x7000)
    /home/darren/testproj/Godeps/_workspace/src/github.com/klauspost/crc32/crc32_amd64x.go:39 +0x99 fp=0xc2081c2ad8 sp=0xc2081c2a78
github.com/klauspost/crc32.Update(0xc200000000, 0xc20802c000, 0xc208123000, 0x1000, 0x1000, 0x1000)
    /home/darren/testproj/Godeps/_workspace/src/github.com/klauspost/crc32/crc32.go:115 +0x85 fp=0xc2081c2b10 sp=0xc2081c2ad8
github.com/klauspost/crc32.(*digest).Write(0xc20813e000, 0xc208123000, 0x1000, 0x1000, 0x1000, 0x0, 0x0)
    /home/darren/testproj/Godeps/_workspace/src/github.com/klauspost/crc32/crc32.go:121 +0x62 fp=0xc2081c2b48 sp=0xc2081c2b10
github.com/klauspost/compress/gzip.(*Reader).Read(0xc2081a8000, 0xc208123000, 0x1000, 0x1000, 0x1000, 0x0, 0x0)
    /home/darren/testproj/Godeps/_workspace/src/github.com/klauspost/compress/gzip/gunzip.go:251 +0x191 fp=0xc2081c2c18 sp=0xc2081c2b48
bufio.(*Scanner).Scan(0xc2081a4080, 0xc208152008)
    /usr/local/go/src/bufio/scan.go:180 +0x688 fp=0xc2081c2d90 sp=0xc2081c2c18
main.parseLogFile(0xc2080f0210, 0x30, 0xc2080e82c0, 0x0, 0x0)
    /home/darren/testproj/main.go:162 +0x426 fp=0xc2081c2f98 sp=0xc2081c2d90
main.func·006(0xc2080f0210, 0x30, 0xc2080e82c0)
    /home/darren/testproj/main.go:298 +0x5a fp=0xc2081c2fc8 sp=0xc2081c2f98
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:2232 +0x1 fp=0xc2081c2fd0 sp=0xc2081c2fc8
created by main.ProcessLogs
    /home/darren/testproj/main.go:299 +0x37d
...

goroutine 83 [runnable]:
main.func·003()
    /home/darren/testproj/main.go:153
created by main.parseLogFile
    /home/darren/testproj/main.go:159 +0x3eb

rax     0x1000
rbx     0xffffffff
rcx     0xfc0
rdx     0xc208123000
rdi     0x1000
rsi     0xc208123040
rbp     0xc20802c000
rsp     0xc2081c2a70
r8      0x18
r9      0x8000
r10     0x18
r11     0xc2081c4000
r12     0xc2081cbfe8
r13     0x1e
r14     0x0
r15     0x3
rip     0x5fd7f2
rflags  0x10202
cs      0x33
fs      0x0
gs      0x0

Using KP Compress outside of Golang contexts

Hi Klaus – Given your recent performance update, is Compress competitive with its C-based counterparts, like zlib and libdeflate? It would be interesting to see benchmarks against them instead of the standard Go library, since zlib is the standard benchmark and libdeflate is a faster, modern reimplementation of zlib. If Compress beats them, how would we use it, for example with nginx?

zstd write: panic: runtime error: slice bounds out of range

 panic: runtime error: slice bounds out of range
 
 goroutine 1830 [running]:
 github.com/klauspost/compress/zstd.(*Encoder).Write(0xc0277fdc00, 0xc03051c000, 0x9dc4, 0xf6eb, 0xc000034078, 0xc27ba0, 0xe8c7c8)
 	/src/github.com/klauspost/compress/zstd/encoder.go:128 +0x4d4

decompression for streaming data in zip

Hi,

May I ask if the zlib package here supports decompressing streaming zlib data?
Imagine a stream of zlib data arriving over a TCP socket; I would like to use the zlib package to decompress the data incrementally as each read comes off the socket. Pseudocode is as below.

for {
   ...
   n, err := socketconn.Read(data)
   ...
   zlibReader.Read(...) // incrementally decompress as new data flows in
}

Thank you !

Improve small payload performance

Noticed that after pulling master, performance of gzip compression is now lower than native Go implementation.

package lib

import (
    "testing"
    "bytes"
    "compress/gzip"
    ogzip "github.com/klauspost/compress/gzip"
    "fmt"
)

var bidReq = []byte(`{"id":"50215d10a41d474f77591bff601f6ade","imp":[{"id":"86df3bc6-7bd4-44d9-64e2-584a69790229","native":{"request":"{\"ver\":\"1.0\",\"plcmtcnt\":1,\"assets\":[{\"id\":1,\"data\":{\"type\":12}},{\"id\":2,\"required\":1,\"title\":{\"len\":50}},{\"id\":3,\"required\":1,\"img\":{\"type\":1,\"w\":80,\"h\":80}},{\"id\":4,\"required\":1,\"img\":{\"type\":3,\"w\":1200,\"h\":627}},{\"id\":5,\"data\":{\"type\":3}},{\"id\":6,\"required\":1,\"data\":{\"type\":2,\"len\":100}}]}","ver":"1.0"},"tagid":"1","bidfloor":0.6,"bidfloorcur":"USD"}],"site":{"id":"1012864","domain":"www.abc.com","cat":["IAB3"],"mobile":1,"keywords":"apps,games,discovery,recommendation"},"device":{"dnt":1,"ua":"Mozilla/5.0 (Linux; U; Android 4.2.2; km-kh; SHV-E120S Build/JZO54K) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30","ip":"175.100.59.170","geo":{"lat":11.5625,"lon":104.916,"country":"KHM","region":"12","city":"Phnom Penh","type":2},"carrier":"Viettel (cambodia) Pte., Ltd.","language":"km","model":"android","os":"Android","osv":"4.2.2","connectiontype":2,"devicetype":1},"user":{"id":"325a32d3-1dba-5ffc-82f2-1df428520728"},"at":2,"tmax":100,"wseat":["74","17","30","142","167","177","153","7","90","140","148","164","104","71","19","187","139","63","88","160","222","205","46"],"cur":["USD"]}`)

func BenchmarkNativeGzip(b *testing.B) {
    fmt.Println("BenchmarkNativeGzip")
    for i := 0; i < b.N; i++ {
        b := bytes.NewBuffer(nil)
        w := gzip.NewWriter(b)
        w.Write(bidReq)
        w.Close()
    }
}

func BenchmarkKlauspostGzip(b *testing.B) {
    fmt.Println("BenchmarkKlauspostGzip")
    for i := 0; i < b.N; i++ {
        b := bytes.NewBuffer(nil)
        w := ogzip.NewWriter(b)
        w.Write(bidReq)
        w.Close()
    }
}
/usr/local/Cellar/go/1.7.1/libexec/bin/go test -v github.com/kostyantyn/compressiontest/lib -bench "^BenchmarkNativeGzip|BenchmarkKlauspostGzip$" -run ^$
BenchmarkNativeGzip
    3000        387628 ns/op
BenchmarkKlauspostGzip
    3000        429190 ns/op
PASS
ok      github.com/pubnative/ad_server/lib  2.556s

Int overflow on 32 bits arches

Golang 1.12.6 on i686 and armv7:

Testing    in: /builddir/build/BUILD/compress-1.7.0/_build/src
         PATH: /builddir/build/BUILD/compress-1.7.0/_build/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/sbin
       GOPATH: /builddir/build/BUILD/compress-1.7.0/_build:/usr/share/gocode
  GO111MODULE: off
      command: go test -buildmode pie -compiler gc -ldflags "-X github.com/klauspost/compress/version=1.7.0 -extldflags '-Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld '"
      testing: github.com/klauspost/compress
github.com/klauspost/compress
testing: warning: no tests to run
PASS
ok  	github.com/klauspost/compress	0.004s
github.com/klauspost/compress/flate
PASS
ok  	github.com/klauspost/compress/flate	21.662s
github.com/klauspost/compress/fse
FAIL	github.com/klauspost/compress/fse [build failed]
BUILDSTDERR: # github.com/klauspost/compress/fse [github.com/klauspost/compress/fse.test]
BUILDSTDERR: ./compress.go:22:13: constant 2147483648 overflows int
BUILDSTDERR: ./fse.go:130:25: constant 2147483648 overflows int

Does Go 1.7 now match this library's speed?

Hi Klaus – Go 1.7 was released last week and the release notes say that compress/flate compression speed at the default compression level has doubled.

Is this because your code was incorporated into 1.7? Is your library still faster than 1.7?

Relatedly, I only ever see DefaultCompression encoded as -1. When and how does it become the level 6 that it is supposed to be?

Thanks,

Joe

brotli support

It would be nice to have a wrapper or a native Go implementation.

Get rid of the ebook

Hey,

Would you mind getting rid of the ebook? (testdata/Mark.Twain-Tom.Sawyer.txt)

I am packaging compress for Debian and I don't think this book is free software ;). Maybe replace it with gibberish?

zlib decode is not "All heap memory allocations eliminated"

There is only one allocation left when I reuse the reader object:

tracealloc(0xc4205fe030, 0x10, adler32.digest)
goroutine 5 [running]:
runtime.mallocgc(0x10, 0x1162cc0, 0x1, 0x0)
	/usr/local/go/src/runtime/malloc.go:783 +0x4d3 fp=0xc4205ebc38 sp=0xc4205ebb90 pc=0x100fa63
runtime.newobject(0x1162cc0, 0x126c6a0)
	/usr/local/go/src/runtime/malloc.go:840 +0x38 fp=0xc4205ebc68 sp=0xc4205ebc38 pc=0x100ffc8
hash/adler32.New(...)
	/usr/local/go/src/hash/adler32/adler32.go:38
github.com/bronze1man/kmg/vendor/github.com/klauspost/compress/zlib.(*reader).Reset(0xc420074190, 0x126c6a0, 0xc420074140, 0x0, 0x0, 0x0, 0xfd8dd378fc5066be, 0xc4200602f8)
	/xxx/src/github.com/bronze1man/kmg/vendor/github.com/klauspost/compress/zlib/reader.go:176 +0x3df fp=0xc4205ebda0 sp=0xc4205ebc68 pc=0x113bc6f

zstd: Make new frames start concurrently

Currently the start of a new frame requires the previous one to finish decoding.

Since the new frame isn't dependent on the previous one, it could be advantageous to start decoding it right away.

zstd: Don't copy bytes when encoding

Since encoders operate on a single slice with history+input, we can avoid copying input by writing directly to the input slice of the encoder.

This means that encoderState.filling will be a slice of encoder.hist. We might need to make other changes to ensure that the input remains available even if the input is shifted down, so async block encoding still has it available.

Block encodes could just directly pass in the input, since we will not need any additional history.
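The filling-as-sub-slice idea can be sketched as follows; window, hist, filling, and add are illustrative names for this note, not the encoder's actual fields:

```go
package main

import "fmt"

// Illustrative sketch: one backing array holds history followed by the
// block currently being assembled. Because filling is a sub-slice of hist,
// input bytes written into filling land directly in the encoder's window
// with no second copy.
type window struct {
	hist    []byte // history + current input, one allocation
	filling []byte // hist[start:], the block currently being filled
}

func (w *window) add(p []byte) {
	start := len(w.hist)
	w.hist = append(w.hist, p...)
	w.filling = w.hist[start:] // same backing array, no copy
}

func main() {
	var w window
	w.add([]byte("abc"))
	fmt.Println(string(w.filling), len(w.hist)) // abc 3
	fmt.Println(&w.filling[0] == &w.hist[0])    // true: shared storage
}
```

The issue's caveat applies here too: if hist is shifted down or reallocated while an async block encode still references filling, the input has to be kept alive or copied first.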

zstd: Decoder.Reset deadlock

Starting with commit 50006fb, my simple zstd test program started deadlocking. The bytes.Buffer I pass to Decoder.Reset is less than 1MB, which is what that commit optimizes for.

Thanks for your efforts on a pure-go zstd implementation!

fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan send]:
github.com/klauspost/compress/zstd.(*Decoder).Reset(0xc0000f8000, 0x114cc60, 0xc00066c030, 0x7dd3b, 0x10c52)
	/Users/aaronb/go/src/github.com/klauspost/compress/zstd/decoder.go:171 +0x1f6
main.zstdCompress(0xc000128000, 0x10c52, 0x1fe00, 0xc00009bbb0, 0x105adfc, 0x114cc80)
	/Users/aaronb/zstd.go:168 +0x1c7
main.compress(0xc000128000, 0x10c52, 0x1fe00, 0xc000016d80, 0x1, 0x1, 0x0, 0x0, 0x1)
	/Users/aaronb/zstd.go:102 +0x6f
main.main()
	/Users/aaronb/zstd.go:57 +0x23f

goroutine 4 [chan receive]:
github.com/klauspost/compress/zstd.(*blockDec).startDecoder(0xc0000922c0)
	/Users/aaronb/go/src/github.com/klauspost/compress/zstd/blockdec.go:188 +0x120
created by github.com/klauspost/compress/zstd.newBlockDec
	/Users/aaronb/go/src/github.com/klauspost/compress/zstd/blockdec.go:106 +0x155

goroutine 7 [chan receive]:
github.com/klauspost/compress/zstd.(*Decoder).startStreamDecoder(0xc0000f8000, 0xc00005a360)
	/Users/aaronb/go/src/github.com/klauspost/compress/zstd/decoder.go:379 +0x272
created by github.com/klauspost/compress/zstd.(*Decoder).Reset
	/Users/aaronb/go/src/github.com/klauspost/compress/zstd/decoder.go:154 +0x422
exit status 2

Here is a snippet of my simple test program. zstdCompress is called in a loop on different inputs. The deadlock is occurring on the line with zstdReader.Reset(buf).

type compressedResult struct {
	size           int
	compressTime   time.Duration
	decompressTime time.Duration
}

var zstdWriter, _ = zstd.NewWriter(nil, zstd.WithEncoderConcurrency(1))
var zstdReader, _ = zstd.NewReader(nil, zstd.WithDecoderConcurrency(1))

func zstdCompress(msg []byte) compressedResult {
	var r compressedResult
	buf := &bytes.Buffer{}
	t1 := time.Now()
	zstdWriter.Reset(buf)
	if _, err := zstdWriter.Write(msg); err != nil {
		panic(err)
	}
	if err := zstdWriter.Close(); err != nil {
		panic(err)
	}
	r.compressTime = time.Since(t1)
	r.size = len(buf.Bytes())

	t2 := time.Now()
	if err := zstdReader.Reset(buf); err != nil {
		panic(err)
	}
	out, err := ioutil.ReadAll(zstdReader)
	if err != nil {
		panic(err)
	}
	r.decompressTime = time.Since(t2)

	if !bytes.Equal(msg, out) {
		fmt.Println("bad decompress")
	}

	return r
}

Cannot be used with Modules enabled?

First, I'd like to thank you very much for your work!

I'm unable to use this library when Go Modules are enabled (using Go 1.12):

$ export GOPATH=$(mktemp -d)
$ export GO111MODULE=on

$ cat main.go
package main

import (
	. "github.com/klauspost/compress/zstd"
)

func main() {
}

$ go mod init github.com/fd0/zstdtest

$ go get github.com/klauspost/compress
go: finding github.com/klauspost/compress v1.7.0
go: downloading github.com/klauspost/compress v1.7.0
go: extracting github.com/klauspost/compress v1.7.0

$ go build
go: finding github.com/cespare/xxhash v1.1.0
go: downloading github.com/cespare/xxhash v1.1.0
go: extracting github.com/cespare/xxhash v1.1.0
go: finding github.com/OneOfOne/xxhash v1.2.2
go: finding github.com/spaolacci/murmur3 v0.0.0-20180118202830-f09979ecbc72
# github.com/klauspost/compress/zstd
/tmp/tmp.66p2ytn1dg/pkg/mod/github.com/klauspost/[email protected]/zstd/enc_fast.go:32:15: undefined: xxhash.Digest

As far as I can see the reason is that while your package hasn't adopted Go Modules yet, the xxhash package has and is already at v2.0.0 and the zstd package depends on functionality not available in versions < v2.0.0.

So the following happens:

  • In non-Go-Module mode (GOPATH-mode), the master branch of the xxhash package is used, which is at v2.0.0, so the code compiles
  • In Module mode, my program gets compress at version v1.7.0. The toolchain detects that it depends on github.com/cespare/xxhash. That library has opted in to Go Modules, under which the import path github.com/cespare/xxhash is only valid for versions v0.x.x and v1.x.x. So the toolchain selects v1.1.0. But the API used by the zstd package is only available in v2.0.0, and the build fails.

The only solution (as far as I can see) is to opt into Go Modules:

  • In Module mode, the toolchain sees the requirement for v2.0.0 of xxhash, gets it and the build just works
  • In non-Module mode, the older versions of Go (starting at 1.9.7 and 1.10.3) have been patched to support dropping the /v2 at the end of the import path, so the toolchain will just get the master branch of the xxhash package and it also works. I've verified this using Go 1.12 and Go 1.9.7. Building with older versions (< 1.9.7 or < 1.10.3) fails though, although this could maybe be fixed by the author of the xxhash package.

Please let me know if you'd like me to submit this as a pull request :)

snappy/decode.go length can overflow

snappy/decode.go has this code:

x = uint(src[s-4]) | uint(src[s-3])<<8 | uint(src[s-2])<<16 | uint(src[s-1])<<24
}
// length always > 0
length = int(x + 1)

That comment isn't always true when int is 32 bits, so the subsequent "src[s:s+length]" can panic.
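A guard before the conversion avoids the panic; decodedLen below is an illustrative reconstruction of the decode step, not the package's actual code:

```go
package main

import "fmt"

const maxInt = int(^uint(0) >> 1) // platform-dependent: 2^31-1 on 32-bit

// decodedLen mirrors the decode path: x is the varint-decoded length minus
// one. The guard rejects values where x+1 would not fit in int, instead of
// letting src[s:s+length] panic later with a wrapped-around length.
func decodedLen(x uint) (int, bool) {
	if x > uint(maxInt-1) { // x+1 must fit in int
		return 0, false
	}
	return int(x + 1), true
}

func main() {
	n, ok := decodedLen(9)
	fmt.Println(n, ok) // 10 true
	_, ok = decodedLen(^uint(0)) // maximal value is always rejected
	fmt.Println(ok) // false
}
```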

panic on Writer.Close, runtime error: index out of range

Unfortunately I don't have a test to reproduce this, but over hundreds of thousands of calls we started seeing a few of these panics:

runtime error: index out of range
goroutine 148112966 [running]:
net/http.func·011()
        /usr/lib/go/src/net/http/server.go:1130 +0xbb
github.com/klauspost/compress/flate.(*compressor).deflateNoSkip(0xc20e431600)
        _vendor/src/github.com/klauspost/compress/flate/deflate.go:594 +0xc68
github.com/klauspost/compress/flate.(*compressor).close(0xc20e431600, 0x0, 0x0)
        _vendor/src/github.com/klauspost/compress/flate/deflate.go:773 +0x49
github.com/klauspost/compress/flate.(*Writer).Close(0xc20e431600, 0x0, 0x0)
        _vendor/src/github.com/klauspost/compress/flate/deflate.go:854 

We never saw this before I updated our snapshot on Nov 2nd. The previous update was Sep 8th.

flate: incorrect encoding for level==2 and Flush calls

This program works for compress/flate but not for "github.com/klauspost/compress/flate"

package main

import (
    "bytes"
    "io/ioutil"
    "log"

    "github.com/klauspost/compress/flate"
)

func main() {
    buf := new(bytes.Buffer)

    w, err := flate.NewWriter(buf, 2)
    if err != nil {
        log.Fatal(err)
    }
    defer w.Close()

    abc := make([]byte, 128)
    for i := range abc {
        abc[i] = byte(i)
    }

    bs := [][]byte{
        bytes.Repeat(abc, 65536/len(abc)),
        abc,
    }
    for _, b := range bs {
        w.Write(b)
        w.Flush()
    }
    w.Close()

    r := flate.NewReader(buf)
    defer r.Close()
    got, err := ioutil.ReadAll(r)
    if err != nil {
        log.Fatal(err)
    }

    want := bytes.Join(bs, nil)
    if bytes.Equal(got, want) {
        return
    }
    if len(got) != len(want) {
        log.Fatalf("length: got %d, want %d", len(got), len(want))
    }
    for i, g := range got {
        if w := want[i]; g != w {
            log.Fatalf("byte #%d: got %#02x, want %#02x", i, g, w)
        }
    }
}

Compression rate is different at the same level old vs new

Hi, I ran some local benchmarks with different types of data and found that at the same compression level, the compression ratio differs between the old library and this one. In particular, where the old library reached a given ratio at level 2, this one needs level 5. That makes this library effectively slower than the old one when targeting the same compression ratio. Is this a known issue, or am I testing it wrong?

Integrate golang/go#11030

Hi there,

I just found your project and also skimmed over pgzip. Is zlib support planned, or is it just not worth it? After checking zlib's source code, it is just 150 lines of extensive use of compress/flate and hash/adler32.

P.S.: Thanks for your work! Having golang/go#11030 resolved in an external library compatible with 1.4 is awesome!

Missing BSD license

It looks like some of the files are licensed under BSD:

// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

BSD says the following:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

You are not respecting this condition, as there is no BSD license in this repository.

Depending on your intentions, you may want to do one of the following:

  • Relicense klauspost/compress under BSD
  • Include the full license in all file headers
  • Update the headers to mention a LICENSE.bsd file instead, and include LICENSE.bsd at the root of the repository.

can't go get this package

Hi,
after I reinstalled Revel, I got this error running:
go get github.com/revel/revel

cd e:\go\datafirst\src\github.com\klauspost\compress; git pull --ff-only

fatal: Not a git repository (or any of the parent directories): .git
package github.com/klauspost/compress/gzip: exit status 128
package github.com/klauspost/compress/zlib: cannot find package "github.com/klauspost/compress/zlib" in any of:
E:\go1.6\src\github.com\klauspost\compress\zlib (from $GOROOT)
e:\go\datafirst\src\github.com\klauspost\compress\zlib (from $GOPATH)
e:\go\gopath\src\github.com\klauspost\compress\zlib
Can you check this for me?
Thanks.

zstd: Custom Dictionary Compression Support

First off, thanks for writing a pure Go implementation! My team has wanted to use zstd in our project for a long time now, but have been trying to avoid having any c-Go dependencies.

We have one use-case in particular that would really benefit from the ability to train and use custom dictionaries on the fly.

Is that feature on your roadmap anytime soon? and if not, how challenging do you think it would be for me to try upstream it? I'm happy to contribute some engineering work.

Cheers,
Richie

snappy comparison with upstream needs updating

You are obviously free to respond however you want, but FYI the upstream github.com/golang/snappy encoder now implements an asm version of what you call matchLenSSE4, and it (call this "new") now compares favorably with the github.com/klauspost/compress/snappy version (call this "old"):

benchmark                     old MB/s     new MB/s     speedup
BenchmarkWordsEncode1e1-8     153.77       673.38       4.38x
BenchmarkWordsEncode1e2-8     217.81       428.78       1.97x
BenchmarkWordsEncode1e3-8     282.31       446.89       1.58x
BenchmarkWordsEncode1e4-8     225.73       315.17       1.40x
BenchmarkWordsEncode1e5-8     158.92       267.72       1.68x
BenchmarkWordsEncode1e6-8     206.50       311.30       1.51x
BenchmarkRandomEncode-8       4055.50      14507.66     3.58x
Benchmark_ZFlat0-8            481.82       791.69       1.64x
Benchmark_ZFlat1-8            190.36       434.39       2.28x
Benchmark_ZFlat2-8            6436.37      16301.77     2.53x
Benchmark_ZFlat3-8            368.55       632.13       1.72x
Benchmark_ZFlat4-8            3257.82      7990.39      2.45x
Benchmark_ZFlat5-8            474.40       764.96       1.61x
Benchmark_ZFlat6-8            183.83       280.09       1.52x
Benchmark_ZFlat7-8            170.28       262.54       1.54x
Benchmark_ZFlat8-8            190.70       298.19       1.56x
Benchmark_ZFlat9-8            158.43       247.14       1.56x
Benchmark_ZFlat10-8           581.40       1028.24      1.77x
Benchmark_ZFlat11-8           310.57       408.89       1.32x

For the record, here's the -tags=noasm comparison. The numbers are worse for small inputs but better for large inputs, which I'd argue is still a net improvement:

benchmark                     old MB/s     new MB/s     speedup
BenchmarkWordsEncode1e1-8     140.02       677.54       4.84x
BenchmarkWordsEncode1e2-8     224.74       86.86        0.39x
BenchmarkWordsEncode1e3-8     274.82       258.34       0.94x
BenchmarkWordsEncode1e4-8     189.95       244.60       1.29x
BenchmarkWordsEncode1e5-8     140.10       185.91       1.33x
BenchmarkWordsEncode1e6-8     169.03       211.16       1.25x
BenchmarkRandomEncode-8       3746.11      13192.30     3.52x
Benchmark_ZFlat0-8            357.12       430.88       1.21x
Benchmark_ZFlat1-8            181.27       276.50       1.53x
Benchmark_ZFlat2-8            5959.15      14075.70     2.36x
Benchmark_ZFlat3-8            312.09       171.85       0.55x
Benchmark_ZFlat4-8            2008.62      3111.51      1.55x
Benchmark_ZFlat5-8            357.46       425.45       1.19x
Benchmark_ZFlat6-8            155.59       189.98       1.22x
Benchmark_ZFlat7-8            149.70       182.01       1.22x
Benchmark_ZFlat8-8            160.04       199.81       1.25x
Benchmark_ZFlat9-8            140.87       175.73       1.25x
Benchmark_ZFlat10-8           415.88       509.88       1.23x
Benchmark_ZFlat11-8           236.50       274.77       1.16x

In any case, the regular case (without -tags=noasm) seems always faster with upstream snappy, on this limited set of benchmarks.

flate: regression on efficiency for very short strings

Using: 9d711f4

Example link: https://play.golang.org/p/3N7YRHAmGO

When compressing very short strings, the KP version of flate outputs strings larger than what the standard library did, which itself outputted strings larger than what zlib did.

Compressing the string "a" on level 6, outputs the following:

zlib: 4b0400
std:  4a04040000ffff
kp:   04c08100000000009056ff13180000ffff

Where zlib is the C library, std is the Go1.6 standard library, and kp is your library. It seems that the KP version uses a dynamic block, rather than a fixed block. If we address this change, we may want to avoid the [final, last, empty block] we currently emit (the 0x0000ffff bytes at the end). That will allow us to produce shorter outputs (like what zlib can produce).

Avoiding the [final, last, empty block] will be beneficial to https://go-review.googlesource.com/#/c/21290/

please tag and version this project

Hello,

Can you please tag and version this project?

I am the Debian Maintainer for compress and versioning would help Debian keep up with development.

Bug in flate?

Hello,

I am trying to use your library here https://github.com/gen2brain/raylib-go/tree/master/rres/cmd/rrem to embed game resources in a file. I have an issue only with DEFLATE (LZ4, XZ, BZIP2, etc. are fine), and it happens both with your library and the official compress/flate, so I'm not sure how the two are related.

There is no issue with .wav data, for example; that comes out fine after a compress/decompress round trip. Only the image.Pix data array that I am compressing is affected. This is what I get after decompression: http://imagizer.imageshack.com/img924/9867/EZEAmY.png , and it should look like this: http://imagizer.imageshack.com/img924/3088/8TwKUa.png . Unfortunately, I don't have a small example to reproduce this behaviour.

compress/flate: how to optimize memory allocations

I'm trying to use compress/flate and compress/gzip in an HTTP middleware
from gorilla/handlers:
https://github.com/gorilla/handlers/blob/master/compress.go

The payloads sent/received are relatively small, 300-500 bytes.
When I profile my code I see:

(pprof) top
Showing nodes accounting for 2920.49MB, 95.58% of 3055.68MB total
Dropped 223 nodes (cum <= 15.28MB)
Showing top 10 nodes out of 76
      flat  flat%   sum%        cum   cum%
 1298.35MB 42.49% 42.49%  2358.66MB 77.19%  compress/flate.NewWriter
  658.62MB 21.55% 64.04%  1060.31MB 34.70%  compress/flate.(*compressor).init
  451.80MB 14.79% 78.83%   451.80MB 14.79%  regexp.(*bitState).reset
  391.68MB 12.82% 91.65%   391.68MB 12.82%  compress/flate.newDeflateFast

Is it possible to minimize memory allocations/usage for this kind of payload?
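A common mitigation is pooling writers with sync.Pool and reusing them via Reset, so NewWriter's large table allocations are paid once per pooled writer rather than once per request. A sketch using gzip (the same pattern applies to flate.Writer); compress is an illustrative helper:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"sync"
)

// gzip.NewWriter allocates large internal tables; pooling writers and
// calling Reset amortizes that cost across many small responses.
var writers = sync.Pool{
	New: func() interface{} { return gzip.NewWriter(nil) },
}

func compress(p []byte) []byte {
	var buf bytes.Buffer
	zw := writers.Get().(*gzip.Writer)
	zw.Reset(&buf) // reuse the writer's internal state for a new stream
	zw.Write(p)
	zw.Close()
	writers.Put(zw)
	return buf.Bytes()
}

func main() {
	out := compress([]byte("small payload"))
	fmt.Println(len(out) > 0) // true
}
```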

Merge into standard library

Good work with these optimizations! Is there a reason that these improvements shouldn't/wouldn't be merged into the standard library?
