savetherbtz / zstd-seekable-format-go

Seekable ZSTD compression format implemented in Golang.

Home Page: https://pkg.go.dev/github.com/SaveTheRbtz/zstd-seekable-format-go

License: MIT License

zstd-seekable-format-go's Introduction

ZSTD seekable compression format implementation in Go

This library provides a random access reader (using uncompressed file offsets) for ZSTD-compressed streams. This can be used for creating transparent compression layers. Coupled with Content Defined Chunking (CDC) it can also be used as a robust de-duplication layer.
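
To illustrate what random access by uncompressed offset involves, here is a minimal sketch of the lookup step: given the decompressed size of every frame, find the frame that holds a given offset. The names frameIndex, newFrameIndex, and locate are hypothetical and are not part of this library's API; the library's own index may be organised differently.

```go
package main

import (
	"fmt"
	"sort"
)

// frameIndex maps uncompressed offsets to frames. ends[i] is the uncompressed
// offset one past the last byte of frame i (a running total of frame sizes).
// Hypothetical type for illustration only.
type frameIndex struct {
	ends []int64
}

// newFrameIndex builds the index from the decompressed size of each frame.
func newFrameIndex(decompressedSizes []int64) frameIndex {
	ends := make([]int64, len(decompressedSizes))
	var total int64
	for i, n := range decompressedSizes {
		total += n
		ends[i] = total
	}
	return frameIndex{ends: ends}
}

// locate returns the frame holding uncompressed offset off and the position
// of off inside that frame. ok is false when off is out of range.
func (x frameIndex) locate(off int64) (frame int, within int64, ok bool) {
	if off < 0 || len(x.ends) == 0 || off >= x.ends[len(x.ends)-1] {
		return 0, 0, false
	}
	// Binary search for the first frame whose end lies strictly after off.
	frame = sort.Search(len(x.ends), func(i int) bool { return x.ends[i] > off })
	start := int64(0)
	if frame > 0 {
		start = x.ends[frame-1]
	}
	return frame, off - start, true
}

func main() {
	// Assumed layout: three frames of 5, 1 and 6 uncompressed bytes.
	idx := newFrameIndex([]int64{5, 1, 6})
	frame, within, ok := idx.locate(6)
	fmt.Println(frame, within, ok) // 2 0 true
}
```

Only the located frame has to be decompressed to serve the read, which is what makes a transparent compression layer practical.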

Installation

go get -u github.com/SaveTheRbtz/zstd-seekable-format-go

Using the seekable format

Writing is done through the Writer interface:

import (
	"bytes"
	"io"
	"log"

	seekable "github.com/SaveTheRbtz/zstd-seekable-format-go"
	"github.com/klauspost/compress/zstd"
)

enc, err := zstd.NewWriter(nil, zstd.WithEncoderLevel(zstd.SpeedFastest))
if err != nil {
	log.Fatal(err)
}
defer enc.Close()

// f is the destination of the compressed stream, e.g. an *os.File.
w, err := seekable.NewWriter(f, enc)
if err != nil {
	log.Fatal(err)
}

// Write data in chunks.
for _, b := range [][]byte{[]byte("Hello"), []byte(" "), []byte("World!")} {
	_, err = w.Write(b)
	if err != nil {
		log.Fatal(err)
	}
}

// Close and flush seek table.
err = w.Close()
if err != nil {
	log.Fatal(err)
}

NB! Do not forget to call Close since it is responsible for flushing the seek table.

Reading can be done either through the io.ReaderAt interface:

dec, err := zstd.NewReader(nil)
if err != nil {
	log.Fatal(err)
}
defer dec.Close()

r, err := seekable.NewReader(f, dec)
if err != nil {
	log.Fatal(err)
}
defer r.Close()

ello := make([]byte, 4)
// ReaderAt: read 4 bytes starting at uncompressed offset 1.
if _, err := r.ReadAt(ello, 1); err != nil {
	log.Fatal(err)
}
if !bytes.Equal(ello, []byte("ello")) {
	log.Fatalf("%q != ello", ello)
}

Or through the io.ReadSeeker interface:

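world := make([]byte, 5)
// Seeker: move to 6 bytes before the end of the uncompressed data.
if _, err := r.Seek(-6, io.SeekEnd); err != nil {
	log.Fatal(err)
}
// Reader
if _, err := r.Read(world); err != nil {
	log.Fatal(err)
}
if !bytes.Equal(world, []byte("World")) {
	log.Fatalf("%q != World", world)
}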

The seekable format stores its seek table in a ZSTD skippable frame, so the output is still a valid ZSTD stream that a standard decoder can read:

// Standard ZSTD Reader
if _, err := f.Seek(0, io.SeekStart); err != nil {
	log.Fatal(err)
}
dec, err := zstd.NewReader(f)
if err != nil {
	log.Fatal(err)
}
defer dec.Close()

all, err := io.ReadAll(dec)
if err != nil {
	log.Fatal(err)
}
if !bytes.Equal(all, []byte("Hello World!")) {
	log.Fatalf("%q != Hello World!", all)
}

zstd-seekable-format-go's Issues

progress bar when compressing

I am using the CLI to compress my files. With a rather large file, I was thinking a progress bar would be nice. Would you allow me to add one as a --progress flag?

getting size of compressed contents

As I understand it, getting the size of the compressed contents requires a Seek(0, io.SeekEnd). That call appears to visit every seek table entry to read its decompressed size.

I was trying to use a ReaderAt backed by HTTP range requests, and the Seek issued so many HTTP requests that it amounted to a denial of service.

I wondered if there was a way to implement a skippable frame that would include the overall decompressed size. This way, a single read could be made for the metadata.
