Xdelta for Go

This library provides a wrapper for the Xdelta library by Joshua MacDonald and others.

See the GoDoc documentation for the API reference.

Getting Started

Patches are created with the encoder and applied with the decoder. The following workflows exist:

| Title | Data Flow | Description |
|---|---|---|
| Encoding (changed file) | FROM -> TO => PATCH | The encoder reads the new TO file, compares its data to the original FROM file, and outputs the resulting PATCH file. |
| Encoding (new file) | TO => PATCH | The encoder reads the new TO file and outputs the resulting PATCH file. |
| Decoding (changed file) | PATCH -> FROM => TO | The decoder reads the PATCH file, applies its operations to the original FROM file, and outputs the resulting new TO file. |
| Decoding (new file) | PATCH => TO | The decoder reads the PATCH file and outputs the resulting new TO file. |

There is no process for deleting files (see Best Practices below).
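Because deletion is not covered by any workflow, a caller has to decide per file which workflow applies. The following is a minimal sketch of such a decision, assuming hypothetical manifests that map file paths to content hashes for the old and the new release. The `Action` type, the `plan` function, and the manifest format are illustrative assumptions and not part of go-xdelta.

```go
package main

import (
	"fmt"
	"sort"
)

// Action names the workflow to run for one file (hypothetical type).
type Action string

const (
	EncodeChanged Action = "encode (changed file)" // FROM -> TO => PATCH
	EncodeNew     Action = "encode (new file)"     // TO => PATCH
	Delete        Action = "delete"                // no patch; handled by the caller
	Unchanged     Action = "unchanged"
)

// plan compares two hypothetical manifests (path -> content hash) and
// picks an action for every path that appears in either of them.
func plan(oldFiles, newFiles map[string]string) map[string]Action {
	actions := make(map[string]Action)
	for path, newHash := range newFiles {
		oldHash, existed := oldFiles[path]
		switch {
		case !existed:
			actions[path] = EncodeNew
		case oldHash != newHash:
			actions[path] = EncodeChanged
		default:
			actions[path] = Unchanged
		}
	}
	// Paths that are missing from the new manifest were deleted.
	for path := range oldFiles {
		if _, kept := newFiles[path]; !kept {
			actions[path] = Delete
		}
	}
	return actions
}

func main() {
	oldFiles := map[string]string{"a.bin": "h1", "b.bin": "h2", "c.bin": "h3"}
	newFiles := map[string]string{"a.bin": "h1", "b.bin": "h9", "d.bin": "h4"}

	actions := plan(oldFiles, newFiles)
	paths := make([]string, 0, len(actions))
	for p := range actions {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Printf("%s: %s\n", p, actions[p])
	}
}
```

In this sketch an empty file that is still listed in the new manifest is encoded like any other file. Only a path that is absent from the new manifest counts as deleted.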

The following example is more or less pseudo-code. (It should be easy enough to understand.)

import (
    "context"

    "github.com/nine-lives-later/go-xdelta"
)

options := xdelta.EncoderOptions{
    FileID:    "myfile.ext",
    FromFile:  fromFileReaderSeeker,
    ToFile:    toFileReader,
    PatchFile: patchFileWriter,
}

enc, err := xdelta.NewEncoder(options)
if err != nil {
    return err
}
defer enc.Close()

// create the patch
err = enc.Process(context.TODO())
if err != nil {
    return err
}

The decoder works the same way: fill in xdelta.DecoderOptions, create the decoder with xdelta.NewDecoder, and call Process.

Tracking Progress

The easiest way to track progress when encoding (creating a patch) is to measure how much data has been read from the FROM file. When decoding (applying a patch), use the read progress of the PATCH file instead.
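One way to measure read progress is to wrap the reader before handing it to the encoder or decoder. The sketch below uses only the standard library. `countingReader` is a hypothetical helper and not part of go-xdelta. It wraps a plain `io.Reader`, so a FROM file that is passed as a reader-seeker would presumably also need a `Seek` method that forwards to the underlying file.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"sync/atomic"
)

// countingReader is a hypothetical helper (not part of go-xdelta) that
// counts the bytes read through it. The counter is updated atomically so
// that another goroutine can poll it while the processing is running.
type countingReader struct {
	n int64 // bytes read so far; kept first so it stays 64-bit aligned
	r io.Reader
}

func (c *countingReader) Read(p []byte) (int, error) {
	n, err := c.r.Read(p)
	atomic.AddInt64(&c.n, int64(n))
	return n, err
}

// Percent reports the read progress relative to the known total size.
func (c *countingReader) Percent(total int64) float64 {
	if total <= 0 {
		return 0
	}
	return 100 * float64(atomic.LoadInt64(&c.n)) / float64(total)
}

// demo reads 4 of 10 bytes through the wrapper and returns the progress.
func demo() (float64, error) {
	cr := &countingReader{r: strings.NewReader("0123456789")}
	if _, err := io.ReadFull(cr, make([]byte, 4)); err != nil {
		return 0, err
	}
	return cr.Percent(10), nil
}

func main() {
	pct, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("progress: %.0f%%\n", pct)
}
```

The total size has to be known up front, for example from the file's size on disk or from stored meta-information.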

Best Practices

  1. Pre-allocate the TO/patched file when applying patches. This will reduce the fragmentation on the file system as it can reserve a spot (on the disk drive) that is large enough for the new file. For this to work, make sure to store the TO/patched file size so it can be read upfront.

  2. Check the FROM file hash before decoding/patching. Ensure that the FROM and PATCH files are correct before starting the decoding/patching process.

  3. Handle deletion of files! The patching mechanism does not handle the deletion of existing files. Handle this yourself based on your meta-information. Be aware of the difference between an empty file (filesize of 0) and a deleted one.

  4. Do not use the patch file header. It is a convenient place to store information, but you usually need security-related information, like the patch file hash, up front anyway. Store other meta-information, like the TO/patched file size and FROM file hashes, alongside it.

  5. Sign the meta-information (like file sizes and hashes) with an asymmetric key pair! Do this by calculating a hash and signing that one (never encrypt the file content itself). Sign using the private key and check the signature with the public one. Embed the public key in your client to prevent man-in-the-middle attacks.
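Practices 1, 2 and 5 can be sketched with the standard library alone. Everything named here is an assumption for illustration: the `PatchMeta` record, its JSON encoding, and the choice of Ed25519 and SHA-256 are not prescribed by go-xdelta. Ed25519 hashes the message internally, so the sketch signs the encoded meta-information directly instead of hashing it first. `Truncate` only sets the file length. Whether the file system actually reserves the space depends on the platform, and a platform-specific call may be needed for a real reservation.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// PatchMeta is a hypothetical record kept outside the patch file.
type PatchMeta struct {
	FromSHA256 []byte `json:"from_sha256"`
	ToSize     int64  `json:"to_size"`
}

// verifyMeta checks the signature over the raw bytes before parsing them.
func verifyMeta(pub ed25519.PublicKey, raw, sig []byte) (PatchMeta, error) {
	var m PatchMeta
	if !ed25519.Verify(pub, raw, sig) {
		return m, errors.New("meta-information signature is invalid")
	}
	err := json.Unmarshal(raw, &m)
	return m, err
}

// checkHash streams r through SHA-256 and compares the digest with want.
func checkHash(r io.Reader, want []byte) error {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return err
	}
	if !bytes.Equal(h.Sum(nil), want) {
		return errors.New("FROM file hash mismatch")
	}
	return nil
}

// preallocate creates the TO file and sets its final length up front.
func preallocate(path string, size int64) (*os.File, error) {
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	if err := f.Truncate(size); err != nil {
		f.Close()
		return nil, err
	}
	return f, nil
}

// demo plays both roles: the publisher signing and the client verifying.
func demo() error {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return err
	}
	from := []byte("old file content")
	sum := sha256.Sum256(from)
	raw, err := json.Marshal(PatchMeta{FromSHA256: sum[:], ToSize: 4096})
	if err != nil {
		return err
	}
	sig := ed25519.Sign(priv, raw) // publisher side, private key
	meta, err := verifyMeta(pub, raw, sig) // client side, public key
	if err != nil {
		return err
	}
	if err := checkHash(bytes.NewReader(from), meta.FromSHA256); err != nil {
		return err
	}
	dir, err := os.MkdirTemp("", "xdelta-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	f, err := preallocate(filepath.Join(dir, "to.bin"), meta.ToSize)
	if err != nil {
		return err
	}
	defer f.Close()
	info, err := f.Stat()
	if err != nil || info.Size() != meta.ToSize {
		return errors.New("pre-allocation failed")
	}
	return nil
}

func main() {
	fmt.Println("demo error:", demo())
}
```

In a real client only the public key would be embedded. The key pair is generated inside `demo` purely to keep the example self-contained.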

Building

On Windows (with CGO disabled), the project requires the xdelta-lib native C++ library to be built into a DLL file before it will work. See Native Library below for details. (This is not needed on macOS and Linux.)

To build this project, run the following commands:

go generate
go build

To run all the tests (including a patch roundtrip test), run the following command:

go test -v

Native Library (Windows only)

On Windows (with the CGO_ENABLED environment variable set to 0), the native library has to be built separately and provided in your project's directory.

To build it, run the build script. The native library is saved in the xdelta-lib sub-directory.

./xdelta-lib/build-windows.bat

You can also obtain a pre-compiled version.

Please keep in mind that this version is slower. In the measurements below (taken via go test -v), creating a patch took about 2.6 times as long and applying a patch about 1.3 times as long:

| Function | CGO_ENABLED=0 | CGO_ENABLED=1 |
|---|---|---|
| Create Patch | 2.62s | 1.02s |
| Apply Patch | 0.18s | 0.14s |

Build Status

The current status of the master branch:

| OS | CGO_ENABLED | Status |
|---|---|---|
| Windows AMD64 | 0 | Build status |
| Windows AMD64 | 1 | Build status |
| Ubuntu 18 AMD64 | 1 (default) | Build status |
| macOS AMD64 | 1 (default) | Build status |

Authors

The library is sponsored by the marvin + konsorten GmbH.

We thank all the authors who provided code to this library:

  • Felix Kollmann

It is also based on the work by Joshua MacDonald and others.

License

(The Apache 2 License)

Copyright 2019 marvin + konsorten GmbH

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

go-xdelta's Issues

Support for other platforms?

I'm trying to use this on Linux and macOS, but I see there's only a build script for Windows. Has it been tested on other platforms? Are there any plans to support Linux/macOS?

If not, could you give pointers on how to add support?

Thanks.

Compressing the patch

Hi,
I have been using bsdiff for a while. However, it indexes the patch locations, which either takes a long time for larger files or, when I store the index so that I can retrieve it rather than generating it every time (which is what I've been doing), is absolutely massive on disk (~10x the size of the original file).

I am looking for a faster way to create diffs/patches, as for my application they need to happen in (user) real time. I also don't want to store an index if I don't have to.

This library therefore looks very interesting. At first glance it also seems much faster. I have put a gist of my demo application using this library, here.

The issue I'm asking about then is the final size of the patch. The patches created by the above code are basically identical in size to the "to file", which in effect is no more efficient than storing copies of the file.

Looking at the original C implementation that I think this is based off, there is a compression option.

However it still seems like I am using it incorrectly, as I would have thought the patch would be just the difference between the original and the final anyway.

From my understanding and how I am using it in that gist is:

  1. I create a file I want as the original (moon.png in my code)
  2. I edit the original file and save that as (moon-to.png)
  3. I specify the name for the patch (moon.patch)
  4. I specify a location to save the applied version. I understand this is original+patch (i.e. moon.png with moon.patch applied, giving me moon-to.png)
  5. Run above code.

Am I doing something wrong here? The only thing I note is that you define a context in your full test, and I just use context.Background().

Nice work, and thanks, any help working out why my patches are 100% the size of the resulting image would be great!

xdelta3.h file not found

Hi, I am on macOS and apparently the library can't find the xdelta3.h file. I see that the submodule is included in the repository. Is there any fix?

go-xdelta extremely slow sometimes

I found a case where using go-xdelta is significantly slower than using the xdelta3 executable, to the point of making it unusable. I'd like to share it with you in case you find it useful. I tried to see if I can figure out why myself, but unfortunately I'm not familiar enough with xdelta to do so.

Just because it fit my use case, I downloaded electron 2.0.17 and electron 5.0.12 and tried to patch the electron executable alone, from one version to the other. I generated the vcdiff using the xdelta3 executable, and then tried to apply it both with that and using a simple go program using go-xdelta:

package main

import (
	"context"
	"fmt"
	xd "github.com/konsorten/go-xdelta"
	"os"
	"path/filepath"
)

func main() {
	sourceFile := os.Args[1]
	filename := filepath.Base(sourceFile)
	diffFileArg := os.Args[2]
	patchOutputFileArg := os.Args[3]

	sourceFileReaderSeeker, err := os.OpenFile(sourceFile, os.O_RDWR|os.O_CREATE, 0755)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	patchOutputFileWriter, err := os.OpenFile(patchOutputFileArg, os.O_RDWR|os.O_CREATE, 0755)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	diffFileReader, err := os.Open(diffFileArg)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	var decoderOptions = xd.DecoderOptions{
		FileID:    filename,
		FromFile:  sourceFileReaderSeeker,
		ToFile:    patchOutputFileWriter,
		PatchFile: diffFileReader,
	}

	decoder, err := xd.NewDecoder(decoderOptions)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}
	defer decoder.Close()

	fmt.Printf("Processing %v\n", filename)
	err = decoder.Process(context.TODO())
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}
	fmt.Printf("Processing of %v ended\n", filename)
}

I timed both with the Linux command line utility time, and xdelta3 took 0.7 seconds, while my program with go-xdelta took 13 minutes and 5 seconds.

At first I thought this was because the files were large (from ~80MB to ~110MB), but then I tried the same with the Debian amd64 netinst and i386 netinst CD images, which clock in at around 300-400MB, and the results were very different: xdelta3 again took around 0.5 seconds to complete the patch, while my program above took 11 seconds. Still a lot slower, but not as prohibitively so.

I wonder if it relates to the fact that the electron executables were executables, although since xdelta3 performs very well, I expect it's not an issue with the library, but probably with this wrapper.

Anyway, this makes the wrapper not usable for me at the moment, but thank you very much for producing it anyway, hope this gets figured out and fixed at some point! Let me know if you need help reproducing it.
