
Bao Tree Support (blake3, 9 comments, open)

pcfreak30 commented on August 25, 2024
Bao Tree Support

from blake3.

Comments (9)

oconnor663 commented on August 25, 2024

Awesome! This is probably already clear, but just in case: We can't change the chunk size per se without breaking compatibility with BLAKE3. However, we can use larger "chunk groups" in the encoding, effectively pruning the lower levels of the tree, without breaking back-compat. (The encoding isn't compatible if we change the chunk group sizes, but at least the root hash is unchanged.)


lukechampine commented on August 25, 2024

Implemented in f9980aa

Chunk sizes are fixed at 1024. I can make that configurable, but the SIMD stuff all assumes 1024 byte chunks, so there will be a performance hit.

No support for slices at this time, but I can add that if it's an important feature.


pcfreak30 commented on August 25, 2024

> Implemented in f9980aa
>
> Chunk sizes are fixed at 1024. I can make that configurable, but the SIMD stuff all assumes 1024 byte chunks, so there will be a performance hit.
>
> No support for slices at this time, but I can add that if it's an important feature.

IIRC, @redsolver's S5 uses 256 KB chunks, and since tree verification needs to be standardized across clients (and portal nodes), and changing the chunk size changes the tree data, this does need to be configurable. The tree data generated portal-side would be downloadable by the client.

I appreciate the work done so far, as it gives us a starting point to migrate, but I do need 256 KB chunks/chunk groups to interop correctly. I did not state that before, as I assumed any chunk size could just be ported 😅.

Kudos.


oconnor663 commented on August 25, 2024

My fault for taking so long to add chunk group support to the original Bao implementation 😅


redsolver commented on August 25, 2024

I'm using 256 KiB chunk groups by default, but would like to support other sizes too; Iroh, for example, uses 64 KiB chunk groups. Right now, all of my streaming implementations download the entire outboard file first and then start streaming (and verifying) the file.

Long-term, it would be nice to switch to a more flexible outboard format that supports efficiently fetching parts of the outboard file down to a specific chunk group size. That would make it possible to generate and host one outboard file that goes down to the 64 KiB level, while a client could fetch (with range requests) only the parts needed down to 256 KiB chunk groups if it doesn't need to verify smaller chunks (for example, when streaming video). But I'm waiting to see what the Iroh team comes up with; for now I'll keep using the default bao outboard format with 256 KiB chunk groups.


pcfreak30 commented on August 25, 2024

Also, @lukechampine, please make baoOutboardSize public and add a New constructor for bufferAt; otherwise I have to copy that code for my needs, since IIRC the standard library does not ship any WriterAt implementations.

I am also unsure whether baoOutboardSize should take an int64 instead of an int.

To give an idea of what I'm currently working with:

// ComputeTree returns the Bao outboard tree data for the given reader.
func ComputeTree(reader io.Reader, size int64) ([]byte, error) {
	// baoOutboardSize and bufferAt are copied from the library's internals,
	// since neither is currently exported.
	bufSize := baoOutboardSize(int(size))
	buf := bufferAt{buf: make([]byte, bufSize)}

	// outboard=true: write only the tree data, not the content itself.
	_, err := blake3.BaoEncode(&buf, bufio.NewReader(reader), size, true)
	if err != nil {
		return nil, err
	}

	return buf.buf, nil
}


lukechampine commented on August 25, 2024

exported in 6e43259

bufferAt is 9 lines of code, just copy it if you need it.


pcfreak30 commented on August 25, 2024

> exported in 6e43259
>
> bufferAt is 9 lines of code, just copy it if you need it.

Yes, I have; I just wanted to avoid doing so.


pcfreak30 commented on August 25, 2024

Hello,

To bump this issue: my project currently has the following two requirements that I request be supported.

And as a secondary nice-to-have, if the existing implementation doesn't already provide it: SIMD support for all of this, for speed.

Thanks!

