
Comments (8)

austinabell commented on May 25, 2024

If the deserialization order is checked, this will break a lot of things if we update.

Why would this be the case? It'll only break cases where you manually construct a Vec<u8> with a representation that is different from what you get when you serialize the map. We do sort on serialization.

Ohhh, my apologies, you are correct that this wouldn't break. I'm not quite sure why I assumed this sorting didn't happen, and I skipped over that detail in the description.

Do I understand correctly that you would like to remove the impl for BinaryHeap and add the check in deserialization of HashMap and HashSet?

A negative I see is that there will be extra costs to verifying order on deserialization that isn't necessary in most cases, especially if a contract developer is certain of the ordering.

Also, I can see there being an issue with the fact that serialized bytes that were constructed in the wrong order will be asymmetric where the bytes after deserialize/serialize aren't equal, but there are also cases where one might want to serialize a vec of elements and then deserialize as a set/map to just use unique values. In this case it would be breaking and is a pattern that might be expected to work since it does with other serialization protocols and the ordering constraint might not be clear.

I'm for this change by the way, just want to make sure we consider the counter points.

from borsh-rs.

matklad commented on May 25, 2024

@austinabell am I correct in my assessment that no one actually tries to borsh-serialize binary heaps?


austinabell commented on May 25, 2024

@austinabell am I correct in my assessment that no one actually tries to borsh-serialize binary heaps?

Not that I am aware of, but that doesn't mean it isn't possible that there is usage in the community. I don't think it's a huge concern though because if they absolutely need this type to be serialized, they can just wrap it and create their own implementation that mimics this old behaviour or something.

As for HashMap, it's actually used a lot around the SDK and, unfortunately, even in the community, because it used to be promoted as the way to store small maps (it is still used in our examples). If the deserialization order is checked, this will break a lot of things if we update. Another issue is that HashMap is used as part of the NFT standard trait.

I did try to purge all HashMap usages from the SDK before. I can't remember why I didn't end up opening a PR, possibly because of the breaking change to the NFT standard, but it's probably worth the breaking change.
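To illustrate the wrapping approach mentioned above for BinaryHeap, here is a minimal sketch using only the standard library. `SerializableHeap`, `to_bytes`, and `from_bytes` are hypothetical names, not part of borsh, and the byte layout (a little-endian u32 length followed by little-endian u32 elements in ascending order) is an assumption chosen for illustration. It is not claimed to match the old BinaryHeap impl.

```rust
use std::collections::BinaryHeap;

/// Hypothetical newtype wrapper; not part of borsh.
struct SerializableHeap(BinaryHeap<u32>);

impl SerializableHeap {
    /// Assumed layout: u32 LE length, then each element as u32 LE, ascending.
    fn to_bytes(&self) -> Vec<u8> {
        // Sort a copy so the output does not depend on the heap's internal layout.
        let sorted = self.0.clone().into_sorted_vec();
        let mut out = Vec::with_capacity(4 + sorted.len() * 4);
        out.extend_from_slice(&(sorted.len() as u32).to_le_bytes());
        for item in &sorted {
            out.extend_from_slice(&item.to_le_bytes());
        }
        out
    }

    /// Rebuild the heap; returns None if the length prefix and body disagree.
    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() < 4 {
            return None;
        }
        let len = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]) as usize;
        let body = &bytes[4..];
        if body.len() != len.checked_mul(4)? {
            return None;
        }
        let heap: BinaryHeap<u32> = body
            .chunks_exact(4)
            .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect();
        Some(SerializableHeap(heap))
    }
}
```

Because the elements are sorted before writing, two heaps holding the same values produce the same bytes regardless of insertion order. In a real project the same logic would presumably live in the crate's own serialization trait impls for the wrapper type.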


matklad commented on May 25, 2024

If the deserialization order is checked, this will break a lot of things if we update.

Why would this be the case? It'll only break cases where you manually construct a Vec<u8> with a representation that is different from what you get when you serialize the map. We do sort on serialization.


matklad commented on May 25, 2024

A negative I see is that there will be extra costs to verifying order on deserialization that isn't necessary in most cases, especially if a contract developer is certain of the ordering.

Yeah, that's true -- if you are certain of the ordering, and you would love to avoid incurring the costs, you need to use Vec<_> directly.

Also, I can see there being an issue with the fact that serialized bytes that were constructed in the wrong order will be asymmetric where the bytes after deserialize/serialize aren't equal

Yeah, that's the positive motivation for this change. Specifically, the current behavior might cause pretty bad bugs when you need to sign/encrypt messages and also want to enforce idempotence: you can encode the same message in two ways, and get different signatures.

In this case it would be breaking and is a pattern that might be expected to work since it does with other serialization protocols and the ordering constraint might not be clear.

Yeah, that's true -- you'd have to write your own type to deserialize with duplicate elimination. But that's the reason why borsh exists -- in a sense, borsh is "bincode + canonicity", if you don't need canonicity, you can use bincode.
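The points in this comment can be sketched with a standard-library-only model. This is an illustrative sketch, not borsh's actual implementation: all function names are hypothetical, and the layout (a little-endian u32 length followed by little-endian u32 elements) is an assumption. It shows a lenient decoder that accepts any order and drops duplicates, and a strict decoder that rejects anything not in strictly ascending order.

```rust
use std::collections::BTreeSet;

/// Assumed layout: u32 LE length, then each element as u32 LE, in the given order.
fn encode_seq(items: &[u32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + items.len() * 4);
    out.extend_from_slice(&(items.len() as u32).to_le_bytes());
    for item in items {
        out.extend_from_slice(&item.to_le_bytes());
    }
    out
}

/// BTreeSet iterates in ascending order, so the encoded output is sorted.
fn encode_set(set: &BTreeSet<u32>) -> Vec<u8> {
    let items: Vec<u32> = set.iter().copied().collect();
    encode_seq(&items)
}

/// Decode the raw element list with no ordering requirement.
fn decode_seq(bytes: &[u8]) -> Option<Vec<u32>> {
    if bytes.len() < 4 {
        return None;
    }
    let len = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]) as usize;
    let body = &bytes[4..];
    if body.len() != len.checked_mul(4)? {
        return None;
    }
    Some(
        body.chunks_exact(4)
            .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Lenient: accepts any order and silently drops duplicates.
fn decode_set_lenient(bytes: &[u8]) -> Option<BTreeSet<u32>> {
    Some(decode_seq(bytes)?.into_iter().collect())
}

/// Strict: one linear pass rejects input that is not strictly ascending.
fn decode_set_strict(bytes: &[u8]) -> Option<BTreeSet<u32>> {
    let items = decode_seq(bytes)?;
    if items.windows(2).any(|w| w[0] >= w[1]) {
        return None;
    }
    Some(items.into_iter().collect())
}
```

With the lenient decoder, `encode_seq(&[3, 1, 2])` and `encode_seq(&[1, 2, 3])` both decode to the same set, so one logical message has two byte representations, and re-encoding the first does not reproduce its input. The strict decoder accepts only the sorted form, at the cost of the extra pass discussed in this thread.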


frol commented on May 25, 2024

I implemented the BinaryHeap support when I was adding BTreeMap support in #6. I should not have done that. I vote for removing support for BinaryHeap, since I don't think anyone would actually use it.


austinabell commented on May 25, 2024

Yeah, that's true -- if you are certain of the ordering, and you would love to avoid incurring the costs, you need to use Vec<_> directly.

The one part of this I do want to stress is that people currently using any map or set in storage, in the SDK or otherwise, will subtly incur more costs without any change on their end.

I wonder if it's worth making this order-checked deserialization an opt-in feature. The downside of making it the default is that you can't actually hit the asymmetric (de)serialization case unless you go through another type like Vec, yet the check adds overhead to every usage of these types. Maybe canonicity is imperative, but to me this seems like more of a reason to check the ordering only when you are accepting external input, as in the case of signatures over a set.


dj8yfo commented on May 25, 2024

taking this issue

