Comments (6)

PJK commented on August 26, 2024

Hi @jkbbwr,

My understanding is that the RFC does not actually specify what 'canonical' means; section 3.9 is just a suggestion. Since it is not a mandatory part of the standard, and since libcbor was designed when CBOR had virtually no adoption or best practices to draw from, the notion of canonical encoding is completely omitted from the implementation.

Just to clarify, libcbor has the following (implicit) behaviors:

  • using cbor_serialize to serialize an item obtained through cbor_load will produce identical bytes as long as there are no
    • denormalized (in the sense of IEEE 754) floats in the input, or
    • non-canonical lengths for maps, arrays, strings, byte strings, or
    • indefinite length items
  • however, the output is not canonicalized according to the RFC. E.g., if the input encoded integer 0 as 0x18 0x00 instead of 0x00, the former will be emitted
  • similarly, indefinite-length items will be encoded as indefinite-length
  • data items created using cbor_new_*, cbor_build_* are not canonicalized either
  • tag canonicalization is not handled as that is protocol specific
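
To make the integer example concrete, here is a minimal sketch of shortest-form encoding for an unsigned integer (major type 0). The helper encode_uint_shortest is hypothetical and not part of libcbor; it only shows how a canonical encoder picks 0x00 for the value 0 rather than 0x18 0x00.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT a libcbor function: writes the shortest CBOR
 * encoding of an unsigned integer (major type 0) into buf.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t encode_uint_shortest(uint64_t value, unsigned char *buf,
                            size_t buf_size) {
    unsigned char initial; /* initial byte: major type 0 + additional info */
    size_t extra;          /* number of big-endian payload bytes that follow */

    if (value <= 23) {
        initial = (unsigned char)value; /* value fits in the initial byte */
        extra = 0;
    } else if (value <= UINT8_MAX) {
        initial = 0x18;
        extra = 1;
    } else if (value <= UINT16_MAX) {
        initial = 0x19;
        extra = 2;
    } else if (value <= UINT32_MAX) {
        initial = 0x1a;
        extra = 4;
    } else {
        initial = 0x1b;
        extra = 8;
    }

    if (buf_size < 1 + extra) {
        return 0;
    }

    buf[0] = initial;
    for (size_t i = 0; i < extra; i++) {
        /* most significant byte first (network byte order) */
        buf[1 + i] = (unsigned char)(value >> (8 * (extra - 1 - i)));
    }
    return 1 + extra;
}
```

A decoder would accept both 0x18 0x00 and 0x00 as the integer 0; only the second is what this function would ever produce.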

So implementing what the RFC suggests would require

  • implementing integer canonicalization (easy)
  • implementing map sorting (not complicated but messy since the nested items have to be serialized upfront for the sorting)

Both can realistically be implemented either

  • as a part of libcbor (probably cbor_serialize), or
  • as a user function that manipulates cbor_item_t values

In other words, libcbor already allows you to implement the RFC canonicalization (or any similar concept specific to your use-case); adding it to the library would be a convenience feature.
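
For the map sorting step, a user-side sketch could look like the following. It assumes the keys have already been serialized into byte buffers (the "upfront" work mentioned above) and applies the ordering I read from section 3.9: shorter encoded keys sort first, and keys of equal length are compared bytewise. The struct encoded_key and both functions are hypothetical names for illustration, not libcbor API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one already-serialized map key. */
struct encoded_key {
    const unsigned char *bytes; /* encoded key data item */
    size_t len;                 /* length of the encoding in bytes */
};

/* qsort comparator: shorter encodings first, then bytewise order. */
int compare_encoded_keys(const void *a, const void *b) {
    const struct encoded_key *ka = a;
    const struct encoded_key *kb = b;

    if (ka->len != kb->len) {
        return ka->len < kb->len ? -1 : 1;
    }
    return memcmp(ka->bytes, kb->bytes, ka->len);
}

/* Sorts the keys in place into the order described above. */
void sort_encoded_keys(struct encoded_key *keys, size_t count) {
    qsort(keys, count, sizeof keys[0], compare_encoded_keys);
}
```

A real implementation would carry the matching value alongside each key and emit the pairs in the sorted order; that bookkeeping is the messy part and is left out here.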

As maintaining correct and safe-ish C code is quite laborious, any extra features on top of what is needed to serialize and deserialize any valid CBOR have to bring enough value to a significant portion of users to offset the cost.

It seems to me that canonical encoding isn't widely used (e.g. cbor/cbor.github.io#21, which hasn't attracted much attention). I therefore think the latter approach is preferable for the time being, mainly because the cost of increasing the 'surface area' does not seem justified.

If you happen to implement (any, even simpler) canonicalization that you find useful, I'd be more than happy to include it as an example/contributed extra.

In the future, I can see canonicalization becoming a part of CBOR libraries as consensus based on real-world use-cases emerges.

What do you think?

from libcbor.

PJK commented on August 26, 2024

I'm closing this for now, let's get back to it once we know more :)

joshblum commented on August 26, 2024

@PJK I'm interested in a canonical serialization option as well. I would like to be able to serialize objects canonically for hashing. Do you know of developments/users that have implemented what is discussed in this issue? Do you know offhand of any other libraries that implement this mode? I agree it wouldn't be difficult to add the implementation as part of the library or as a standalone function, but I was curious whether it had already been done.

PJK commented on August 26, 2024

@joshblum there is https://github.com/cabo/cbor-canonical, but that probably doesn't help if you are looking for a C library.

joshblum commented on August 26, 2024

@PJK thanks for getting back to me! Yeah, I'm looking for a C lib, unfortunately. This one might be of use if I can get the integers serialized canonically, since we may leave maps unsupported initially.

PJK commented on August 26, 2024

I see. One thing you can do is to write a function that takes a cbor_item_t and builds a new cbor_item_t with the same data in the canonical form. That way you will be able to reuse the data manipulation functionality and serialization/deserialization without dealing with the internals. Would that work for you?
