This project forked from multiformats/multicodec

Compact self-describing codecs. Save space by using predefined multicodec tables.

License: MIT License

multicodec

Motivation

Multistreams are self-describing protocol/encoding streams. Multicodec uses an agreed-upon "protocol table". It is designed for use in short strings, such as keys or identifiers (e.g., a CID).

Protocol Description - How does the protocol work?

multicodec is a self-describing multiformat: it wraps other formats with a tiny bit of self-description. A multicodec identifier is a varint.

A chunk of data identified by multicodec will look like this:

<multicodec><encoded-data>
# To reduce the cognitive load, we sometimes might write the same line as:
<mc><data>
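As a rough illustration of the layout above, the following Python sketch prefixes a payload with a varint-encoded code and splits it off again. The helper names (`encode_varint`, `decode_varint`, `add_prefix`, `split_prefix`) are hypothetical and not part of any official library, and the code `0x55` for `raw` is assumed from the published table; check table.csv before relying on it.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("varints are unsigned")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(byte)  # last byte: continuation bit clear
            return bytes(out)


def decode_varint(buf: bytes) -> tuple[int, int]:
    """Return (value, number of bytes consumed) for the varint at the start of buf."""
    value = 0
    for i, byte in enumerate(buf):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("truncated varint")


def add_prefix(code: int, data: bytes) -> bytes:
    """Build <multicodec><encoded-data>."""
    return encode_varint(code) + data


def split_prefix(blob: bytes) -> tuple[int, bytes]:
    """Split <multicodec><encoded-data> back into (code, data)."""
    code, consumed = decode_varint(blob)
    return code, blob[consumed:]


RAW = 0x55  # assumed code for "raw"; verify against table.csv

wrapped = add_prefix(RAW, b"hello")
print(wrapped.hex())          # 5568656c6c6f
print(split_prefix(wrapped))  # (85, b'hello')
```

Codes below `0x80` cost a single byte of overhead; larger codes take two or more bytes.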

Another useful scenario is using the multicodec as part of the key used to access data. For example:

# suppose we have a value and a key to retrieve it
"<key>" -> <value>

# we can use multicodec with the key to know what codec the value is in
"<mc><key>" -> <value>
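A minimal sketch of the prefixed-key idea, using an in-memory dictionary as a stand-in for a key-value store. The names `make_key` and `codec_of` are hypothetical, and the sketch assumes a code below `0x80`, which encodes as a single varint byte; the value `0x55` for `raw` is assumed from the published table.

```python
RAW = 0x55  # assumed code for "raw"; verify against table.csv


def make_key(code: int, key: bytes) -> bytes:
    """Build "<mc><key>" for a code that fits in one varint byte."""
    if not 0 <= code < 0x80:
        raise ValueError("this sketch only handles single-byte codes")
    return bytes([code]) + key


def codec_of(prefixed_key: bytes) -> int:
    """Read the codec of the stored value from the key's first byte."""
    return prefixed_key[0]


store: dict[bytes, bytes] = {}
store[make_key(RAW, b"greeting")] = b"hello"

for prefixed_key, value in store.items():
    # The reader learns how to interpret the value before touching it.
    print(hex(codec_of(prefixed_key)), prefixed_key[1:], value)
```

A real store would decode a full varint rather than a single byte, since codes of `0x80` and above span several bytes.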

It is worth noting that multicodec works very well in conjunction with multihash and multiaddr, as you can prefix those values with a multicodec to tell what they are.

Multicodec Protocol Tables

Multicodec uses "protocol tables" to agree upon the mapping from a multicodec code to the protocol or format it identifies. These tables can be application specific, though -- like with other multiformats -- we will keep a globally agreed upon table with common protocols and formats.
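One way to picture a protocol table is as a plain mapping from code to name, with an application-specific table consulted alongside the common one. This is only a sketch: the few common entries are assumed from the published table (verify them against table.csv), while `APP_TABLE`, its code `0x300001`, the name `my-app-record`, and the `lookup` helper are all hypothetical.

```python
# A few entries assumed from the common table; verify against table.csv.
COMMON_TABLE = {
    0x12: "sha2-256",
    0x55: "raw",
    0x70: "dag-pb",
    0x71: "dag-cbor",
}

# Hypothetical application-specific extension.
APP_TABLE = {
    0x300001: "my-app-record",
}


def lookup(code: int, *tables: dict[int, str]) -> str:
    """Resolve a code against the given tables, in order."""
    for table in tables:
        if code in table:
            return table[code]
    raise KeyError(f"unknown multicodec code 0x{code:x}")


print(lookup(0x71, COMMON_TABLE, APP_TABLE))      # dag-cbor
print(lookup(0x300001, COMMON_TABLE, APP_TABLE))  # my-app-record
```

Both parties must consult the same tables in the same order, otherwise a code can resolve to different formats on each side.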

Multicodec table

The full table can be found at table.csv inside this repo. There's also a sortable viewer.

Adding new multicodecs to the table

The process to add a new multicodec to the table is the following:

    1. Fork this repo
    2. Update the table with the value you want to add
    3. Submit a Pull Request

This "first come, first assign" policy is a way to assign codes as they are most needed, without increasing the size of the table (and therefore the size of the multicodecs) too rapidly.

Implementations

Multicodec Path, also known as multistream

Multicodec defines a table for the most common data serialization formats that can be expanded over time or on a per-application basis. However, for two programs to talk to each other, they need to know beforehand which table or table extension is being used.

To enable self-descriptive data formats or streams that can be described dynamically, without the formal step of adding a binary packed code to a table, we have multistream. With it, applications can adopt multiple data formats for their streams and build different protocols on top of them.

FAQ

Q. Why?

Because multistream is too long for identifiers. We needed something shorter.

Q. Why varints?

So that there is no fixed upper limit on the number of protocols that can be assigned a code.

Q. What kind of varints?

A most-significant-bit (MSB) unsigned varint, as defined by multiformats/unsigned-varint.
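To make the byte layout concrete, here is a small sketch that prints a few values in binary. It assumes the usual reading of the unsigned-varint format: each byte carries 7 bits of the value, least significant group first, and the most significant bit of a byte is set when more bytes follow. The helper names `encode_varint` and `bits` are hypothetical.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("varints are unsigned")
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)  # MSB set: another byte follows
        n >>= 7
    out.append(n)  # MSB clear: this is the final byte
    return bytes(out)


def bits(data: bytes) -> str:
    """Render bytes as space-separated 8-bit binary strings."""
    return " ".join(f"{b:08b}" for b in data)


for value in (1, 127, 128, 300):
    print(f"{value:>5} -> {bits(encode_varint(value))}")

#     1 -> 00000001
#   127 -> 01111111
#   128 -> 10000000 00000001
#   300 -> 10101100 00000010
```

Values up to 127 fit in one byte, which is why the most common codes in the table are kept small.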

Q. Don't we have to agree on a table of protocols?

Yes, but we already have to agree on what protocols themselves are, so this is not so hard. The table even leaves some room for custom protocol paths, or you can use your own tables. The standard table is only for common things.

Q. Where did multibase go?

For a period of time, the multibase prefixes lived in this table. However, multibase prefixes are symbols that may map to multiple underlying byte representations (that may overlap with byte sequences used for other multicodecs). Including them in a table for binary/byte identifiers led to more confusion than it solved.

You can still find the table in multibase.csv.

Contribute

Contributions welcome. Please check out the issues.

Check out our contributing document for more information on how we work, and about contributing in general. Please be aware that all interactions related to multiformats are subject to the IPFS Code of Conduct.

Small note: If editing the README, please conform to the standard-readme specification.

License

This repository is only for documents. All of these are licensed under the CC-BY-SA 3.0 license © 2016 Protocol Labs Inc. Any code is under an MIT license © 2016 Protocol Labs Inc.
