pouya-eghbali / sia

Sia - Binary serialisation and deserialisation

License: Apache License 2.0

JavaScript 100.00%
serialization encoding binary compiler sia protocol-specification deserialisation

sia's Introduction

Sia

Sia - Binary serialisation and deserialisation with built-in compression. You can consider Sia a strongly typed, statically typed, domain-specific binary language for constructing data. Sia preserves data types and supports custom ones.

Please note that the Sia specification and implementation aren't final yet. As a core part of the Clio programming language, Sia evolves with Clio. It is designed to make fast RPC calls possible.

Why

I needed a fast, schema-less serialization library that preserves type information and can encode and decode custom types. I couldn't find one. At first I wanted to go with a JSON-with-types solution, but it didn't work out, so I created my own.

Performance

This repository contains a pure JS implementation of Sia. On our test data, Sia is 66% to 1250% faster than JSON, and the serialized data (including type information for all entries) is 10% to 30% smaller than JSON. Sia is also faster and smaller than MessagePack and CBOR/CBOR-X. It is possible to use lz4 to compress Sia-generated data even further and still be faster than JSON, MessagePack and CBOR-X.

Tests are run on a 2.4 GHz 8-Core Intel Core i9-9980HK CPU (5 GHz while running the benchmarks) with 64 GB 2667 MHz DDR4 RAM, Node version 16.4.2, Mac OS 11.5.1, with 100 loops for each serialization library. To run the benchmark suite, run npm run benchmark; to run the tests, run npm run test.

Specification

Read specs.md.

Install

To install the Node library and save it as a dependency, do:

npm i -S sializer

Documentation

WIP

The Node Sia library exports 5 items:

const { sia, desia, Sia, DeSia, constructors } = require("sializer");
  • sia(data) function serializes the given data using the default parameters.
  • desia(buf) function deserializes the given buffer using the default parameters.
  • Sia(options) class makes an instance of Sia serializer using the given options.
  • DeSia(options) class makes an instance of Sia deserializer using the given options.
  • constructors is an array of default constructors used both by Sia and DeSia.

The Sia and DeSia objects are the core components of the Sia library.

Basic usage

const { sia, desia } = require("sializer");

const buf = sia(data);
const result = desia(buf);

Sia class

const { Sia, constructors: builtinConstructors } = require("sializer");

const sia = new Sia({
  size: 33554432, // Buffer size to use (this is the default)
  constructors: builtinConstructors, // An array of extra classes and types
});

const buf = sia.serialize(data);

Here, size is the maximum size of the buffer to use. Use a large size if you expect to serialize huge objects. The constructors option is an array of extra types and classes; each entry includes instructions for serializing that custom type or class.

DeSia class

const { DeSia, constructors: builtinConstructors } = require("sializer");

const desia = new DeSia({
  mapSize: 256 * 1000, // String map size (this is the default)
  constructors: builtinConstructors, // An array of extra classes and types
});

const data = desia.deserialize(buf);

Here, mapSize is the minimum size of the string map array to use. Use a large size if you expect to deserialize huge objects. The constructors option is an array of extra types and classes; each entry includes instructions for deserializing that custom type or class.

sia function

const buf = sia(data);

The sia function is the Sia.serialize method on an instance initialized with the default options.

desia function

const data = desia(buf);

The desia function is the DeSia.deserialize method on an instance initialized with the default options.

constructors

The constructors option is an array of extra types and classes that Sia should support. Here's an example of how to use it:

const { Sia, DeSia } = require("sializer");
const { constructors: builtins } = require("sializer");

const constructors = [
  ...builtins,
  {
    constructor: RegExp, // The custom class you want to support
    code: 7, // A unique positive code point for this class, the smaller the better
    args: (item) => [item.source, item.flags], // A function to serialize the instances of the class
    build(source, flags) { // A function for restoring instances of the class
      return new RegExp(source, flags);
    },
  },
];

const sia = new Sia({ constructors });
const desia = new DeSia({ constructors });

const regex = /[0-9]+/;
const buf = sia.serialize(regex); // serialize the data
const result = desia.deserialize(buf); // deserialize
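
To see what an entry's args and build functions are expected to do together, the sketch below applies them back to back without going through Sia's binary format at all. The roundTripThroughEntry helper is hypothetical and not part of sializer; it only illustrates the contract that build should accept whatever args returns.

```javascript
// Hypothetical helper, not part of sializer: runs `args` and `build`
// back to back, skipping the binary encoding step entirely.
function roundTripThroughEntry(entry, item) {
  const args = entry.args(item); // plain values that would be serialized
  return entry.build(...args); // instance rebuilt from those values
}

// Same shape as the RegExp entry in the example above.
const regExpEntry = {
  constructor: RegExp,
  code: 7,
  args: (item) => [item.source, item.flags],
  build(source, flags) {
    return new RegExp(source, flags);
  },
};

const original = /[0-9]+/gi;
const restored = roundTripThroughEntry(regExpEntry, original);

console.log(restored !== original); // true, a new instance was built
console.log(restored.source === original.source); // true
console.log(restored.flags === original.flags); // true
```

Checking an entry this way before registering it can catch mismatches between args and build, such as a forgotten argument, independently of the serializer.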

sia's People

Contributors

amogower, clhuang, pouya-eghbali, rafaelsc, zircon63

sia's Issues

Convert to ES6

We're using CommonJS for some reason. I would like to see the library converted to either TypeScript or ES6 modules.

Possible constructor search optimization

Since JS Maps and Sets support objects for keys, I'm wondering if you've done any benchmarking to see if using a Map instead of a linear array search using an iterator would optimize this code:

  itemToSia(item, constructor) {
    for (const entry of this.constructors) {
      if (entry.constructor === constructor) {
        return {
          code: entry.code,
          args: entry.args(item),
        };
      }
    }
    throw `Serialization of item ${item} is not supported`;
  }

What I'm thinking is populating the map up front using the constructor property as a key and having the entry be the value, then changing the above code to something like this:

  itemToSia(item, constructor) {
    const entry = this.constructors.get(constructor);
    if (entry) {
      return {
        code: entry.code,
        args: entry.args(item),
      };
    }
    throw `Serialization of item ${item} is not supported`;
  }

I ask because our project is up to around 20 custom constructors. There is a chance this won't be an optimization with modern JavaScript engines. Technically, though, the for...of loop creates an iterator object on every call, and the iterator returns a short-lived object for every element it visits (although, again, modern engines may optimize this away). It's also possible that for small collections, iterating is indeed faster than a Map lookup. I would be curious to see what benchmarking shows.
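
The proposal above leaves out the step that populates the map. A minimal sketch of that step, together with the lookup, could look as follows. The names buildConstructorMap and byConstructor are made up for illustration, and the functions are written standalone rather than as methods of the Sia class.

```javascript
// Build the lookup table once, for example when the serializer is created.
function buildConstructorMap(entries) {
  const byConstructor = new Map();
  for (const entry of entries) {
    byConstructor.set(entry.constructor, entry);
  }
  return byConstructor;
}

// Map-based variant of itemToSia, written as a standalone function.
function itemToSia(byConstructor, item) {
  const entry = byConstructor.get(item.constructor);
  if (!entry) {
    throw new Error(`Serialization of item ${item} is not supported`);
  }
  return { code: entry.code, args: entry.args(item) };
}

const lookup = buildConstructorMap([
  { constructor: RegExp, code: 7, args: (item) => [item.source, item.flags] },
  // The Date entry and its code are illustrative assumptions.
  { constructor: Date, code: 8, args: (item) => [item.getTime()] },
]);

const encoded = itemToSia(lookup, /[0-9]+/g);
console.log(encoded); // { code: 7, args: [ '[0-9]+', 'g' ] }
```

One behavioural difference to keep in mind: if two entries share the same constructor, the linear search returns the first one, while Map.set keeps the last one.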

Dynamic buffer size

I've checked the code, and it seems that serializing an object bigger than the buffer size throws an error.

Is it possible to add a check that temporarily increases the buffer for such a big serialization? I want to use the library in an environment where users may try to serialize a REALLY big buffer (depending on their RAM). I don't want to allocate a lot of memory up front for those 0.01% of cases, but I also don't want to get tickets for them. Maybe there could be an option to grow the buffer in steps of X when the maximum is reached.
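
The "grow in steps of X" idea could be sketched as below. This is only an illustration built on plain Node.js Buffer calls; the GrowableWriter class and its growBy parameter are hypothetical and do not reflect how sializer manages its buffer.

```javascript
// Hypothetical growable writer, not part of sializer.
class GrowableWriter {
  constructor(initialSize = 1024, growBy = 1024) {
    this.buffer = Buffer.alloc(initialSize);
    this.offset = 0;
    this.growBy = growBy;
  }

  // Make sure `extra` more bytes fit, growing in steps of `growBy`.
  ensure(extra) {
    const needed = this.offset + extra;
    if (needed <= this.buffer.length) return;
    const steps = Math.ceil((needed - this.buffer.length) / this.growBy);
    const bigger = Buffer.alloc(this.buffer.length + steps * this.growBy);
    this.buffer.copy(bigger, 0, 0, this.offset); // keep the bytes written so far
    this.buffer = bigger;
  }

  writeUInt8(value) {
    this.ensure(1);
    this.offset = this.buffer.writeUInt8(value, this.offset);
  }

  // Bytes written so far, as a view on the underlying buffer (no copy).
  result() {
    return this.buffer.subarray(0, this.offset);
  }
}

const writer = new GrowableWriter(4, 4);
for (let i = 0; i < 10; i++) writer.writeUInt8(i);
console.log(writer.buffer.length, writer.result().length); // 12 10
```

Growing by a fixed step keeps memory use close to what is needed, at the cost of more copies for very large payloads; doubling the size instead would trade memory for fewer copies.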

RangeError when serializing large Map

I tried to use this to serialize a large Map and I ran into this error:

RangeError [ERR_OUT_OF_RANGE]: The value of "offset" is out of range. It must be >= 0 and <= 33554431. Received 33554463
    at new NodeError (node:internal/errors:371:5)
    at boundsError (node:internal/buffer:86:9)
    at writeU_Int8 (node:internal/buffer:741:5)
    at Buffer.writeUInt8 (node:internal/buffer:748:10)
    at Sia.addString (/path/to/repo/node_modules/sializer/index.js:65:19)
    at Sia.serializeItem (/path/to/repo/node_modules/sializer/index.js:208:14)
    at Sia.serializeItem (/path/to/repo/node_modules/sializer/index.js:240:20)
    at Sia.serializeItem (/path/to/repo/node_modules/sializer/index.js:249:20)
    at Sia.serializeItem (/path/to/repo/node_modules/sializer/index.js:240:20)
    at Sia.serializeItem (/path/to/repo/node_modules/sializer/index.js:267:20)
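
The upper bound in the message, 33554431, is one less than the default size of 33554432 shown in the Sia class section, which suggests the serializer tried to write past the end of its fixed-size buffer. The same class of error can be reproduced with a plain Node.js Buffer, independent of Sia:

```javascript
// Plain Node.js Buffer, not Sia: write one byte past the end of a fixed buffer.
const fixed = Buffer.alloc(4); // valid offsets are 0..3

let caught = null;
try {
  fixed.writeUInt8(1, 4); // offset 4 is out of range
} catch (error) {
  caught = error;
}

console.log(caught instanceof RangeError, caught.code); // true ERR_OUT_OF_RANGE
```

Assuming the default size is the limit being hit here, constructing the serializer with a larger size option, as described in the Sia class section, would be the first thing to try.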
