cpak

cpak is a compressed data format for numeric data, embeddable in code and text documents. It's especially suitable for geometry and geographic data.

This package contains the spec (soon) and some supporting functions. To read and write geographic data, use a companion package such as cgeo-cpak.

The specification consists of multiple levels:

  • Level 0: Suitable for any numeric data. This package includes the JavaScript code needed for it, and nothing more.
  • Level 1: Defines an encoding for SQL/MM Spatial data types. Additional convention or configuration is required to represent WKB and WKT data.
  • Level 2 (future): Full replacement for WKB and WKT.

Level 0

Unsigned integers are encoded using the 92 printable non-whitespace ASCII characters that survive unchanged through JSON.stringify. These form two groups of 64 and 28 characters, used as "digits" in base 64 and base 28 respectively.

Signed integers are zigzag encoded first, in the same way as protocol buffers:

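// Encode: fold the sign into the lowest bit (32-bit signed input).
unsigned = (signed << 1) ^ (signed >> 31);
// Decode: restore the sign from the lowest bit.
signed = (unsigned >>> 1) ^ -(unsigned & 1);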
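
As a quick check of the two formulas, here is a minimal JavaScript sketch. The function names zigzagEncode and zigzagDecode are illustrative and not taken from this package. It assumes inputs fit in a signed 32-bit integer, and adds `>>> 0` so that JavaScript reports the encoded value as unsigned.

```javascript
// Hypothetical helper names; the arithmetic is the pair of formulas above.
function zigzagEncode(signed) {
  // `>>> 0` reinterprets the 32-bit result as an unsigned number.
  return ((signed << 1) ^ (signed >> 31)) >>> 0;
}

function zigzagDecode(unsigned) {
  return (unsigned >>> 1) ^ -(unsigned & 1);
}

// Values close to zero, positive or negative, get small unsigned codes.
const samples = [0, -1, 1, -2, 2];
const encoded = samples.map(zigzagEncode); // [0, 1, 2, 3, 4]
const decoded = encoded.map(zigzagDecode); // [0, -1, 1, -2, 2]
```

Small magnitudes of either sign map to small unsigned values, which keeps their encoded form short.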

Digits are stored most significant first. The last digit of each number always comes from a different group than the digits before it, which marks where one number ends and the next begins.

Numbers expected to be smaller than 64 use the 64-character group for their last "digit", so they usually fit in one byte. Larger numbers use the 28-character group for their last "digit" and the 64-character group for the other digits, so they usually take the same amount of space as Base64 would.
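
The following JavaScript sketch illustrates the grouping rule at the digit level (values 0-91, before mapping to characters). It rests on assumptions that the text above does not confirm: the last digit holds the value modulo its base, the leading digits hold the quotient in the other base, leading zero digits are omitted, and reader and writer agree in advance on whether a field is "expected small". The names groupLayout, encodeDigits and decodeDigits are hypothetical.

```javascript
// Sketch only: the way the value is split across digits is an assumption.
function groupLayout(expectSmall) {
  return expectSmall
    ? { lastBase: 64, lastOffset: 0, leadBase: 28, leadOffset: 64 }
    : { lastBase: 28, lastOffset: 64, leadBase: 64, leadOffset: 0 };
}

// Returns digit values (0-91), most significant first.
function encodeDigits(value, expectSmall) {
  const { lastBase, lastOffset, leadBase, leadOffset } = groupLayout(expectSmall);
  const digits = [lastOffset + (value % lastBase)];
  let rest = Math.floor(value / lastBase);
  while (rest > 0) {
    digits.unshift(leadOffset + (rest % leadBase));
    rest = Math.floor(rest / leadBase);
  }
  return digits;
}

// Reads one number starting at `start`; `next` is where the following number begins.
function decodeDigits(digits, start, expectSmall) {
  const { lastBase, lastOffset, leadBase, leadOffset } = groupLayout(expectSmall);
  const lastIsLowGroup = lastOffset === 0;
  let pos = start;
  let lead = 0;
  // Leading digits come from the group that the last digit does not use.
  while (pos < digits.length && (digits[pos] < 64) !== lastIsLowGroup) {
    lead = lead * leadBase + (digits[pos] - leadOffset);
    pos++;
  }
  if (pos >= digits.length) throw new RangeError("truncated input");
  return { value: lead * lastBase + (digits[pos] - lastOffset), next: pos + 1 };
}

const stream = [...encodeDigits(5, true), ...encodeDigits(100, true)]; // [5, 65, 36]
const first = decodeDigits(stream, 0, true);            // { value: 5, next: 1 }
const second = decodeDigits(stream, first.next, true);  // { value: 100, next: 3 }
const large = encodeDigits(1000, false);                // [35, 84]
```

Because the final digit switches groups, the decoder needs no separator or length prefix between consecutive numbers.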

The two printable characters left unused are " and \ (ASCII 34 and 92), because JSON.stringify escapes them. The following formulas convert between digits and ASCII characters without lookup tables:

ascii = digit + (((digit + 1934) * 9) >> 9);
digit = ascii - (((ascii + 1900) * 9) >> 9);

The same code works in at least JavaScript, C and C++. All of those languages also accept cpak-encoded data as-is inside double-quoted string literals.

digit is a number from 0 to 91, where 0-63 form one group and 64-91 form the other. ascii is a character code from the following string, indexed by digit:

!#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~
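
The two conversion formulas can be checked against this string directly. The JavaScript sketch below wraps them in functions (digitToAscii and asciiToDigit are illustrative names, not this package's API) and verifies all 92 digits, plus the claim that the alphabet passes through JSON.stringify unescaped.

```javascript
const ALPHABET =
  "!#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~";

// Hypothetical wrappers around the formulas given above.
function digitToAscii(digit) {
  return digit + (((digit + 1934) * 9) >> 9);
}

function asciiToDigit(ascii) {
  return ascii - (((ascii + 1900) * 9) >> 9);
}

// Compare every digit with the table and confirm the round trip.
let allMatch = ALPHABET.length === 92;
for (let digit = 0; digit < 92; digit++) {
  const ascii = digitToAscii(digit);
  if (ascii !== ALPHABET.charCodeAt(digit) || asciiToDigit(ascii) !== digit) {
    allMatch = false;
  }
}

// No character needs escaping, so JSON.stringify only adds the surrounding quotes.
const survivesJson = JSON.stringify(ALPHABET) === '"' + ALPHABET + '"';
```

Both allMatch and survivesJson come out true. The shifted term evaluates to 33, 34 or 35 depending on the range, which is how the formulas step over the two excluded characters.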

Level 1

Encoding is similar to WKB (Well-Known Binary) with endianness flags omitted and all integers stored in a variable-length format.

Geometry type IDs are the WKB type IDs multiplied by 2. This frees the least significant bit to indicate the spec level:

  • 0 means level 1.
  • 1 means level 2 (or higher).
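
As an illustration of the type ID rule, the JavaScript sketch below doubles the standard WKB codes for a few 2D geometry types and splits an ID back into its parts. The helper names are hypothetical, and how level 2 will use the flag beyond signalling its presence is not defined here.

```javascript
// Standard WKB geometry type codes for 2D geometries.
const WKB_POINT = 1;
const WKB_LINESTRING = 2;
const WKB_POLYGON = 3;

// Hypothetical helper: level 1 IDs keep the lowest bit at 0.
function toCpakTypeId(wkbType) {
  return wkbType * 2;
}

// Hypothetical helper: split an ID into the WKB type and the level flag.
function parseCpakTypeId(id) {
  return { wkbType: id >>> 1, level2OrHigher: (id & 1) === 1 };
}

const polygonId = toCpakTypeId(WKB_POLYGON);   // 6
const parsed = parseCpakTypeId(polygonId);     // { wkbType: 3, level2OrHigher: false }
const flagged = parseCpakTypeId(polygonId + 1); // { wkbType: 3, level2OrHigher: true }
```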

Floating-point values must be rounded to integers before encoding. Multiplying them by a fixed factor first preserves fractional digits up to the desired precision. Powers of 2 and powers of 10 each have advantages as the multiplication factor:

  • A power of 2 allows easy further lossless conversion to WKB or shapefiles and back.
  • A power of 10 allows easy further lossless conversion to WKT or GeoJSON and back.

The same multiplication factor must be used when reading and writing, and level 1 leaves it to the application to guarantee this. Embedding cpak-encoded geometry into JSON is the recommended way to allow storing additional metadata.
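
A minimal JavaScript sketch of the scaling step, under stated assumptions: quantize and dequantize are hypothetical helper names, and the factors (10^6 and 2^20) are arbitrary examples rather than values prescribed by the spec.

```javascript
// Hypothetical helpers: scale and round when writing, divide by the same factor when reading.
function quantize(value, factor) {
  return Math.round(value * factor);
}

function dequantize(stored, factor) {
  return stored / factor;
}

const decimalFactor = 1e6;     // keeps six decimal digits
const binaryFactor = 2 ** 20;  // keeps 20 fractional bits

const storedDecimal = quantize(24.9384, decimalFactor);       // 24938400
const restoredDecimal = dequantize(storedDecimal, decimalFactor); // 24.9384

const storedBinary = quantize(0.625, binaryFactor);           // 655360
const restoredBinary = dequantize(storedBinary, binaryFactor); // 0.625
```

The resulting integers may be negative, so they would be zigzag encoded before being written as level 0 digits. Reading with a different factor than the one used for writing silently yields wrong coordinates, which is why the factor belongs in metadata that the application controls.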

License

Dual-licensed under:

The MIT License

CC0

Copyright (c) 2017 BusFaster Ltd

Contributors

jjrv
