well-typed / cborg

Binary serialisation in the CBOR format

Home Page: https://hackage.haskell.org/package/cborg


cborg's Introduction

Fast binary serialisation and CBOR implementation for Haskell


This repo contains two libraries (plus associated tools):

The serialise library is for serialising Haskell values and deserialising them later.

The cborg library provides a fast, standards-compliant implementation of the 'Concise Binary Object Representation' (specified in RFC 7049) for Haskell.

The serialise library uses the CBOR format, via the cborg library, which gives it the following benefits:

  • fast serialisation and deserialisation
  • compact binary format
  • stable format across platforms (32/64bit, big/little endian)
  • support for backwards compatible deserialisation with migrations
  • the ability to inspect binary values with generic tools, e.g. for debugging or recovery, including generic conversion into JSON text
  • potential to read the serialised format from other languages
  • incremental or streaming (de)serialisation
  • internal message framing (for use in network applications)
  • suitable to use with untrusted input (resistance to asymmetric resource consumption attacks)

Installation

They are just a cabal install away on Hackage:

$ cabal install cborg serialise
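
For a quick taste, here's a minimal round-trip using the serialise library's generic deriving (a sketch of typical usage; the Animal type is just an illustration):

{-# LANGUAGE DeriveGeneric, DeriveAnyClass #-}
import Codec.Serialise
import GHC.Generics (Generic)

data Animal = HoppingAnimal { name :: String, hops :: Int }
  deriving (Show, Generic, Serialise)

main :: IO ()
main = do
  let bytes = serialise (HoppingAnimal "Fred" 4)  -- a lazy ByteString
  print (deserialise bytes :: Animal)             -- prints the original value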

There are also a few related packages that you may be interested in:

  • cborg-json implements the bijection between JSON and CBOR specified in the RFC.
  • cbor-tool is a handy command-line utility for working with CBOR data.

Join in

Be sure to read the contributing guidelines. File bugs in the GitHub issue tracker.

Master git repository:

  • git clone https://github.com/well-typed/cborg.git

The tests for the cborg package are currently included in the serialise package.

$ cabal test serialise

Authors

See AUTHORS.txt.

License

BSD3. See LICENSE.txt for the exact terms of copyright and redistribution.


cborg's Issues

UTCTime tag 1 is not handled

According to the RFC, encoded UTC time values can use tag 0 to represent a serialized string in ISO-8601 format, or tag 1 for a numeric representation: the number of seconds elapsed since the UNIX epoch.

As of today, we don't handle tag 1. That should be fixed before release (and is probably fairly easy).

See also #51 - it's important to make sure this case is efficient as well (and that's probably easier than the tag 0 case).
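
For reference, a minimal decoder sketch for the tag 1 case, written against the current Codec.CBOR module layout (accepting a float payload here is an assumption; tag 1 also permits fractional epoch offsets):

import Codec.CBOR.Decoding
import Data.Time (UTCTime)
import Data.Time.Clock.POSIX (posixSecondsToUTCTime)

-- Sketch only: decode a tag-1 (epoch offset) date/time, accepting an
-- unsigned int, negative int or double payload.
decodeUTCTimeTag1 :: Decoder s UTCTime
decodeUTCTimeTag1 = do
  tag <- decodeTag
  case tag of
    1 -> do
      tt   <- peekTokenType
      secs <- case tt of
        TypeUInt    -> fromIntegral <$> decodeWord
        TypeNInt    -> fromIntegral <$> decodeInt
        TypeFloat64 -> decodeDouble
        _           -> fail "tag 1: expected a numeric epoch offset"
      return (posixSecondsToUTCTime (realToFrac (secs :: Double)))
    _ -> fail "expected tag 1 (epoch-based date/time)"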

Push code coverage close to 100%

As of 40909fc, on Windows 10/GHC 7.8.4 64bit:

$ cabal configure --enable-tests --enable-coverage
$ cabal test
$ hpc report ...
 74% expressions used (4702/6332)
 31% boolean coverage (30/95)
      27% guards (23/83), 30 always True, 1 always False, 29 unevaluated
      58% 'if' conditions (7/12), 2 always True, 1 always False, 2 unevaluated
     100% qualifiers (0/0)
 65% alternatives used (606/931)
 79% local declarations used (49/62)
 76% top-level declarations used (234/305)


Make `demo-dump-cbor` more generally useful

We've been using CBOR in our projects for some time with good success, but one thing @Oblosys noted is that demo-dump-cbor could be much, much more generally useful for all kinds of stuff.

This is a meta-ticket to keep track of various improvements (notably the ones originally filed by @Oblosys):

  • Allow demo-dump-cbor to parse files containing a sequence of cbor values (#78)
  • Add more output options for demo-dump-cbor (#77)
  • Don't require demo-dump-cbor input file to have a .cbor extension (#76)
  • Rename executables demo-dump-cbor and demo-aeson (#75)

and more. Please submit useful suggestions here and we can split them off or discuss them here for now, but I did want a place to keep track of the overall improvements.

Error with 'C pre-processor' phase

I'm able to build the library fine using both the cabal and stack tools with standalone build commands, but when trying to load the library into a GHCi session I see the following problem in the C pre-processor phase. Not sure if this is a binary-serialise-cbor problem or a problem upstream.

/home/sdiehl/Git/alpha-sheets/backend/server/bench/serialise/.stack-work/downloaded/5f34802a2e7a77d7e5b4b1c421cd4b93d91919f32c47b88fa95e3355911096b5.git/.stack-work/dist/x86_64-linux/Cabal-1.22.5.0/build/autogen/cabal_macros.h:178:0:
     warning: "CURRENT_PACKAGE_KEY" redefined [enabled by default]
     #define CURRENT_PACKAGE_KEY "binar_0SlfD4kPaIaKT7PcGaBUM0"
     ^

In file included from <command-line>:10:0: 

/home/sdiehl/Git/alpha-sheets/backend/server/bench/serialise/.stack-work/dist/x86_64-linux/Cabal-1.22.5.0/build/autogen/cabal_macros.h:157:0:
     note: this is the location of the previous definition
     #define CURRENT_PACKAGE_KEY "bench_2neKwoSHCjAHYmwpaCnoJF"
     ^


/home/sdiehl/Git/alpha-sheets/backend/server/bench/serialise/.stack-work/downloaded/5f34802a2e7a77d7e5b4b1c421cd4b93d91919f32c47b88fa95e3355911096b5.git/Data/Binary/Serialise/CBOR/Class.hs:47:0:
     error: missing binary operator before token "("
     #if MIN_VERSION_time(1,5,0)
     ^

/home/sdiehl/Git/alpha-sheets/backend/server/bench/serialise/.stack-work/downloaded/5f34802a2e7a77d7e5b4b1c421cd4b93d91919f32c47b88fa95e3355911096b5.git/Data/Binary/Serialise/CBOR/Class.hs:260:0:
     error: missing binary operator before token "("
     #if MIN_VERSION_time(1,5,0)
     ^
phase `C pre-processor' failed (exitcode = 1)

<no location info>:
    Could not find module ‘Data.Binary.Serialise.CBOR’
    It is a member of the hidden package ‘binary-serialise-cbor-0.1.1.0@binar_0SlfD4kPaIaKT7PcGaBUM0’.

Skip tag for single-constructor data types?

The generic serialisation and deserialisation code has a special case for single-constructor-single-field data types (the instance for GSerialiseEncode (K1 i a)), but does not introduce a special case for single-constructor-multiple-fields (the instance for GSerialiseEncode (f :*: g)). This means if you have something like

data Foo = Foo {
    some   :: ..
  , record :: ..
  , typ    :: ..
  , with   :: ..
  , lots   :: ..
  , of_    :: ..
  , fields :: ..
  }

and we serialize a bunch of these, every one will have an unnecessary extra tag field.
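
For comparison, a hand-written instance can skip the constructor tag entirely. A sketch with a hypothetical two-field record, using the current module names:

import Codec.Serialise (Serialise(..))
import Codec.CBOR.Encoding (encodeListLen)
import Codec.CBOR.Decoding (decodeListLenOf)

data Pair = Pair Int Bool

-- No constructor tag: just a 2-element list of the fields.
instance Serialise Pair where
  encode (Pair x y) = encodeListLen 2 <> encode x <> encode y
  decode = do
    decodeListLenOf 2
    Pair <$> decode <*> decode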

Exorcise `binary` dependency

The only reason the library internally depends on binary is for the Decoder type. Given that we eventually want to replace binary altogether, I'm guessing we should probably get rid of this and use our own inline version. This will force users to have a separate (or rewritten) code path, but I imagine that's an overall small cost given the large API change. We might as well force people into it: if they're going to use the new interface, they need a new Decoder.

/cc @dcoutts

Consider `streaming-bytestring`?

This is just a suggestion, and not at all an area I'm an authority on. But I wonder what a deserialization library would look like if implemented using streaming-bytestring, which is described in its README as "lazy ByteString done right".

Basically, it's implemented as a monad transformer, and thus readFile can actually perform (strict) IO, rather than being hidden behind unsafeInterleaveIO.

Simple example fails

The code below fails with the message Data.Binary.Serialise.CBOR.deserialise: failed at offset 0 : expected null

deserialise $ serialise (TInt 1)
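
A likely cause (an assumption, not confirmed in this thread): in GHCi the result type of deserialise is ambiguous, and ExtendedDefaultRules default it to (), which this library encodes as CBOR null, hence "expected null". Annotating the result type makes the round-trip work:

deserialise (serialise (TInt 1)) :: Term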

Consider writing some CBOR extensions

There's a few CBOR spec extensions we might want to consider writing.

The purpose would be to let generic CBOR tools decode some things we might want to use.

  • parallel arrays. Values that are logically arrays of records represented as a record of arrays.
  • compact single-type primitive arrays. Arrays of e.g. Int/Word16/32/64, Float, Double etc., represented not as normal individually-tagged CBOR values, but with one single tag up front and then all the values packed together (as a CBOR byte array). Could also include in the array-type tag whether it's big or little endian. We'd lose the variable-width integers for these but we could do ridiculously fast memcpy (or memcpy + bswap) encoding and decoding. Useful for scientific data. (See the sketch after this list.)
  • bit arrays. This is much like the above, but the encoding has to be slightly different since it has to include a count of the bits.
  • multi-dimensional arrays. This is just the array bounds, lower and upper in each dimension, followed by the array data. Could use two tags to cover 0-based and explicit lower bound based.
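
As a rough illustration of the compact-array idea, one could emit a single tag up front followed by the packed payload as a byte string (the tag number below is hypothetical, not an assigned CBOR tag):

import Codec.CBOR.Encoding
import qualified Data.ByteString as BS

-- Hypothetical tag for "packed little-endian Word64 array"; illustration only.
packedW64Tag :: Word
packedW64Tag = 9999

-- The payload is assumed to already hold the elements packed together.
encodePackedW64 :: BS.ByteString -> Encoding
encodePackedW64 payload = encodeTag packedW64Tag <> encodeBytes payload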

Bikeshed the module namespace and some APIs

After a chat with @dcoutts earlier, there could probably be a bit of restructuring done in the package and renaming to make things a bit more fluid and consistent (exposing lazy vs strict interfaces, module hierarchy and module naming, etc).

We'll probably talk a bit more about this later; consider this issue a placeholder.

LLVM code generation bug: decode . encode != id

I was running the serial-bench tests with CBOR commit ab0f193, and this turned up:

  test/Spec.hs:22: 
  1) cbor/cbor
       Falsifiable (after 27 tests and 4 shrinks): 
       expected: Just [SomeData 53169 70 55.3817683321392]
        but got: Just [SomeData (-12367) 70 55.3817683321392]
       [ArbSomeData {toSomeData = SomeData 53169 70 55.3817683321392}]
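
(Worth noting: 53169 − 65536 = −12367, so the expected and actual values agree modulo 2^16, which suggests a sign-extension bug in 16-bit word handling under the LLVM backend.)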

Faster vector support

This is mostly a brain dump for an idea in the interpreter that I want to try out.

The goal is to support faster boxed Vector/Array (and similar types) and also unboxed Vector/Array.

For the boxed case we need to be able to do certain key things in the ST monad. This can be purely in the interpreter, it does not have to leak into the decoders. We need to be able to allocate an ST array, and write elements into it and freeze the result. The idea is to add Decoder constructors to instruct the interpreter to do this. The interpreter would probably use a list-like stack structure to hold the arrays. The interpreter would have to move into ST.

For the unboxed case while it would be possible to take a similar approach, it may be better to instead tell the interpreter "please decode 37 floats now", and have the interpreter do that directly. In much the same way as it currently decodes byte strings or text strings (which are unbounded length encodings obviously).

One interesting challenge is efficiently decoding unboxed Vector style parallel arrays, ie when you've got something like Vector (Int, Float) then it's really represented as an array of int and an array of float. It would be faster to encode these things similarly in CBOR as a pair of arrays each containing just the one type. This would then let us use the above fast decoding of arrays of primitive types.

For example one of the micro-benchmarks is to encode/decode 1000 records with the structure (Int64, Word8, Double). If we're not trying to stream this then the best rep would of course be the parallel array style, and we could likely beat a fast implementation of the regular sequence-of-records style.
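
For contrast, here is the straightforward boxed-vector decoder one can write today without any interpreter support (a sketch; it goes through a monadic element-by-element loop rather than in-place ST writes):

import qualified Data.Vector as V
import Codec.CBOR.Decoding (Decoder, decodeListLen)
import Codec.Serialise (Serialise(..))

-- Decode a known-length CBOR list into a boxed Vector.
decodeVector :: Serialise a => Decoder s (V.Vector a)
decodeVector = do
  n <- decodeListLen
  V.replicateM n decode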

Bug in implementation of `decode` for Either

At https://github.com/well-typed/binary-serialise-cbor/blob/master/Data/Binary/Serialise/CBOR/Class.hs#L202 the recent PR to add more strictness (#5) failed to add back the Left and Right constructors when decoding Eithers. I haven't tested it, but I expect that at best this will always cause decoding to fail, and at worst it could cause an infinite loop, though that's probably unlikely (the instance for Either relies on its own instance recursively, and the x's both have type Either a b).
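
A sketch of what the corrected decode half should look like, assuming the encode half writes a 2-element list of a word tag followed by the payload:

{-# LANGUAGE BangPatterns #-}
import Codec.CBOR.Decoding (Decoder, decodeListLenOf, decodeWord)
import Codec.Serialise (Serialise(..))

decodeEither :: (Serialise a, Serialise b) => Decoder s (Either a b)
decodeEither = do
  decodeListLenOf 2
  t <- decodeWord
  case t of
    0 -> do !x <- decode; return (Left x)   -- reinstate Left
    1 -> do !x <- decode; return (Right x)  -- reinstate Right
    _ -> fail "decodeEither: unknown tag"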

Export properties from the main package, for client users

These properties need to be moved out of the test suite, since they're not really dependent on QuickCheck, and moved into the actual package itself.

These three properties are very convenient for any client user, because they can immediately add them as QuickCheck tests to their own test suite, for their own data type.

  • The FlatTerm properties ensure that encoders/decoders are correct, assuming a correct implementation of our library.
  • The actual serialise roundtrip will help catch any kind of weird bugs they may want to report to us.

This should be fixed, and a note added into the tutorial, encouraging users of this library to add these properties to their own test suites.
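
For illustration, these are the kinds of properties users could drop straight into their own QuickCheck suites once exported (a sketch; the property names are hypothetical):

import Codec.Serialise (Serialise(..), serialise, deserialise)
import Codec.CBOR.FlatTerm (toFlatTerm, validFlatTerm)

-- Round-trip through the real binary format.
prop_serialiseId :: (Serialise a, Eq a) => a -> Bool
prop_serialiseId x = deserialise (serialise x) == x

-- The encoder produces a structurally well-formed term.
prop_validFlatTerm :: Serialise a => a -> Bool
prop_validFlatTerm x = validFlatTerm (toFlatTerm (encode x))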

Support building with GHCJS

cc @dcoutts @kosmikus

Hey guys!

I'm trying (perhaps with a bit of foolish, Xmas spirit!) to install IRIS' big-kitchen-sink-restful-server using GHCJS, and as it depends upon binary-serialise-cbor I'm hitting an issue similar to the hashable one which has to do with unboxed constructors:

    Data/Binary/Serialise/CBOR/ByteOrder.hs:202:46:
        Couldn't match expected type ‘Word#’ with actual type ‘Word64#’
        In the first argument of ‘wordToFloat64#’, namely ‘w#’
        In the first argument of ‘D#’, namely ‘(wordToFloat64# w#)’

    Data/Binary/Serialise/CBOR/ByteOrder.hs:209:40:
        Couldn't match expected type ‘Word64#’ with actual type ‘Word#’
        In the third argument of ‘writeWord64Array#’, namely ‘w#’
        In the expression: writeWord64Array# mba# 0# w# s'

I suspect here the problem might lie in the different word size between the native vs the JS world. Do you guys think it's possible to issue a patch with a sane dose of CPP to make the package buildable on ghcjs? 😉 Is it even possible at all?

Thanks a ton!

A.

Write the tutorial

There's a skeleton module in the repository right now. It should be filled out with many examples and wonderful prose.

Write a cool pretty printer for an `Encoding`

One of the nicest things about this library is that the Encoding type is really just a "deep embedding", or as we like to call it: a syntax tree. This allows a variety of 'fun' things.

When you get an Encoding, that's really something like a function Tokens -> Tokens, and you apply a TkEnd to get a Tokens you can traverse over recursively.

It would be neat if we had a way to 'pretty print' this Encoding into a representation of what will actually be a CBOR value. This would be quite useful for visualising how a particular data type would be encoded into CBOR, or merely how it's structurally represented internally.
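
A tiny proof-of-concept along those lines (a sketch: only a few token constructors are handled, and it assumes the internal Encoding/Tokens representation stays exposed):

import Codec.CBOR.Encoding (Encoding(..), Tokens(..))

-- Unroll the deep embedding: an Encoding is a function Tokens -> Tokens,
-- so applying it to TkEnd yields the token sequence.
tokens :: Encoding -> Tokens
tokens (Encoding f) = f TkEnd

pretty :: Encoding -> String
pretty = go . tokens
  where
    go TkEnd           = ""
    go (TkInt    n ts) = "int "    ++ show n ++ "\n" ++ go ts
    go (TkBool   b ts) = "bool "   ++ show b ++ "\n" ++ go ts
    go (TkString s ts) = "string " ++ show s ++ "\n" ++ go ts
    go _               = "<other tokens elided>\n"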

Support `Rational`, `Fixed` and perhaps `Scientific` via CBOR decimal fractions or binary fractions (big floats)

CBOR directly supports decimal and binary fractions. These are numbers represented as x*10^e or x*2^e. The mantissa 'x' can be a positive or negative small int or a CBOR big int. The exponent 'e' can only be a positive or negative small int (ie up to 64bit but not a big int)

And there is also an extension for rationals, ie x/y where both x and y can be small or big ints.

  • We should use the CBOR decimal fraction for our Data.Fixed support. Currently we just encode the Integer mantissa and leave the exponent implicit in the type. It'd be better for debugging and data recovery if the exponent was also stored.
  • We should use the CBOR rational extension for the Haskell Rational type.
  • The Scientific type from the scientific package exactly corresponds to a CBOR decimal fraction, and we should encode it as such. (Of course where that instance should live is a good question).

It's not clear that we have any standard types that are best represented by binary fractions. We don't have a standard big float type.
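
A sketch of the rational case, assuming the registered CBOR tag 30 (rational numbers), whose payload is a two-element array:

import Codec.CBOR.Encoding
import Data.Ratio (numerator, denominator)

encodeRational :: Rational -> Encoding
encodeRational r =
     encodeTag 30                    -- registered tag for rational numbers
  <> encodeListLen 2
  <> encodeInteger (numerator r)     -- small or big int, as needed
  <> encodeInteger (denominator r)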

Serialise1

In the fashion of Show1 and friends.

This might be useful when writing instances by hand.

Somehow related to #15
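
Something like the following shape, by analogy with Show1 (a sketch; the names and exact signatures are up for discussion):

import Codec.CBOR.Encoding (Encoding)
import Codec.CBOR.Decoding (Decoder)

class Serialise1 f where
  liftEncode :: (a -> Encoding) -> f a -> Encoding
  liftDecode :: Decoder s a -> Decoder s (f a)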

Integrate LLVM builds on Travis

As noted on the tin. This will help prevent issues like #67 from cropping up in the future.

Unfortunately I don't think we'll be able to easily do this on Appveyor, meaning we can't cover the 32-bit/LLVM codegen combo. But the number of 32-bit cases is relatively small, so this might be OK.

Convenient API for reading/writing sequence-style files incrementally

We have an incremental API already but it's incremental in this sense:

  • for output, given the whole value to serialise, the output is produced and can be emitted incrementally (in constant space)
  • for input, the input can be supplied to the decoder incrementally in chunks

Note that for output, we still need to supply the whole value to serialise (though it need not be fully evaluated) and for input we only get the whole value back at the end, not bit by bit.

In the general case it's a bit tricky to do much better than this (think big complicated tree-shaped data), but for files that are basically sequences of values, we should be able to get decoded elements one by one, or supply output elements one by one.

This is indeed possible and people are doing this already. The goal here is to provide something in the library that makes this more convenient.

See these two existing examples https://gist.github.com/dcoutts/798812e040a61ad969c27a45549943c0

One issue is putting a proper CBOR list header and footer in the file, so that it's not just a sequence of top level CBOR values (which is technically allowed by the standard but isn't well supported by existing tools). Another question is if there's any way to support variations like file headers as many real-world use cases would need some header info before a sequence, or perhaps multiple sequences.
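
One possible shape for the output side, wrapping the elements in an indefinite-length list so the file is a single well-formed CBOR value (a sketch against the current cborg API; writeCborSeq is a hypothetical name):

import qualified Data.ByteString.Builder as B
import Codec.CBOR.Encoding (encodeListLenIndef, encodeBreak)
import Codec.CBOR.Write (toBuilder)
import Codec.Serialise (Serialise(..))
import System.IO (IOMode(..), withFile)

-- Stream a list of values to a file as one indefinite-length CBOR list.
writeCborSeq :: Serialise a => FilePath -> [a] -> IO ()
writeCborSeq path xs =
  withFile path WriteMode $ \h ->
    B.hPutBuilder h . toBuilder $
      encodeListLenIndef <> foldMap encode xs <> encodeBreak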

Implement a new `Get` interface

The binary package has two major use cases:

  1. Serializing Haskell values to bytes and back, for transmission or storage.
  2. Parsing arbitrary binary data in externally defined formats into Haskell values.

Currently, this library implements 1. We have not implemented 2, nor have we put much thought into a design for a faster implementation of 2 than what binary offers. It might look something like .Read in the CBOR case. Or not.

Tidy up the Haddock documentation

In particular, almost all of the user-facing API should either have Haddocks, or if it's not generally useful, be made private to the module. I've mostly gotten us there, but it'll need a few passes to finish up.

Rename the internal `Decoder` type

The Decoder type in the .Decoding module has an unfortunate naming conflict with the Decoder type from binary, which is mighty confusing. It's definitely confusing to me, considering they have nothing in common.

It wouldn't be a big deal if this type wasn't exposed, but it needs to be exposed for the tests.

It would probably be worth renaming this to avoid any potential confusion or whatever (hopefully with a low amount of bike shedding).

Think of an awesome name for this package

Before we release this package publicly, I think @dcoutts and I both agree the name is kind of a mouthful. It would be nice if this could have a shorter name and module space before we publicly release it and make everyone mad for breaking it.

This should be considered low priority (since, as stated, the eventual plan is for this to become binary itself), but would be nice to think about.

"Instance Bonanza" - add lots of instances

This package needs some love by adding instances to the Serialise class, which is currently somewhat lackluster. Adding a billion instances is what I like to call "Instance Bonanza", as it goes on for a while.

Essentially, anything within the scope of the Haskell Platform is probably fair game.

`Data.Binary.Serialise.CBOR.Term` is confusing

I'm a bit confused by the CBOR.Term module. It seems to be used only for testing that the optimized and reference implementations do the same thing. But then I'd expect it to be inside the tests/ directory (instead of being exposed). If it's supposed to be used for testing by the users as well, then it's still confusing, because there's CBOR.FlatTerm which seems to do that.

Add benchmarks for all core instances

Every instance we have should have a nice little microbenchmark to test things like encoding and decoding speed for individual cases.

Larger macro benchmarks will help catch more real regressions (even if a change might otherwise look good), but these micro-benchmarks are still useful for validating nice local optimizations.

/cc #15

Think about Strict ByteString APIs

Several of the APIs we have return or consume lazy ByteStrings, but many common cases or APIs involve one-shot (de)serialization of values with strict ones. Having to import and use toStrict from bytestring is a really common annoyance for a lot of people (even if I live with it), so it would be nice to avoid it.

We might want to change some of the naming of the exposed APIs a bit in order to accommodate this, too.
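
Concretely, these are the wrappers people keep writing by hand; exposing something like them (under whatever names win the bikeshedding) is the ask here:

import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import Codec.Serialise (Serialise, serialise, deserialise)

-- Strict-ByteString variants of the one-shot API (hypothetical names).
serialise' :: Serialise a => a -> BS.ByteString
serialise' = BL.toStrict . serialise

deserialise' :: Serialise a => BS.ByteString -> a
deserialise' = deserialise . BL.fromStrict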

Vector instance decoder fails when peekAvailable returns 0 (and container is exactly chunkSize)

Test case:

module Main where

import Control.Monad
import qualified Data.Vector.Storable       as S
import qualified Data.ByteString            as BS
import qualified Data.ByteString.Lazy       as BL
import qualified Data.Binary.Serialise.CBOR as CBOR

main :: IO ()
main = do
  -- Split strict bytestring into chunks and return as lazy one
  let evilChunker i bs = let (b1,b2) = BS.splitAt i bs
                         in BL.fromChunks [b1,b2]
  -- Test case
  let ann = [S.replicate 128 (0::Double)]
      bs  = (BS.concat . BL.toChunks . CBOR.serialise) ann
  forM_ [1 .. BS.length bs - 1] $ \i -> do
    print i
    print $ ann == CBOR.deserialise (evilChunker i bs)

Output

GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
[1 of 1] Compiling Main             ( testcase.hs, interpreted )
Ok, modules loaded: Main.
*Main> :main
1
True
2
*** Exception: DeserialiseFailure 3 "expected list len"
*Main> 

I think it happens when the list header gets split between chunks. I ran into this bug when trying to compress serialised data using gzip.

Serialise vs Serialize

As discussed at the Haskell Boston meetup, the code contains both spellings of the word and there needs to be an issue and a pull request to fix that.

Generic default instances appear to be not significantly faster than cereal

I initially had several UTCTimes and ran into #51, observing cbor 10x slower than cereal. After removing them from the test case, the performance of this package and cereal appears about the same (~17 μs for a roundtrip, with cereal being faster at deserializing and this package faster at serializing).

Here's my test case (which is a cleaned up version of one of the ADTs we serialize in our app, with UTCTimes replaced by Text).

{-# LANGUAGE DeriveGeneric, DeriveAnyClass, OverloadedStrings #-}

-- (imports assumed by this excerpt)
import GHC.Generics (Generic)
import Control.DeepSeq (NFData)
import Data.Typeable (Typeable)
import Data.Hashable (Hashable)
import Data.Text (Text)
import qualified Data.Text.Encoding as T
import qualified Data.HashSet as HS
import qualified Data.HashMap.Strict as HM
import qualified Data.Binary.Serialise.CBOR as CBOR
import qualified Data.Serialize as C

instance CBOR.Serialise PPTS
instance CBOR.Serialise AMs
instance CBOR.Serialise AM
instance CBOR.Serialise SM
instance CBOR.Serialise CH
instance CBOR.Serialise RMs
instance CBOR.Serialise RM
instance CBOR.Serialise Im
instance CBOR.Serialise VDs

newtype SM = SM { _sm :: (HS.HashSet Text) }
  deriving (NFData, Eq, Show, Generic)

newtype CH = CH { _ch :: Maybe Text }
  deriving (NFData, Eq, Show, Generic)

newtype RMs = RMs { _rm :: [RM] }
  deriving (NFData, Eq, Show, Generic)

data Im = Im (HM.HashMap Text Text) VDs
    deriving (NFData, Eq, Show, Generic)

newtype VDs = VDs { _vlsL :: HM.HashMap Text (Text,Text) }
    deriving (NFData, Eq, Show, Generic)

data RM = RM Text [Text] [Text] Text
    deriving (NFData, Show, Eq, Generic)

data PPTS = PPTS SM CH Im RMs AMs (Maybe Text)
  deriving (NFData, Eq, Show, Typeable, Generic)

data AM = AM Text Text [Text] Text Int
   deriving (NFData, Show, Eq, Generic)

newtype AMs = AMs [AM] 
  deriving (NFData, Eq, Show, Generic)

-- make this UTCTime for a real test case:
fakeTime :: Text
fakeTime = "asdf-2345234-sasdf UTC"
ppts = 
 PPTS 
  (SM (HS.fromList ["asdf", "2345234 23452345", "asdfasdf", "2345"]))
  (CH $ Just "he dfdfdfdf dfdfddf llp")
  (Im (HM.fromList [("sasd 5555555987","dff f"),("sasd 5555555987","dff f"),("sasd 5555555987","dff f"),("sasd 5555555987","dff f"),("sasd 5555555987","dff f"),("sasd 5555555987","dff f"),("sasd 5555555987","dff f"),("sasd 5555555987","dff f"),("sasd 5555555987","dff f"),("sasd 5555555987","dff f"),("sasd 5555555987","dff f")]) (VDs (HM.fromList [ ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")), ("sasd 55 344455555987",("dfffffffffff f","asdfasdfdddf")) ])))
  (RMs [
     RM (fakeTime) ["hello", "world"] ["hedddddddddddddddddddddddddddlp"] "somenaml"
   , RM (fakeTime) ["asdfasdfhello", "world"] ["hedddddddddddddddddddddddddddlp"] "somenaml"
   , RM (fakeTime) ["hello", "wasdfdforld"] ["hedddaddfdfdddddddddddddddddddddddddlp"] "somenaml"
   , RM (fakeTime) ["he444444444444444asdkhjasdfllo", "world"] ["hedddddddddddddddaddfdfdddddddddddddlp"] "somenaml"
    ])
  (AMs [
     AM (fakeTime) "fffffffffffffffffffffffff" ["s","adfasdfasdfasdf"] "hellpasd akjha" 9874587484845
   , AM (fakeTime) "fffffffffffffffffffffffff" ["s","adfasdfasdfasdf"] "hellpasd akjha" 9874345484845
   , AM (fakeTime) "fffffff     asdf ffffffffffffffffff" [] "hellpasd akjha" 3434345484845
   ])
  Nothing

-- and instance for `cereal`:

instance C.Serialize PPTS
instance C.Serialize AMs
instance C.Serialize AM
instance C.Serialize SM
instance C.Serialize CH
instance C.Serialize RMs
instance C.Serialize RM
instance C.Serialize Im
instance C.Serialize VDs
instance (Eq a, Hashable a, C.Serialize a) => C.Serialize (HS.HashSet a) where
  put = C.put . HS.toList
  get = HS.fromList <$> C.get
instance (Eq a, Hashable a, C.Serialize a, C.Serialize b) => C.Serialize (HM.HashMap a b) where
  put = C.put . HM.toList
  get = HM.fromList <$> C.get

instance C.Serialize Text where
  put = C.put . T.encodeUtf8
  get = T.decodeUtf8 <$> C.get

If I change the HashMap and HashSet to lists in the usual way, performance looks like this (for serialization + deserialization):

cereal: 23.15 μs + 19.54 μs
cbor:   17.17 μs + 19.30 μs

That's as far as I could justify tweaking the type. We're struggling with serialization performance but don't have the time to write and test definitions like these by hand: https://github.com/well-typed/binary-serialise-cbor/blob/master/bench/versus/Macro/CBOR.hs

Let me know if I'm missing something obvious, but otherwise I hope the above is a useful test case. Thanks for your work on this package!
