Comments (5)
Marking gput for :+:
as INLINE[0] with the rest marked as INLINE seems to fix this. I assume this is since this way we end up inlining what amounts to put_con
since it's small early on. In phase zero we then inline the gput instance for product representations and this shortens out with the generic representation.
That is we use
instance ( GSumPut a, GSumPut b
, SumSize a, SumSize b) => GBinaryPut (a :+: b) where
{-# INLINE[0] gput #-}
with all other generic put instances being INLINE. The only question is then how big the impact of this would be on compile time.
Here is the full fast version of the Data.Binary.Generic
code for future reference. I might or might not go through the motions of putting up a patch seeing how there are some very old ones with no activity.
{-# LANGUAGE BangPatterns, CPP, FlexibleInstances, KindSignatures,
ScopedTypeVariables, TypeOperators, TypeSynonymInstances #-}
{-# LANGUAGE Safe #-}
{-# OPTIONS_GHC -fno-warn-orphans #-}
#if __GLASGOW_HASKELL__ >= 800
#define HAS_DATA_KIND
#endif
-----------------------------------------------------------------------------
-- |
-- Module : Data.Binary.Generic
-- Copyright : Bryan O'Sullivan
-- License : BSD3-style (see LICENSE)
--
-- Maintainer : Bryan O'Sullivan <[email protected]>
-- Stability : unstable
-- Portability : Only works with GHC 7.2 and newer
--
-- Instances for supporting GHC generics.
--
-----------------------------------------------------------------------------
module Data.Binary.Generic
(
) where
import Control.Applicative
import Data.Binary.Class
import Data.Binary.Get
import Data.Binary.Put
import Data.Bits
import Data.Word
#if !MIN_VERSION_base(4,11,0)
import Data.Monoid ((<>))
#endif
#ifdef HAS_DATA_KIND
import Data.Kind
#endif
import GHC.Generics
import Prelude -- Silence AMP warning.
-- Type without constructors
instance GBinaryPut V1 where
{-# INLINE gput #-}
gput _ = pure ()
instance GBinaryGet V1 where
gget = return undefined
-- Constructor without arguments
instance GBinaryPut U1 where
{-# INLINE gput #-}
gput U1 = pure ()
instance GBinaryGet U1 where
gget = return U1
-- Product: constructor with parameters
instance (GBinaryPut a, GBinaryPut b) => GBinaryPut (a :*: b) where
{-# INLINE gput #-}
gput (x :*: y) = gput x <> gput y
instance (GBinaryGet a, GBinaryGet b) => GBinaryGet (a :*: b) where
gget = (:*:) <$> gget <*> gget
-- Metadata (constructor name, etc)
instance GBinaryPut a => GBinaryPut (M1 i c a) where
{-# INLINE gput #-}
gput = gput . unM1
instance GBinaryGet a => GBinaryGet (M1 i c a) where
gget = M1 <$> gget
-- Constants, additional parameters, and rank-1 recursion
instance Binary a => GBinaryPut (K1 i a) where
{-# INLINE gput #-}
gput = put . unK1
instance Binary a => GBinaryGet (K1 i a) where
gget = K1 <$> get
-- Borrowed from the cereal package.
-- The following GBinary instance for sums has support for serializing
-- types with up to 2^64-1 constructors. It will use the minimal
-- number of bytes needed to encode the constructor. For example when
-- a type has 2^8 constructors or less it will use a single byte to
-- encode the constructor. If it has 2^16 constructors or less it will
-- use two bytes, and so on till 2^64-1.
#define GUARD(WORD) (size - 1) <= fromIntegral (maxBound :: WORD)
#define PUTSUM(WORD) GUARD(WORD) = putSum (0 :: WORD) (fromIntegral size)
#define GETSUM(WORD) GUARD(WORD) = (get :: Get WORD) >>= checkGetSum (fromIntegral size)
instance ( GSumPut a, GSumPut b
, SumSize a, SumSize b) => GBinaryPut (a :+: b) where
{-# INLINE[0] gput #-}
gput | PUTSUM(Word8) | PUTSUM(Word16) | PUTSUM(Word32) | PUTSUM(Word64)
| otherwise = sizeError "encode" size
where
size = unTagged (sumSize :: Tagged (a :+: b) Word64)
instance ( GSumGet a, GSumGet b
, SumSize a, SumSize b) => GBinaryGet (a :+: b) where
gget | GETSUM(Word8) | GETSUM(Word16) | GETSUM(Word32) | GETSUM(Word64)
| otherwise = sizeError "decode" size
where
size = unTagged (sumSize :: Tagged (a :+: b) Word64)
sizeError :: Show size => String -> size -> error
sizeError s size =
error $ "Can't " ++ s ++ " a type with " ++ show size ++ " constructors"
------------------------------------------------------------------------
checkGetSum :: (Ord word, Num word, Bits word, GSumGet f)
=> word -> word -> Get (f a)
checkGetSum size code | code < size = getSum code size
| otherwise = fail "Unknown encoding for constructor"
{-# INLINE checkGetSum #-}
class GSumGet f where
getSum :: (Ord word, Num word, Bits word) => word -> word -> Get (f a)
class GSumPut f where
putSum :: (Num w, Bits w, Binary w) => w -> w -> f a -> Put
instance (GSumGet a, GSumGet b) => GSumGet (a :+: b) where
getSum !code !size | code < sizeL = L1 <$> getSum code sizeL
| otherwise = R1 <$> getSum (code - sizeL) sizeR
where
sizeL = size `shiftR` 1
sizeR = size - sizeL
instance (GSumPut a, GSumPut b) => GSumPut (a :+: b) where
{-# INLINE putSum #-}
putSum !code !size s = case s of
L1 x -> putSum code sizeL x
R1 x -> putSum (code + sizeL) sizeR x
where
sizeL = size `shiftR` 1
sizeR = size - sizeL
instance GBinaryGet a => GSumGet (C1 c a) where
getSum _ _ = gget
instance GBinaryPut a => GSumPut (C1 c a) where
{-# INLINE putSum #-}
putSum !code _ x = put code <> gput x
------------------------------------------------------------------------
class SumSize f where
sumSize :: Tagged f Word64
#ifdef HAS_DATA_KIND
newtype Tagged (s :: Type -> Type) b = Tagged {unTagged :: b}
#else
newtype Tagged (s :: * -> *) b = Tagged {unTagged :: b}
#endif
instance (SumSize a, SumSize b) => SumSize (a :+: b) where
sumSize = Tagged $ unTagged (sumSize :: Tagged a Word64) +
unTagged (sumSize :: Tagged b Word64)
instance SumSize (C1 c a) where
sumSize = Tagged 1
from binary.
Encoding speed increase based on the variantions I tested:
binary-master: 1.0
binary INLINE: 1.2 (20% more throughput)
binary INLINE[0]-trick: 1.9 (+90% throughput).
This was collected using this cursed benchmark which while pretty unrealistic gives at least an idea about the potential impact of full specializaton.
You need to compile with -ffull-laziness for these results to make sense or you will just measure allocation performance.
main = do
let lists = replicate 50000 Var0
lbs = encode $ lists
lengths <- (sum <$>) $ forM [0..2000] $ \x -> do
return $! (BS.length $ BS.toStrict $ encode (lists)) + x
print lengths
from binary.
Related Issues (20)
- Cut release for GHC 8.6.1 HOT 5
- Binary Instances for ByteArray, Array, etc. HOT 1
- I can't just get the answer and builder out of Put in a single call. HOT 2
- The `Binary Double` instance does not roundtrip. HOT 2
- Decoder prematurely reports "not enough bytes" HOT 3
- Cut release for GHC 8.8.1 HOT 5
- MonadFix instance required
- Cut release for GHC 8.10.1 HOT 5
- How does one specify a fixed length string?
- binary doesn't install with ghc 8.10.3 and containers 0.6.4.1 HOT 2
- How to change default byteorder? HOT 4
- Add Solo instance
- Export failG
- Generically instance HOT 1
- How to test?
- Ineffective INLINE pragmas on `many` method for `Alternative` `Get` instance.
- Inconsistent INLINE pragmas between word and int for builders.
- Alternative instance for Get does not respect identity law in error situations
- Document safety expectations, or the lack thereof
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from binary.