ESLab as Event Sourcing Lab

ESLab in a nutshell

ESLab is a really simple event store relying on two main things:

  1. Avro, a very efficient serialization tool with a killer feature for event sourcing: Schema Resolution
  2. A Java B-Tree on the filesystem to actually persist events.

B-Tree implementations are pluggable and you can choose between:

  • Java BerkeleyDB aka Sleepycat
  • LevelDB, either the Java version or the original one wrapped with JNI
  • MapDB, an interesting open source project that leverages memory-mapped files.

The purpose of ESLab is to demonstrate how easy it is to build an event store. You can also use this project to compare these different B-Tree implementations. Feel free to use it in your projects and to contribute to the code.

ESLab quickstart

Work in progress

It is quite straightforward to persist domain events with ESLab:

  1. Make your event classes implement the ESLab Event interface
  2. For each Event, implement a Serializer
  3. Choose an event store implementation based either on BerkeleyDB, LevelDB or MapDB

Take a look at the SimpleEvent and SimpleSerializer classes in the test source directory for a serialization example. Below is the code fragment that bootstraps an event store based on BerkeleyDB:

// The path to the folder where the data will be written
String tmpDir = System.getProperty("java.io.tmpdir");

// The event serializers that the store will use to serialize and deserialize events
Collection<Serializer> serializers = Lists.newArrayList();

// You need one serializer per event type
serializers.add(new SimpleSerializer());

// The actual store instantiation. Here it is a BerkeleyDB store;
// change the class name for LevelDB or MapDB.
BdbStore bdbStore = new BdbStore(tmpDir, serializers);
Why implement an event store using a B-Tree

An event represents a fact that has happened. In the DDD world, an event is related to a particular aggregate. An event store is used to:

  • persist events
  • retrieve an 'ordered' event stream related to a given aggregate.

What comes to mind to implement these two features is a key/value store. A B-Tree provides a key/value model and usually supports range queries. That is just what you need to persist and retrieve events, without the burden of an RDBMS!
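
To make the range query idea concrete, here is a minimal sketch in which a java.util.TreeMap stands in for the on-disk B-Tree. The class name OrderedStoreSketch, the "aggregateId:sequence" key layout and the sample ids are assumptions made for illustration; they are not ESLab's actual API or key format. The sketch also assumes aggregate ids contain no ':' and sequence numbers are non-negative.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: a TreeMap plays the role of the sorted key/value store.
final class OrderedStoreSketch {

    // Keys look like "<aggregateId>:<zero-padded sequence>" (hypothetical layout).
    private final TreeMap<String, byte[]> tree = new TreeMap<>();

    // Zero-padding makes the lexicographic key order match the numeric sequence order.
    static String key(String aggregateId, long sequence) {
        return String.format(Locale.ROOT, "%s:%019d", aggregateId, sequence);
    }

    // Persist one serialized event under its ordered key.
    void append(String aggregateId, long sequence, byte[] payload) {
        tree.put(key(aggregateId, sequence), payload);
    }

    // Range query: all entries whose key starts with "<aggregateId>:", in key order.
    List<byte[]> load(String aggregateId) {
        // ';' is the character right after ':', so it works as an exclusive upper bound.
        SortedMap<String, byte[]> range = tree.subMap(aggregateId + ":", aggregateId + ";");
        return new ArrayList<>(range.values());
    }

    public static void main(String[] args) {
        OrderedStoreSketch store = new OrderedStoreSketch();
        store.append("order-1", 2, "shipped".getBytes(StandardCharsets.UTF_8));
        store.append("order-1", 1, "created".getBytes(StandardCharsets.UTF_8));
        store.append("order-10", 1, "created".getBytes(StandardCharsets.UTF_8));
        for (byte[] payload : store.load("order-1")) {
            System.out.println(new String(payload, StandardCharsets.UTF_8));
        }
    }
}
```

Events can be appended in any order; the sorted keys are what give back an ordered stream for a single aggregate with one range scan.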

Event sourcing and Avro schema resolution

An event store needs a serialization mechanism to serialize events. Once an event has been serialized, the event store should still be able to deserialize it even if the related Java classes have evolved. That is why event sourcing needs a flexible serializer, something really different from the standard Java serialization mechanism. We could use JSON or XML with something like XStream, but such text-based serializations are verbose, hence expensive in disk space and not very efficient.

Apache Avro is a serialization tool that is part of the Hadoop ecosystem and is now a top-level Apache project. Avro's performance is comparable to that of products such as Google Protocol Buffers, but unlike Protocol Buffers, Avro has been built with flexibility and dynamic languages in mind. If you do not know Avro, you should definitely check out Avro's Getting Started page. Unlike other tools, Avro does not require code generation. Like other tools, the structure of serialized data is specified by a schema, but unlike the others, you can use one schema to serialize an object and another one to deserialize the same object. This is a killer feature for event sourcing since your event classes might evolve: fields might be added, removed or renamed without any big impact on your event store. For detailed information on this wonderful feature, check out Avro's Schema Resolution documentation.
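
As an illustration of schema resolution, consider the two schemas below for a hypothetical AccountOpened event (the record and its fields are invented for this example and do not come from ESLab). The first one is the schema an old event was written with:

```json
{
  "type": "record",
  "name": "AccountOpened",
  "fields": [
    {"name": "accountId", "type": "string"},
    {"name": "owner", "type": "string"},
    {"name": "legacyCode", "type": "int"}
  ]
}
```

The second one matches the current version of the class and is used to read that same event:

```json
{
  "type": "record",
  "name": "AccountOpened",
  "fields": [
    {"name": "accountId", "type": "string"},
    {"name": "ownerName", "type": "string", "aliases": ["owner"]},
    {"name": "currency", "type": "string", "default": "EUR"}
  ]
}
```

When resolving the two, the removed field legacyCode is simply ignored, and the added field currency takes its default value because the written data does not contain it. The rename from owner to ownerName relies on field aliases, which the Avro specification lets implementations honor, so it is worth checking that the Avro version in use supports them.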

How ESLab leverages Avro

Within ESLab, Avro is used to serialize events. Since Avro needs a schema to deserialize anything, we need to persist schemas as well. Within a stream of events, each event might need a different schema: there may be several event classes, and each class may evolve over time. It would be very inefficient to store a full schema along with every event. Fortunately, Avro provides a fingerprint mechanism: it is very easy to generate a 64-bit fingerprint that identifies a schema, and the overhead of storing this fingerprint with an event is quite acceptable.

Events are stored in one B-Tree and schemas in another. In the schema B-Tree, fingerprints are used as keys and the JSON representations of the schemas as values. In the event B-Tree, each key contains an aggregate id, a sequence number and a schema fingerprint. Below are the steps of the algorithm used to load an event stream:

  1. Perform a range query on the event B-Tree and get the serialized data for each event of the stream.
  2. For each event, retrieve the schema fingerprint from the key.
  3. Retrieve the schema from the schema B-Tree using the fingerprint.
  4. Read the record name (similar to a Java class name) from the schema found.
  5. Use the record name to select the latest schema for this record name. When an event class has evolved, there can be several schemas for the same record name in the schema B-Tree.
  6. Decode the raw serialized data of the event with Avro, using the two schemas found (the one used to serialize the event and the one corresponding to the current Java code).
  7. That's all folks. Some caching and optimizations have been omitted, but you get the point!
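
The steps above can be sketched as follows. This is not ESLab's code: the class LoadStreamSketch and everything inside it are hypothetical, TreeMaps stand in for the two B-Trees, and "latest" is assumed to mean the most recently registered schema for a record name. The fingerprint follows the CRC-64-AVRO algorithm from the Avro specification, but it is applied here to the raw schema text, whereas Avro applies it to the schema's Parsing Canonical Form (SchemaNormalization.parsingFingerprint64 in the Java library). Step 6 is left out because it needs the Avro library; there, the two schemas would typically be handed to a GenericDatumReader built from the writer and reader schemas.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: TreeMaps stand in for the two on-disk B-Trees.
final class LoadStreamSketch {

    // CRC-64-AVRO constants and table, as described in the Avro specification.
    static final long EMPTY = 0xc15d213aa4d7a795L;
    private static final long[] TABLE = new long[256];
    static {
        for (int i = 0; i < 256; i++) {
            long fp = i;
            for (int j = 0; j < 8; j++) fp = (fp >>> 1) ^ (EMPTY & -(fp & 1L));
            TABLE[i] = fp;
        }
    }

    static long fingerprint64(byte[] data) {
        long fp = EMPTY;
        for (byte b : data) fp = (fp >>> 8) ^ TABLE[(int) (fp ^ b) & 0xff];
        return fp;
    }

    // Event tree key: aggregate id, then sequence number, then schema fingerprint.
    static final class EventKey implements Comparable<EventKey> {
        final String aggregateId;
        final long sequence;
        final long fingerprint;

        EventKey(String aggregateId, long sequence, long fingerprint) {
            this.aggregateId = aggregateId;
            this.sequence = sequence;
            this.fingerprint = fingerprint;
        }

        @Override
        public int compareTo(EventKey o) {
            int c = aggregateId.compareTo(o.aggregateId);
            if (c == 0) c = Long.compare(sequence, o.sequence);
            if (c == 0) c = Long.compare(fingerprint, o.fingerprint);
            return c;
        }
    }

    // Hypothetical schema entry: the record name is kept next to the JSON to avoid parsing it.
    static final class StoredSchema {
        final String recordName;
        final String json;
        StoredSchema(String recordName, String json) { this.recordName = recordName; this.json = json; }
    }

    // Everything step 6 would need: raw bytes plus the writer and reader schemas.
    static final class DecodeInput {
        final byte[] payload;
        final StoredSchema writer;
        final StoredSchema reader;
        DecodeInput(byte[] payload, StoredSchema writer, StoredSchema reader) {
            this.payload = payload; this.writer = writer; this.reader = reader;
        }
    }

    final TreeMap<EventKey, byte[]> eventTree = new TreeMap<>();
    final TreeMap<Long, StoredSchema> schemaTree = new TreeMap<>();
    // Assumption: the most recently registered schema of a record is the "latest" one.
    final Map<String, Long> latestByRecordName = new HashMap<>();

    long registerSchema(String recordName, String json) {
        long fp = fingerprint64(json.getBytes(StandardCharsets.UTF_8));
        schemaTree.put(fp, new StoredSchema(recordName, json));
        latestByRecordName.put(recordName, fp);
        return fp;
    }

    void append(String aggregateId, long sequence, long fingerprint, byte[] payload) {
        eventTree.put(new EventKey(aggregateId, sequence, fingerprint), payload);
    }

    List<DecodeInput> load(String aggregateId) {
        EventKey from = new EventKey(aggregateId, Long.MIN_VALUE, Long.MIN_VALUE);
        EventKey to = new EventKey(aggregateId, Long.MAX_VALUE, Long.MAX_VALUE);
        List<DecodeInput> result = new ArrayList<>();
        // Step 1: range query over every key of this aggregate, in sequence order.
        for (Map.Entry<EventKey, byte[]> e : eventTree.subMap(from, true, to, true).entrySet()) {
            long fp = e.getKey().fingerprint;                  // step 2: fingerprint from the key
            StoredSchema writer = schemaTree.get(fp);          // step 3: schema the event was written with
            String recordName = writer.recordName;             // step 4: record name of that schema
            StoredSchema reader = schemaTree.get(latestByRecordName.get(recordName)); // step 5
            result.add(new DecodeInput(e.getValue(), writer, reader)); // input for step 6
        }
        return result;
    }

    public static void main(String[] args) {
        LoadStreamSketch store = new LoadStreamSketch();
        long v1 = store.registerSchema("AccountOpened", "{\"version\":1}");
        store.registerSchema("AccountOpened", "{\"version\":2}");
        store.append("acc-1", 1, v1, new byte[] {1});
        for (DecodeInput in : store.load("acc-1")) {
            System.out.println("writer=" + in.writer.json + " reader=" + in.reader.json);
        }
    }
}
```

Keeping the fingerprint inside the event key means the writer schema is known before the payload is touched, and each schema is stored only once no matter how many events reference it.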

Bench

TBD

Contributors

alexvictoor
