Created by Stephen McDonald
CurioDB is a distributed and persistent Redis clone, built with Scala and Akka. Please note that this is a toy project, hence the name "Curio", and any suitability as a drop-in replacement for Redis is purely incidental. :-)
I've been using SBT to build the project, which you can install on OS X, Linux or Windows. With that done, you just need to clone this repository and run it:
$ git clone git://github.com/stephenmcd/curiodb.git
$ cd curiodb
$ sbt run "--config=path/to/config.file"
You can also build a binary (executable JAR file):
$ sbt assembly
$ ./target/scala-2.11/curiodb-0.0.1 --config=path/to/config.file
Note that the --config=path/to/config.file
argument is optional, see
the Configuration section below for more detail.
Why build a Redis clone? Well, I'd been learning Scala and Akka and wanted a nice project I could take them further with. I've used Redis heavily in the past, and Akka gave me some really cool ideas for implementing a clone, based on each key/value pair (or KV pair) in the system being implemented as an actor:
Since each KV pair in the system is an actor, CurioDB will happily use all your CPU cores, so you can run 1 server using 32 cores instead of 32 servers each using 1 core (or use all 1,024 cores of your 32 server cluster, why not). Each actor operates on its own value atomically, so the atomic nature of Redis commands is still present, it just occurs at the individual KV level instead of in the context of an entire running instance of Redis.
Since each KV pair in the system is an actor, the interaction between
multiple KV pairs works the same way when they're located across the
network as it does when they're located on different processes on a
single machine. This negates the need for features of Redis like "hash
tagging", and allows commands that deal with multiple keys (SUNION
,
SINTER
, MGET
, MSET
, etc) to operate seamlessly across a cluster.
Since each KV pair in the system is an actor, the unit of disk storage is the individual KV pair, not a single instance's entire data set. This makes Redis' abandoned virtual memory feature a lot more feasible. With CurioDB, an actor can simply persist its value to disk after some criteria occurs, and shut itself down until requested again.
Scala is concise, you get a lot done with very little code, but that's just the start - CurioDB leverages Akka very heavily, taking care of clustering, concurrency, persistence, and a whole lot more. This means the bulk of CurioDB's code mostly deals with implementing all of the Redis commands, so far weighing in at only a paltry 1,000 lines of Scala! Currently, the majority of commands have been fully implemented, as well as the Redis wire protocol itself, so existing client libraries can be used. Some commands have been purposely omitted where they don't make sense, such as cluster management, and things specific to Redis' storage format.
Since Akka Persistence is used for storage, many strange scenarios become available. Want to use PostgreSQL or Cassandra for storage, with CurioDB as the front-end interface for Redis commands? This should be possible! By default, CurioDB uses Akka's built-in LevelDB storage.
Here's a bad diagram representing one server in the cluster, and the flow of a client sending a command:
- An outside client sends a command to the server actor (Server.scala). There's at most one per cluster node (which could be used to support load balancing), and at least one per cluster (not all nodes need to listen for outside clients).
- Upon receiving a new outside client connection, the server actor will create a Client Node actor (System.scala), it's responsible for the life-cycle of a single client connection, as well as parsing the incoming and writing the outgoing protocol, such as the Redis protocol for TCP clients, or JSON for HTTP clients (Server.scala).
- Key Node actors (System.scala) manage the key space for the entire system, which are distributed across the entire cluster using consistent hashing. A Client Node will forward the command to the matching Key Node for its key.
- A Key Node is then responsible for creating, removing, and communicating with each KV actor, which are the actual Nodes for each key and value, such as a string, hash, sorted set, etc (Data.scala).
- The KV Node then sends a response back to the originating Client Node, which returns it to the outside client.
Not diagrammed, but in addition to the above:
- Some commands require coordination with multiple KV Nodes, in which case a temporary Aggregate actor (Aggregation.scala) is created by the Client Node, which coordinates the results for multiple commands via Key Nodes and KV Nodes in the same way a Client Node does.
- PubSub is implemented by adding behavior to Key Nodes and Client Nodes, which act as PubSub servers and clients respectively (PubSub.scala).
- Lua scripting is fully supported
(Scripting.scala) thanks to LuaJ, and is
implemented similarly to PubSub, where behavior is added to Key Nodes
which store and run compiled Lua scripts (via
EVALSHA
), and Client Nodes which can run uncompiled scripts directly (viaEVAL
).
Here are the few configuration settings and their default values that CurioDB implements, along with the large range of settings provided by Akka itself, which both use typesafe-config - consult those projects for more info.
curiodb {
// Addresses listening for clients.
listen = [
"tcp://127.0.0.1:6379" // TCP server using Redis protocol.
"http://127.0.0.1:2600" // HTTP server using JSON.
"ws://127.0.0.1:6200" // WebSocket server, also using JSON.
]
// Duration settings (either time value, or "off").
persist-after = 1 second // Like "save" in Redis.
sleep-after = 10 seconds // Virtual memory threshold.
expire-after = off // Automatic key expiry.
// Cluster nodes.
nodes = {
node1: "tcp://127.0.0.1:9001"
// node2: "tcp://127.0.0.1:9002"
// node3: "tcp://127.0.0.1:9003"
}
// Current cluster node (from the "nodes" keys above).
node = node1
// List of disabled commands.
commands.disabled = [SHUTDOWN]
}
You can also optionally provide your own configuration file, using the
--config=path/to/config.file
command-line argument. Your
configuration file need only define the values you wish to override.
For example, suppose you wanted to only listen over TCP, and disable
extra commands:
curiodb.listen = ["tcp://127.0.0.1:3333"]
curiodb.commands.disabled = [SHUTDOWN, DEL, FLUSHDB, FLUSHALL]
The sleep-after
and expire-after
settings are worth some
explanation. Each of these configure a time duration that starts each
time a command is run against a key, and elapses if no further commands
run against that key within the duration. Once the duration elapses,
an action is performed. After the sleep-after
duration elapses for a
key, it will persist its value to disk, and shut down, essentially
going to sleep - this is how virtual memory is implemented.
expire-after
is similar, but after its duration elapses, the key is
deleted entirely, just as if the EXPIRE
command was used.
As alluded to in the configuration example above, CurioDB also supports
a HTTP/WebSocket JSON API, as well as the same wire protocol that Redis
implements over TCP. Commands are issued with HTTP POST requests, or
WebSocket messages, containing a JSON Object with a single args
key,
containing an Array of arguments. Responses are returned as a JSON
Object with a single result
key.
HTTP:
$ curl -X POST -d '{"args": ["SET", "foo", "bar"]}' http://127.0.0.1:2600
{"result":"OK"}
$ curl -X POST -d '{"args": ["MGET", "foo", "baz"]}' http://127.0.0.1:2600
{"result":["bar",null]}
WebSocket:
var socket = new WebSocket('ws://127.0.0.1:6200');
socket.onmessage = function(response) {
console.log(JSON.parse(response.data));
};
socket.send(JSON.stringify({args: ["DEL", "foo"]}));
SUBSCRIBE
and PSUBSCRIBE
commands work as expected over WebSockets,
and are also fully supported by the HTTP API, by using chunked transfer
encoding to allow a single HTTP connection to receive a stream of
published messages over an extended period of time.
In the case of errors such as invalid arguments to a command, WebSocket
connections will transmit a JSON Object with a single error
key
containing the error message, while HTTP requests will return a
response with a 400 status, contaning the error message in the response
body.
- I haven't measured it, but it's safe to say memory consumption is much poorer due to the JVM. Somewhat alleviated by the virtual memory feature.
- It's slower, but not by as much as you'd expect. Without any optimization, it's roughly about half the speed of Redis. See the performance section below.
- PubSub pattern matching may perform poorly. PubSub channels are
distributed throughout the cluster using consistent hashing, which
makes pattern matching impossible. To work around this, patterns get
stored on every node in the cluster, and the
PSUBSCRIBE
andPUNSUBSCRIBE
commands get broadcast to all of them. This needs rethinking! - No transaction support.
- Lua scripts are not atomic.
Mainly though, Redis is an extremely mature and battle-tested project that's been developed by many over the years, while CurioDB is a one-man hack project worked on over a few months. As much as this document attempts to compare them, they're really not comparable in that light. That said, it's been tons of fun building it, and it has some cool ideas thanks to Akka. I hope others can get something out of it too.
These are the results of redis-benchmark -q
on an early 2014
MacBook Air running OS X 10.9 (the numbers are requests per second):
Benchmark | Redis | CurioDB | % |
---|---|---|---|
PING_INLINE |
57870.37 | 46296.29 | 79% |
PING_BULK |
55432.37 | 44326.24 | 79% |
SET |
50916.50 | 33233.63 | 65% |
GET |
53078.56 | 38580.25 | 72% |
INCR |
57405.28 | 33875.34 | 59% |
LPUSH |
45977.01 | 28082.00 | 61% |
LPOP |
56369.79 | 23894.86 | 42% |
SADD |
59101.65 | 25733.40 | 43% |
SPOP |
50403.23 | 33886.82 | 67% |
LRANGE_100 |
22246.94 | 11228.38 | 50% |
LRANGE_300 |
9984.03 | 6144.77 | 61% |
LRANGE_500 |
6473.33 | 4442.67 | 68% |
LRANGE_600 |
5323.40 | 3511.11 | 65% |
MSET |
34554.25 | 15547.26 | 44% |
Generated with the bundled benchmark.py script.
BSD.