
Ultrafast and lightweight in-process memory database written in F# that supports: transactions, secondary indexes, persistence, and data size larger than RAM.

License: Apache License 2.0

C# 16.70% F# 82.89% Dockerfile 0.41%

stereodb's Introduction


StereoDB

Ultrafast and lightweight in-process memory database written in F# that supports: transactions, secondary indexes, persistence, and data size larger than RAM. The primary use case for this database is building Stateful Services (API or ETL Worker) that keep all data in memory and can provide millions of RPS from a single node.

Supported features:

  • C# and F# API
  • Basic SQL support
  • Transactions (read-only, read-write)
  • Secondary Indexes
    • Value Index (hash-based index)
    • Range Scan Index
  • Data size larger than RAM
  • Data persistence
  • Distributed mode
    • Server and client discovery
    • Range-based sharding

Intro to Stateful Services


Benchmarks

Pure KV workload benchmark (in-process only, without persistence). In this benchmark, 3 million random reads and 100K random writes run concurrently and complete in 892 ms.

BenchmarkDotNet=v0.13.5, OS=Windows 11 (10.0.22621.2134/22H2/2022Update/SunValley2)
AMD Ryzen 7 5800H with Radeon Graphics, 1 CPU, 16 logical and 8 physical cores
.NET SDK=7.0.400
  [Host]     : .NET 7.0.10 (7.0.1023.36312), X64 RyuJIT AVX2
  DefaultJob : .NET 7.0.10 (7.0.1023.36312), X64 RyuJIT AVX2

|    Method | ReadThreadCount | WriteThreadCount | UsersCount | DbReadCount | DbWriteCount |     Mean |    Error |   StdDev | Allocated |
|---------- |---------------- |----------------- |----------- |------------ |------------- |---------:|---------:|---------:|----------:|
| ReadWrite |              30 |               30 |    4000000 |     3000000 |       100000 | 891.9 ms | 17.75 ms | 35.86 ms |  13.12 KB |
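
As a rough sanity check, the throughput implied by the table can be computed directly from the reported mean. The snippet below is only back-of-the-envelope arithmetic on the published figures, not part of the benchmark itself; the variable names are illustrative.

```csharp
using System;
using System.Globalization;

// Figures copied from the benchmark table above.
const int dbReadCount = 3_000_000;
const int dbWriteCount = 100_000;
const double meanMilliseconds = 891.9;

// Total operations divided by elapsed seconds gives operations per second.
double opsPerSecond = (dbReadCount + dbWriteCount) / (meanMilliseconds / 1000.0);

// Print the result in millions, using the invariant culture for a stable format.
Console.WriteLine(
    (opsPerSecond / 1_000_000).ToString("F2", CultureInfo.InvariantCulture) + " M ops/s");
```

This works out to roughly 3.5 million combined read and write operations per second on the benchmark machine.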

C# API

using System;
using System.Linq; // needed for Enumerable.Range and ToArray
using StereoDB;
using StereoDB.CSharp;

// defines a Book type that implements IEntity<TId>
public record Book : IEntity<int>
{
    public int Id { get; init; }
    public string Title { get; init; }
    public int Quantity { get; init; }
}

// defines an Order type that implements IEntity<TId>
public record Order : IEntity<Guid>
{
    public Guid Id { get; init; }
    public int BookId { get; init; }
    public int Quantity { get; init; }
}

public class BooksSchema
{
    public ITable<int, Book> Table { get; init; }
}

public class OrdersSchema
{
    public ITable<Guid, Order> Table { get; init; }
    public IValueIndex<int, Order> BookIdIndex { get; init; }
}

// defines a DB schema that includes Orders and Books tables
// and a secondary index: 'BookIdIndex' for the Orders table
public class Schema
{
    public BooksSchema Books { get; }
    public OrdersSchema Orders { get; }
    
    public Schema()
    {
        Books = new BooksSchema
        {
            Table = StereoDb.CreateTable<int, Book>()
        };

        var ordersTable = StereoDb.CreateTable<Guid, Order>();

        Orders = new OrdersSchema()
        {
            Table = ordersTable,
            BookIdIndex = ordersTable.AddValueIndex(order => order.BookId)
        };
    }
}

public static class Demo
{
    public static void Run()
    {
        var db = StereoDb.Create(new Schema());

        // 1) adds book
        // WriteTransaction: it's a read-write transaction: we can query and mutate data
        
        db.WriteTransaction(ctx =>
        {
            var books = ctx.UseTable(ctx.Schema.Books.Table);
        
            foreach (var id in Enumerable.Range(0, 10))
            {
                var book = new Book {Id = id, Title = $"book_{id}", Quantity = 1};
                books.Set(book);
            }
        });
               
        // 2) creates an order
        // WriteTransaction: it's a read-write transaction: we can query and mutate data
        
        db.WriteTransaction(ctx =>
        {
            var books = ctx.UseTable(ctx.Schema.Books.Table);
            var orders = ctx.UseTable(ctx.Schema.Orders.Table);
        
            foreach (var id in books.GetIds())
            {
                if (books.TryGet(id, out var book) && book.Quantity > 0)
                {
                    var order = new Order {Id = Guid.NewGuid(), BookId = id, Quantity = 1};
                    var updatedBook = book with { Quantity = book.Quantity - 1 };
                    
                    books.Set(updatedBook);
                    orders.Set(order);
                }
            }
        });
                
        // 3) query book and orders
        // ReadTransaction: it's a read-only transaction: we can query multiple tables at once
        
        var result = db.ReadTransaction(ctx =>
        {
            var books = ctx.UseTable(ctx.Schema.Books.Table);
            var bookIdIndex = ctx.Schema.Orders.BookIdIndex;
        
            if (books.TryGet(1, out var book))
            {
                var orders = bookIdIndex.Find(book.Id).ToArray();
                return (book, orders);
            }
            
            return (null, null);
        });    
    }    
}

F# API

The F# API has some benefits over the C# API, mainly in expressiveness and type safety:

  • Anonymous Records - They provide in-place schema definition, so you don't need to define extra types for the schema as you do in C#. They also help you model an efficient (zero-cost, since struct anonymous records are supported) and expressive return type.
  • ValueOption<'T> - The StereoDB API uses it to model emptiness in a type-safe manner. It's also a zero-cost abstraction since it's a struct.
  • Computation Expression - It lets you express multiple if/else checks for emptiness/null on ValueOption<'T> as a single voption { } expression. To use voption { }, FsToolkit.ErrorHandling should be installed. voption { } is also a zero-cost abstraction: the compiler generates optimized code without allocations.

open System
open FsToolkit.ErrorHandling
open StereoDB
open StereoDB.FSharp

// defines a Book type that implements IEntity<TId>
type Book = {
    Id: int
    Title: string
    Quantity: int    
}
with
    interface IEntity<int> with
        member this.Id = this.Id

// defines an Order type that implements IEntity<TId>
type Order = {
    Id: Guid
    BookId: int
    Quantity: int    
}
with
    interface IEntity<Guid> with
        member this.Id = this.Id

// defines a DB schema that includes Orders and Books tables
// and a secondary index: 'BookIdIndex' for the Orders table
type Schema() =
    let _books = {| Table = StereoDb.createTable<int, Book>() |}
    
    let _ordersTable = StereoDb.createTable<Guid, Order>()
    let _orders = {|
        Table = _ordersTable
        BookIdIndex = _ordersTable.AddValueIndex(fun order -> order.BookId)
    |}
    
    member this.Books = _books
    member this.Orders = _orders

let test () =
    let db = StereoDb.create(Schema())

    // 1) adds book
    // WriteTransaction: it's a read-write transaction: we can query and mutate data

    db.WriteTransaction(fun ctx ->
        let books = ctx.UseTable(ctx.Schema.Books.Table)

        let bookId = 1
        let book = { Id = bookId; Title = "book_1"; Quantity = 1 }
        books.Set book
    )

    // 2) creates an order
    // WriteTransaction: it's a read-write transaction: we can query and mutate data

    db.WriteTransaction(fun ctx ->
        let books = ctx.UseTable(ctx.Schema.Books.Table)
        let orders = ctx.UseTable(ctx.Schema.Orders.Table)        
        
        voption {
            let bookId = 1
            let! book = books.Get bookId

            if book.Quantity > 0 then
                let order = { Id = Guid.NewGuid(); BookId = bookId; Quantity = 1 }
                let updatedBook = { book with Quantity = book.Quantity - 1 }
                
                books.Set updatedBook
                orders.Set order                    
        }
        |> ignore                     
    )

    // 3) query book and orders
    // ReadTransaction: it's a read-only transaction: we can query multiple tables at once

    let result = db.ReadTransaction(fun ctx ->
        let books = ctx.UseTable(ctx.Schema.Books.Table)
        let bookIdIndex = ctx.Schema.Orders.BookIdIndex
        
        voption {
            let bookId = 1
            let! book = books.Get bookId
            let orders = book.Id |> bookIdIndex.Find |> Seq.toArray
            
            return struct {| Book = book; Orders = orders |}
        }
    )    

Transactions

StereoDB transactions allow the execution of a group of commands in a single step. StereoDB provides Read-Only and Read-Write transactions.

  • Read-Only transactions only allow you to read data. They can run concurrently on multiple threads.
  • Read-Write transactions allow you to read and write data. They run in a single-threaded fashion.

What to expect from transactions in StereoDB:

  • they are blazingly fast and cheap to execute.
  • they guarantee you atomic and consistent updates (you can update several tables including secondary indexes in one transaction and no other concurrent transaction will read your data partially; the transaction cannot be observed to be in progress by another database client).
  • they don't support rollback since supporting rollbacks would have a significant impact on the simplicity and performance of StereoDB.

In terms of ACID, StereoDB provides: TBD

How to work without rollbacks?

  • We suggest using only immutable data types to model your data. In C#/F# you can use records/structs to achieve this. Immutable data types allow you to ignore partial failures while updating any data record in memory.
  • Run all necessary validation before updating data in tables. Mutating your database should be the last step of a transaction, after all necessary validations have passed.

// it's an example of WriteTransaction
db.WriteTransaction(ctx =>
{
    var books = ctx.UseTable(ctx.Schema.Books.Table);    

    // read record
    var bookId = 42;
    if (books.TryGet(bookId, out var book) && book.Quantity > 0)
    {
        // update record (it's an immutable type, so the original book instance isn't mutated)
        var updatedBook = book with { Quantity = book.Quantity - 1 };

        // and only after this can you safely mutate your state in the database
        books.Set(updatedBook);
    }    
});

Secondary indexes

TBD

Best practices

TBD

stereodb's People

Contributors

antyadev, kant2002


stereodb's Issues

First time warning

Any ideas why I get these warnings on the first build?

FSC : warning FS1063: Unknown --test argument: 'GraphBasedChecking' [src\StereoDB\StereoDB.fsproj]
FSC : warning FS1063: Unknown --test argument: 'ParallelOptimization' [src\StereoDB\StereoDB.fsproj]
FSC : warning FS1063: Unknown --test argument: 'ParallelIlxGen' [src\StereoDB\StereoDB.fsproj]

Add support for BETWEEN

It would be great if somebody could add support for the BETWEEN operator in the WHERE clause.

To add parsing support, take a look at these places:

type SqlLogicalExpression =
    | BinaryLogicalOperator of SqlLogicalExpression * string * SqlLogicalExpression
    | BinaryComparisonOperator of SqlExpression * string * SqlExpression
    | UnaryLogicalOperator of string * SqlLogicalExpression
    | IsNull of SqlExpression
    | IsNotNull of SqlExpression

https://github.com/StereoDB/StereoDB/blob/4bce3c23b652d3dfd5c911e4512e386fa5c2b947/src/StereoDB/Infra/Sql/SqlParser.fs#L118C12-L122

Then emit the proper code approximately here:

member this.buildLogicExpression row expression : Expression =
    match expression with
    | BinaryLogicalOperator (left, op, right) -> failwith "Not implemented"
    | BinaryComparisonOperator (left, op, right) ->
        let leftExpression = this.buildExpression row left
        let rightExpression = this.buildExpression row right
        match op with
        | "<=" -> Expression.LessThanOrEqual(leftExpression, rightExpression)
        | "<" -> Expression.LessThan(leftExpression, rightExpression)
        | ">=" -> Expression.GreaterThanOrEqual(leftExpression, rightExpression)
        | ">" -> Expression.GreaterThan(leftExpression, rightExpression)
        | "<>" -> Expression.NotEqual(leftExpression, rightExpression)
        | "=" -> Expression.Equal(leftExpression, rightExpression)
        | _ -> failwith $"Operator {op} is not implemented"
    | UnaryLogicalOperator (op, expr) -> failwith "Not implemented"
    | IsNull (expr) -> failwith "Not implemented"
    | IsNotNull (expr) -> failwith "Not implemented"

Don't forget to add tests

let ``Select filtered rows`` () =
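
One possible way to emit code for BETWEEN is to lower it to the two comparisons the builder above already handles, since `x BETWEEN low AND high` is equivalent to `x >= low AND x <= high` (both bounds inclusive). The sketch below shows that lowering with plain `System.Linq.Expressions` in C#. `BuildBetween` is a hypothetical helper for illustration and is not part of StereoDB; how the parser would represent BETWEEN in `SqlLogicalExpression` is left open.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper (not StereoDB API): lowers "value BETWEEN low AND high"
// into "value >= low AND value <= high" as a LINQ expression tree.
static Expression BuildBetween(Expression value, Expression low, Expression high) =>
    Expression.AndAlso(
        Expression.GreaterThanOrEqual(value, low),
        Expression.LessThanOrEqual(value, high));

// Build and compile the predicate: quantity => quantity BETWEEN 1 AND 10
var quantity = Expression.Parameter(typeof(int), "quantity");
var body = BuildBetween(quantity, Expression.Constant(1), Expression.Constant(10));
var predicate = Expression.Lambda<Func<int, bool>>(body, quantity).Compile();

Console.WriteLine(predicate(5));   // True
Console.WriteLine(predicate(10));  // True (upper bound is inclusive)
Console.WriteLine(predicate(11));  // False
```

Reusing the existing comparison nodes keeps the change small: only the parser and one new match case would need to know about BETWEEN.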

Simplified StereoDB builder API

AC:

  • redesign the StereoDB API to create the db as a black box without forcing the user to implement IStereoDB
  • merge the CSharp and FSharp APIs into one project
  • use a shared implementation of StereoDB Table and StereoDB Engine

var db = StereoDb.Create(new Schema());

Benchmark results

I slightly modified the benchmark to get at least some understanding of how things can change, and it gives these results.

|    Method | ReadThreadCount | WriteThreadCount | UsersCount | DbReadCount | DbWriteCount |       Mean |    Error |    StdDev | Allocated |
|---------- |---------------- |----------------- |----------- |------------ |------------- |-----------:|---------:|----------:|----------:|
| ReadWrite |              30 |                2 |    4000000 |     3000000 |       100000 |   981.0 ms | 19.44 ms |  30.83 ms |   3.34 KB |
| ReadWrite |              30 |                6 |    4000000 |     3000000 |       100000 | 1,133.1 ms | 44.52 ms | 131.26 ms |   3.46 KB |
| ReadWrite |              30 |               30 |    4000000 |     3000000 |       100000 | 1,014.8 ms | 20.35 ms |  56.39 ms |   5.03 KB |

I can't explain why WriteThreadCount=6 shows a bump in processing time.

Implicit data-type conversion

We need to provide support for implicit type conversion; otherwise I can guarantee that a lot of users will be annoyed. Example:

CREATE TABLE Data(
   Id int32 not null,
   Code int64 not null
) 

Let's say we write the following query:

SELECT * from Data WHERE Id = Code

This may blow up at runtime, since we will generate a filter approximately like this:

let filter row =
    let a = row.Id
    let b = row.Code
    let result = Data.operator == (Int32, Int32)
    result

In this case the runtime may properly perform the conversion for us, but what about other data types? For example, comparing string and int, int and float, or float and datetime.

Currently the SQL compiler builds the expression type based on the type of the column and uses a fixed type for constants. If we write rules for implicit and explicit data type conversions, we should transform the original expression and insert casts into the SqlExpression where appropriate.
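
To make the idea of inserting casts concrete, here is a minimal sketch using `System.Linq.Expressions` in C#. `Expression.Equal` throws when its operands have different types (for example `int` and `long`), so the sketch widens the narrower side with `Expression.Convert` first. The `Unify` helper, the `Data` record, and the single int32-to-int64 rule are assumptions made for illustration; they are not StereoDB code, and a real rule table would cover many more type pairs.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper: applies one implicit-conversion rule (int32 -> int64)
// so that both sides of a comparison end up with the same type.
static (Expression Left, Expression Right) Unify(Expression left, Expression right)
{
    if (left.Type == right.Type) return (left, right);
    if (left.Type == typeof(int) && right.Type == typeof(long))
        return (Expression.Convert(left, typeof(long)), right);
    if (left.Type == typeof(long) && right.Type == typeof(int))
        return (left, Expression.Convert(right, typeof(long)));
    throw new NotSupportedException(
        $"No implicit conversion between {left.Type} and {right.Type}");
}

// Build the filter for: SELECT * from Data WHERE Id = Code
var row = Expression.Parameter(typeof(Data), "row");
var (left, right) = Unify(
    Expression.Property(row, nameof(Data.Id)),     // int32 column
    Expression.Property(row, nameof(Data.Code)));  // int64 column
var filter = Expression
    .Lambda<Func<Data, bool>>(Expression.Equal(left, right), row)
    .Compile();

Console.WriteLine(filter(new Data(7, 7L)));  // True
Console.WriteLine(filter(new Data(7, 8L)));  // False

// Illustrative row type mirroring the Data table above.
record Data(int Id, long Code);
```

Unsupported pairs fail while the filter is being built rather than when it runs, which is one way to surface conversion errors early.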

SQL Support

This is an umbrella work item for all future work on SQL support.

I envision the following operations for the alpha iteration:

  • SELECT 5
    • Numbers support
    • Strings support
    • #20
  • FROM Table
    • Table scan
    • Clustered Index seek
    • Non-clustered Index seek
    • Index scan
  • JOIN tables
    • INNER
    • LEFT/RIGHT/FULL
  • WHERE
    • basic comparison (<>, =)
    • IS NULL, IS NOT NULL
    • LIKE
    • BETWEEN
  • UPDATE support on single table
    • without WHERE
    • with WHERE
  • DELETE support on single table
    • without WHERE
    • with WHERE
  • ORDER BY
    • expressions
    • ORDER BY 1,2

Not part of alpha iteration

  • Multiple resultset
  • GROUP BY
  • Aggregated functions
  • Window functions
  • Tabular functions
  • Stored procedures
