sherlockbeard / delta-kernel-rs

This project forked from delta-incubator/delta-kernel-rs

A native Delta implementation for integration with any query engine

License: Apache License 2.0

C 5.76% Rust 93.71% Makefile 0.15% Handlebars 0.07% CMake 0.23% Just 0.08%

delta-kernel-rs

Delta-kernel-rs is an experimental Delta implementation focused on interoperability with a wide range of query engines. It currently only supports reads.

The Delta Kernel project is a Rust and C library for building Delta connectors that can read (and soon, write) Delta tables without needing to understand the Delta protocol details. This is the Rust/C equivalent of Java Delta Kernel.

Crates

Delta-kernel-rs is split into a few different crates:

  • kernel: The actual core kernel crate
  • acceptance: Acceptance tests that validate correctness via the Delta Acceptance Tests
  • derive-macros: A crate for our derive-macros to live in
  • ffi: Functionality that enables delta-kernel-rs to be used from C or C++. See the ffi directory for more information.

Building

By default we build only the kernel and acceptance crates, which will also build derive-macros as a dependency.

To get started, install Rust via rustup, clone the repository, and then run:

cargo test

This builds the kernel, runs all unit tests, fetches the Delta Acceptance Tests data, and runs the acceptance tests against it.

As it is a library, in general you will want to depend on delta-kernel-rs by adding it as a dependency to your Cargo.toml. For example:

[dependencies]
delta_kernel = "0.1"

Versions and API Stability

We intend to follow Semantic Versioning. However, in the 0.x line, the APIs are still unstable. We therefore may break APIs within minor releases (that is, 0.1 -> 0.2), but we will not break APIs in patch releases (0.1.0 -> 0.1.1).
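In practice, a bare requirement such as "0.1" in Cargo.toml is treated by Cargo as a caret requirement, which for 0.x versions means ">=0.1.0, <0.2.0". So patch releases are picked up automatically, while a minor release that may break APIs is not. The sketch below only illustrates that rule; `satisfies_zero_minor` is a hypothetical helper written for this example (not part of Cargo or delta-kernel-rs), and it ignores pre-release versions.

```rust
/// Returns true if `candidate` (major, minor, patch) satisfies a bare
/// "0.MINOR" requirement, which Cargo reads as >=0.MINOR.0, <0.(MINOR+1).0.
/// Hypothetical helper for illustration only; pre-release tags are ignored.
fn satisfies_zero_minor(req_minor: u64, candidate: (u64, u64, u64)) -> bool {
    let (major, minor, _patch) = candidate;
    major == 0 && minor == req_minor
}

fn main() {
    // delta_kernel = "0.1" accepts any 0.1.x patch release ...
    println!("0.1.0 ok: {}", satisfies_zero_minor(1, (0, 1, 0)));
    println!("0.1.1 ok: {}", satisfies_zero_minor(1, (0, 1, 1)));
    // ... but not 0.2.0, which is allowed to break APIs.
    println!("0.2.0 ok: {}", satisfies_zero_minor(1, (0, 2, 0)));
}
```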

Documentation

Examples

There are some example programs showing how delta-kernel-rs can be used to interact with Delta tables. They live in the kernel/examples directory.

Development

delta-kernel-rs is still under heavy development but follows conventions adopted by most Rust projects.

Concepts

There are a few key concepts that will help in understanding kernel:

  1. The Engine trait encapsulates all the functionality an engine or connector needs to provide to the Delta Kernel in order to read the Delta table.
  2. The DefaultEngine is our default implementation of the above trait. It lives in engine/default, and provides a reference implementation for all Engine functionality. DefaultEngine uses arrow as its in-memory data format.
  3. A Scan is the entrypoint for reading data from a table.
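To make the division of labour concrete, here is a toy model of the three concepts. It is not the delta_kernel API: `ToyEngine`, `InMemoryEngine`, `ToyScan`, and `sample_engine` are all hypothetical names invented for this sketch, and plain `Vec<i64>` rows stand in for arrow data. It shows only the shape of the relationship: the scan knows which files make up the table, and the engine does the actual reading.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the Engine trait: the functionality the
/// connector provides (here, just reading the rows of one file).
trait ToyEngine {
    fn read_file(&self, path: &str) -> Option<Vec<i64>>;
}

/// Hypothetical stand-in for a default engine, backed by an in-memory map.
struct InMemoryEngine {
    files: HashMap<String, Vec<i64>>,
}

impl ToyEngine for InMemoryEngine {
    fn read_file(&self, path: &str) -> Option<Vec<i64>> {
        self.files.get(path).cloned()
    }
}

/// Hypothetical scan: the entry point for reading. It lists the files
/// that make up the table and delegates all reading to the engine.
struct ToyScan {
    files: Vec<String>,
}

impl ToyScan {
    /// Reads every listed file through the engine; files the engine
    /// cannot find are skipped in this toy version.
    fn execute(&self, engine: &dyn ToyEngine) -> Vec<i64> {
        self.files
            .iter()
            .filter_map(|path| engine.read_file(path))
            .flatten()
            .collect()
    }
}

/// Builds an engine holding two small "files" for the example.
fn sample_engine() -> InMemoryEngine {
    let mut files = HashMap::new();
    files.insert("part-0".to_string(), vec![1, 2]);
    files.insert("part-1".to_string(), vec![3]);
    InMemoryEngine { files }
}

fn main() {
    let engine = sample_engine();
    let scan = ToyScan {
        files: vec!["part-0".to_string(), "part-1".to_string()],
    };
    println!("{:?}", scan.execute(&engine));
}
```

Because the scan only sees the trait, a different engine (one with another in-memory format or I/O stack) could be swapped in without changing the scan logic.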

Design Principles

Some design principles which should be considered:

  • async should live only in the Engine implementation. The core kernel does not use async at all. We do not wish to impose the need for an entire async runtime on an engine or connector. The DefaultEngine does use async quite heavily. However, it does not depend on a particular runtime, and implementations could provide an "executor" based on tokio, smol, async-std, or whatever might be needed. Currently only a tokio-based executor is provided.
  • Minimal Table API. The kernel intentionally exposes the concept of immutable versions of tables through the snapshot API. This encourages users to think about the Delta table state more accurately.
  • Prefer builder style APIs over object oriented ones.
  • A "simple" set of default-features that provides the basic functionality with the fewest dependencies possible. More complex optimizations or APIs go behind feature flags.
  • API conventions to make it clear which operations involve I/O, e.g. fetch or retrieve type verbiage in method signatures.
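The first and last principles can be sketched together: a synchronous trait whose method name signals I/O, implemented by a type that uses async internally and drives it through a pluggable executor. Everything named here (`TaskExecutor`, `ParkingExecutor`, `FileReader`, `AsyncBackedReader`, `fetch_len`) is hypothetical and written only for this example; it is not the kernel's or the DefaultEngine's actual code. The executor uses only the standard library, where a real one might wrap tokio or smol.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the blocked thread when the future can make progress.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical executor abstraction: the only place that knows how to
/// drive a future to completion.
trait TaskExecutor {
    fn block_on<T>(&self, fut: impl Future<Output = T>) -> T;
}

/// A tiny std-only executor, standing in for one built on a real runtime.
struct ParkingExecutor;

impl TaskExecutor for ParkingExecutor {
    fn block_on<T>(&self, fut: impl Future<Output = T>) -> T {
        let mut fut = pin!(fut);
        let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
        let mut cx = Context::from_waker(&waker);
        loop {
            match fut.as_mut().poll(&mut cx) {
                Poll::Ready(value) => return value,
                // Sleep until the waker unparks this thread, then poll again.
                Poll::Pending => thread::park(),
            }
        }
    }
}

/// Hypothetical synchronous interface that a core library would call.
/// The `fetch_` prefix signals that the call may perform I/O.
trait FileReader {
    fn fetch_len(&self, path: &str) -> usize;
}

/// Engine-side implementation: async inside, sync at the boundary.
struct AsyncBackedReader<E: TaskExecutor> {
    executor: E,
}

impl<E: TaskExecutor> AsyncBackedReader<E> {
    /// Placeholder for real async I/O; it just returns the path length.
    async fn fetch_len_async(path: &str) -> usize {
        path.len()
    }
}

impl<E: TaskExecutor> FileReader for AsyncBackedReader<E> {
    fn fetch_len(&self, path: &str) -> usize {
        self.executor.block_on(Self::fetch_len_async(path))
    }
}

fn main() {
    let reader = AsyncBackedReader { executor: ParkingExecutor };
    println!("{}", reader.fetch_len("part-0.parquet"));
}
```

The caller of `fetch_len` never sees a future or a runtime, so swapping the executor is a change local to the engine implementation.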

Tips

  • When developing, rust-analyzer is your friend. Install it with: rustup component add rust-analyzer
  • If using emacs, both eglot and lsp-mode provide excellent integration with rust-analyzer. rustic is a nice mode as well.
  • When developing in VS Code, it's sometimes convenient to configure rust-analyzer in .vscode/settings.json:
{
  "editor.formatOnSave": true,
  "rust-analyzer.cargo.features": ["default-engine", "acceptance"]
}
  • The crate's documentation can be easily reviewed with: cargo doc --open

delta-kernel-rs's People

Contributors

nicklan, roeap, ryan-johnson-databricks, zachschuermann, rtyler, wjones127, hntd187, scovich, sherlockbeard, scarman-db, abhiaagarwal, abrassel, nkarpov, blajda, dennyglee, tlm365
