oyzg / cnosdb

This project forked from cnosdb/cnosdb

An Open Source Distributed Time Series Database with high performance, high compression ratio and high usability.

Home Page: https://www.cnosdb.com

License: MIT License

Rust 99.88% Makefile 0.12%

cnosdb's Introduction

CnosDB Isipho Road Map

Design Objectives of CnosDB Isipho

The goal is to design and develop a high-performance, high-compression-ratio, highly available, distributed, cloud-native time series database that meets the following objectives.

Storage

  1. Separate storage from computation; support theoretically unbounded growth in the number of time series; support horizontal and vertical scaling.
  2. Balance performance against cost: high-performance I/O, a Run-to-Completion scheduling model, and hierarchical storage backed by object storage.
  3. Offer lossy compression with reduced precision at the user's option.

Query

  1. Implement the query engine on Apache Arrow and DataFusion.
  2. Support vectorized execution in the query engine for complex query statements.
  3. Support standard SQL, Flux, rich aggregate queries, and arithmetic.

Ecology

  1. Design for multi-tenancy, exposing more configuration parameters so that resources can be configured more flexibly.
  2. CDC: the WAL can be subscribed to and distributed to other nodes, allowing more flexible deployment and support.
  3. Be compatible with the K8s ecosystem, reducing shared memory.
  4. Integrate with other data ecosystems by supporting import/export of Parquet files.
  5. Be compatible with major international and domestic public cloud ecosystems.

Module division of CnosDB Isipho

Storage Engine

Important modules of the storage engine:

WAL: write-ahead log, used to recover the Memcache after downtime.

Memcache: in-memory cache of data, made up of the memtable and the immutable memtable (immutmemtable).

TSM: columnar storage format for time series data.

Summary (TSM's metadata): metadata file generated when the TSM file version changes, used to recover data.

Versionset: global view of the KV store, similar to a storage manager.

Tsfamily: column family of series, the basic unit of the LSM tree.

Compression of TSM: supports compression for multiple field types.

Lossy compression of TSM: supports data compression with reduced data precision.
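
As an illustration of what "compression with reduced data precision" can mean, here is a minimal sketch, not CnosDB's actual codec. It assumes a simple scheme: round each float to a user-chosen number of decimal places, store it as an integer, then delta-encode the integers so that slowly changing series become small numbers. The function names (`quantize`, `dequantize`, `delta_encode`, `delta_decode`) are hypothetical.

```rust
/// Lossy step (hypothetical helper): keep only `decimals` decimal places.
/// The original value is recoverable only to within 0.5 * 10^-decimals.
fn quantize(values: &[f64], decimals: u32) -> Vec<i64> {
    let scale = 10f64.powi(decimals as i32);
    values.iter().map(|v| (v * scale).round() as i64).collect()
}

/// Inverse of `quantize`, up to the precision that was kept.
fn dequantize(quantized: &[i64], decimals: u32) -> Vec<f64> {
    let scale = 10f64.powi(decimals as i32);
    quantized.iter().map(|q| *q as f64 / scale).collect()
}

/// Lossless step: store the first value, then differences between neighbours.
fn delta_encode(values: &[i64]) -> Vec<i64> {
    let mut prev = 0i64;
    values
        .iter()
        .map(|v| {
            let delta = v - prev;
            prev = *v;
            delta
        })
        .collect()
}

/// Inverse of `delta_encode`: a running sum restores the original integers.
fn delta_decode(deltas: &[i64]) -> Vec<i64> {
    let mut acc = 0i64;
    deltas
        .iter()
        .map(|d| {
            acc += d;
            acc
        })
        .collect()
}
```

With two decimal places, the readings 20.514, 20.521, 20.533 become the integers 2051, 2052, 2053, which delta-encode to 2051, 1, 1. A general-purpose integer codec would then have much less entropy to deal with, at the cost of the discarded digits.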

Operations of the storage engine:

  1. Write operation: gRPC -> WAL -> Memcache.
  2. Read operation: supports point queries and range queries; can read data from the memtable and from TSM files.
  3. Token delete: data in TSM files is removed during compaction, while memtable data is cleared in real time.
  4. Flush: the immutable cache is flushed into smaller TSM files at level L0.
  5. Compact: merge TSM files after deletion.
  6. Other: configuration file; settings can be read from environment variables and from the configuration file.
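
The write and read paths listed above can be sketched as follows. This is a toy model under stated assumptions, not the engine's real code: the `Engine` and `Point` types, the 24-byte WAL record layout, and the point-count threshold for freezing the memtable are all hypothetical, and the gRPC layer and the flush to TSM files are left out.

```rust
use std::collections::BTreeMap;
use std::io::{self, Write};

/// One data point (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
struct Point {
    series_id: u64,
    timestamp: i64,
    value: f64,
}

/// Sorted in-memory table keyed by (series id, timestamp).
type Table = BTreeMap<(u64, i64), f64>;

struct Engine<W: Write> {
    wal: W,                // append-only log, written before the memtable
    memtable: Table,       // mutable in-memory table
    immutable: Vec<Table>, // frozen tables waiting to be flushed (oldest first)
    max_memtable_points: usize,
}

impl<W: Write> Engine<W> {
    fn new(wal: W, max_memtable_points: usize) -> Self {
        Engine { wal, memtable: Table::new(), immutable: Vec::new(), max_memtable_points }
    }

    /// Write path: WAL first, then memtable, then freeze the memtable if full.
    fn write(&mut self, p: &Point) -> io::Result<()> {
        self.wal.write_all(&p.series_id.to_le_bytes())?;
        self.wal.write_all(&p.timestamp.to_le_bytes())?;
        self.wal.write_all(&p.value.to_le_bytes())?;
        self.wal.flush()?;

        self.memtable.insert((p.series_id, p.timestamp), p.value);

        if self.memtable.len() >= self.max_memtable_points {
            let full = std::mem::take(&mut self.memtable);
            self.immutable.push(full);
        }
        Ok(())
    }

    /// Range query over frozen tables and the memtable; newer writes win.
    fn read_range(&self, series_id: u64, start: i64, end: i64) -> Vec<(i64, f64)> {
        let mut merged = BTreeMap::new();
        let tables = self.immutable.iter().chain(std::iter::once(&self.memtable));
        for table in tables {
            for (&(_, ts), &v) in table.range((series_id, start)..=(series_id, end)) {
                merged.insert(ts, v); // later tables overwrite earlier ones
            }
        }
        merged.into_iter().collect()
    }
}
```

Writing to the WAL before touching the memtable is what makes recovery after downtime possible: replaying the log rebuilds whatever in-memory state had not yet been flushed.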

Query Engine

  1. Implement the catalog provider
  2. Schema storage
  3. Assemble the data in TSM files into Arrow RecordBatches.
  4. Support table scans that parse the data in TSM files.
  5. Maintain system tables and the information schema for data statistics.
  6. Support orthogonal index
  7. Support inverted index
  8. Support custom indexing
  9. Support a database management module
  10. Query optimization

Basic libraries built in-house for Isipho

  1. fs: direct I/O with a user-space cache.
  2. schedule: Run-to-Completion model.
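
One common reading of "Run-to-Completion" is that a worker takes a task from its queue and runs it to the end without preemption before picking up the next one. The sketch below shows that idea on a single thread. It is an assumption about the model, not a description of the Isipho `schedule` library, and the `Scheduler` type is hypothetical.

```rust
use std::collections::VecDeque;

/// A task receives the scheduler so that it can enqueue follow-up work.
type Task = Box<dyn FnOnce(&mut Scheduler)>;

struct Scheduler {
    queue: VecDeque<Task>,
    log: Vec<String>, // records execution order, for demonstration only
}

impl Scheduler {
    fn new() -> Self {
        Scheduler { queue: VecDeque::new(), log: Vec::new() }
    }

    /// Enqueue a task; it will not start until earlier tasks have finished.
    fn spawn(&mut self, task: impl FnOnce(&mut Scheduler) + 'static) {
        self.queue.push_back(Box::new(task));
    }

    /// Run every task to completion, one at a time, in FIFO order.
    /// Returns the number of tasks executed.
    fn run(&mut self) -> usize {
        let mut executed = 0;
        while let Some(task) = self.queue.pop_front() {
            task(&mut *self); // no preemption: the task runs until it returns
            executed += 1;
        }
        executed
    }
}
```

Because a task is never interrupted halfway, tasks on the same worker do not need locks to protect the state they share, which is one reason this model is popular in storage engines.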

Distributed system

  1. Native distributed system written in Rust
    1. Distributed architecture: overall framework design with separation of compute and storage
    2. Data sharding rules based on consistent hashing
    3. Data migration when scaling out and in
    4. Disaster-tolerant failover
    5. Replica synchronization
    6. Operation and maintenance tools
    7. Data backup and restore
  2. Achieve eventual consistency
    1. Hinted handoff
    2. Read repair
    3. Tombstone mechanism
    4. Anti-entropy
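
The consistent-hashing rule mentioned in the list above can be sketched with a sorted ring of virtual nodes. This is a generic textbook version under stated assumptions: the `HashRing` type, the use of the standard library's `DefaultHasher`, and the number of virtual nodes are illustrative choices, not CnosDB's placement logic.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hash any value to a position on the ring.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

struct HashRing {
    ring: BTreeMap<u64, String>, // ring position -> node name
    vnodes: u32,                 // virtual nodes per physical node
}

impl HashRing {
    fn new(vnodes: u32) -> Self {
        HashRing { ring: BTreeMap::new(), vnodes }
    }

    /// Place `vnodes` points for this node on the ring.
    fn add_node(&mut self, node: &str) {
        for i in 0..self.vnodes {
            self.ring.insert(hash_of(&(node, i)), node.to_string());
        }
    }

    /// Remove this node's points (ignores the unlikely case of hash collisions).
    fn remove_node(&mut self, node: &str) {
        for i in 0..self.vnodes {
            self.ring.remove(&hash_of(&(node, i)));
        }
    }

    /// The owner of a key is the first node at or after the key's position,
    /// wrapping around to the start of the ring.
    fn node_for(&self, key: &str) -> Option<&str> {
        let position = hash_of(&key);
        self.ring
            .range(position..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, node)| node.as_str())
    }
}
```

The property that matters for scaling is that removing or adding a node only moves the keys that node owned; every other key keeps its owner, which keeps data migration small.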

Ecology

CnosDB Isipho will support and be compatible with many established ecosystems.

  1. CnosDB Isipho must support

    1. Line protocol
    2. HTTP interface
  2. Ecosystem partner products that need to be supported

    1. Third-party data stores that are heterogeneous with CnosDB
    2. Telegraf
    3. InfluxDB
    4. Prometheus
    5. Timescales
    6. Kafka/Pulsar
    7. MySQL/PgSQL
  3. Data management tools, including engine-layer tools

    1. Data file backup (by DB, shard, time period, etc.) and restore
    2. Export to line protocol (by DB, shard, or time period) and bulk import
    3. Disk file analysis tools (analysis of TSM, WAL, and index files; index reconstruction; file validity checks; etc.)
    4. Data deletion along multiple dimensions: DB, table, shard, time period
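
For readers unfamiliar with the line protocol listed above, the sketch below parses a simplified subset of the InfluxDB-style format `measurement,tag=value field=value timestamp`. It is an illustration only: it assumes no escaping or quoting, treats every field value as a float, and the `Line` type and `parse_line` function are hypothetical names, not part of CnosDB.

```rust
/// One parsed line (simplified: float fields only, no escapes or quotes).
#[derive(Debug, PartialEq)]
struct Line {
    measurement: String,
    tags: Vec<(String, String)>,
    fields: Vec<(String, f64)>,
    timestamp: Option<i64>,
}

fn parse_line(line: &str) -> Result<Line, String> {
    // The three sections are separated by whitespace.
    let mut sections = line.split_whitespace();
    let head = sections.next().ok_or("empty line")?;
    let field_section = sections.next().ok_or("missing field set")?;
    let timestamp = match sections.next() {
        Some(ts) => Some(ts.parse::<i64>().map_err(|e| e.to_string())?),
        None => None,
    };

    // Head: measurement name followed by optional comma-separated tags.
    let mut head_items = head.split(',');
    let measurement = head_items.next().unwrap_or("").to_string();
    if measurement.is_empty() {
        return Err("missing measurement".to_string());
    }
    let mut tags = Vec::new();
    for item in head_items {
        let (key, value) = item
            .split_once('=')
            .ok_or_else(|| format!("bad tag: {}", item))?;
        tags.push((key.to_string(), value.to_string()));
    }

    // Field set: comma-separated key=value pairs.
    let mut fields = Vec::new();
    for item in field_section.split(',') {
        let (key, value) = item
            .split_once('=')
            .ok_or_else(|| format!("bad field: {}", item))?;
        let number: f64 = value
            .parse()
            .map_err(|_| format!("bad field value: {}", value))?;
        fields.push((key.to_string(), number));
    }

    Ok(Line { measurement, tags, fields, timestamp })
}
```

For example, `cpu,host=server01,region=eu usage=0.64,idle=99.0 1656633600000000000` parses into the measurement `cpu`, two tags, two fields, and a nanosecond timestamp.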

Testing support

  1. Basic testing
  2. Continuous stress testing
  3. Cluster full-featured testing framework
  4. Chaos testing

CnosDB Isipho Schedule

202207 JULY GO

kv:

  1. Read operation: filter out deleted data and add test cases
  2. Complete the delete operation and add test cases
  3. Merge iterator for compaction
  4. Support producing delta files for discrete point data
  5. Add more basic test cases; build CI that runs automatically as an admission gate; add stress tests
  6. Configuration file; support reading settings from environment variables and from the configuration file
  7. Optimize away unnecessary locks (long-term work)
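
The merge iterator for compaction in the list above is typically a k-way merge over sorted inputs. The sketch below is a minimal version under stated assumptions: each input run is a vector of `(timestamp, value)` pairs with strictly increasing timestamps, later runs are newer, and the newest value wins when timestamps collide. The function name `merge_runs` is hypothetical, and filtering of deleted data is left out.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merge sorted runs into one sorted run; on duplicate timestamps the value
/// from the run with the highest index (assumed newest) is kept.
fn merge_runs(runs: &[Vec<(i64, f64)>]) -> Vec<(i64, f64)> {
    // Min-heap of (timestamp, run index, position in run). Values stay in
    // `runs` because f64 does not implement Ord.
    let mut heap = BinaryHeap::new();
    for (run_idx, run) in runs.iter().enumerate() {
        if let Some(&(ts, _)) = run.first() {
            heap.push(Reverse((ts, run_idx, 0usize)));
        }
    }

    let mut out: Vec<(i64, f64)> = Vec::new();
    while let Some(Reverse((ts, run_idx, pos))) = heap.pop() {
        let value = runs[run_idx][pos].1;

        // Equal timestamps pop in ascending run order, so overwriting the
        // last output entry leaves the newest run's value in place.
        let duplicate = out.last().map_or(false, |last| last.0 == ts);
        if duplicate {
            out.last_mut().unwrap().1 = value;
        } else {
            out.push((ts, value));
        }

        // Advance the cursor of the run we just consumed from.
        if let Some(&(next_ts, _)) = runs[run_idx].get(pos + 1) {
            heap.push(Reverse((next_ts, run_idx, pos + 1)));
        }
    }
    out
}
```

Only one cursor per run lives in the heap at any time, so memory use grows with the number of runs being compacted rather than with their total size.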

query:

  1. Implement the catalog provider
  2. Schema storage, integrated with the Arrow schema
  3. Assemble the data in TSM files into Arrow RecordBatches

202208

202209

202210

202211

202212

cnosdb's People

Contributors

roseboy-liu, zipper-meng, subsegment, vrg000, cnoshb, gilbert-liang, heropku
