srotya / sidewinder
Fast and scalable timeseries database
Home Page: http://sidewinder.srotya.com
License: Apache License 2.0
Summary:
Create a Ranger plugin to support access control via Apache Ranger
Description:
Apache Ranger provides a sophisticated access control layer that integrates well with other big data projects such as Hive and HDFS. Adding a Ranger plugin will let Sidewinder leverage Apache Ranger so that its ACLs can be managed centrally.
Documentation on creating a Ranger plugin: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=53741207
Note: The Ranger Plugin MUST be created as a separate child project from sidewinder-parent
Disk storage engine persists Sidewinder data on disk and provides an option for persistent use cases where write throughput can be traded for durability.
A REST API is needed to delete tags from the tag index so that storage space can be freed up.
Currently the following APIs are supported:
DELETE /databases (com.srotya.sidewinder.core.api.DatabaseOpsApi)
GET /databases (com.srotya.sidewinder.core.api.DatabaseOpsApi)
DELETE /databases/{dbName} (com.srotya.sidewinder.core.api.DatabaseOpsApi)
GET /databases/{dbName} (com.srotya.sidewinder.core.api.DatabaseOpsApi)
PUT /databases/{dbName} (com.srotya.sidewinder.core.api.DatabaseOpsApi)
GET /databases/{dbName}/check (com.srotya.sidewinder.core.api.DatabaseOpsApi)
DELETE /databases/{dbName}/measurements/{measurementName} (com.srotya.sidewinder.core.api.MeasurementOpsApi)
GET /databases/{dbName}/measurements/{measurementName} (com.srotya.sidewinder.core.api.MeasurementOpsApi)
PUT /databases/{dbName}/measurements/{measurementName} (com.srotya.sidewinder.core.api.MeasurementOpsApi)
GET /databases/{dbName}/measurements/{measurementName}/check (com.srotya.sidewinder.core.api.MeasurementOpsApi)
GET /databases/{dbName}/measurements/{measurementName}/fields (com.srotya.sidewinder.core.api.MeasurementOpsApi)
GET /databases/{dbName}/measurements/{measurementName}/fields/{value} (com.srotya.sidewinder.core.api.MeasurementOpsApi)
PUT /databases/{dbName}/measurements/{measurementName}/series (com.srotya.sidewinder.core.api.MeasurementOpsApi)
GET /databases/{dbName}/measurements/{measurementName}/series/count (com.srotya.sidewinder.core.api.MeasurementOpsApi)
PUT /databases/{dbName}/measurements/{measurementName}/series/retention/{retentionPolicy} (com.srotya.sidewinder.core.api.MeasurementOpsApi)
POST /databases/{dbName}/query (com.srotya.sidewinder.core.api.DatabaseOpsApi)
POST /influx (com.srotya.sidewinder.core.api.InfluxApi)
POST /sql/database/{dbName} (com.srotya.sidewinder.core.api.SqlApi)
Summary:
Support other authentication mechanisms besides basic auth.
Description:
Currently only basic authentication is supported. This ticket requests support for additional authentication mechanisms across the entire Sidewinder HTTP API.
Problem:
Time series buckets are written in fixed-size buffer chunks, a standard memory allocation technique. As a result, a single time bucket can become fragmented over time. Reading across different slices of the same bucket may therefore not be sequential in terms of the actual in-memory locality of the data. Additionally, the buffers may not be compact enough, wasting substantial disk space.
Compaction:
In Sidewinder, compaction is the process of merging these fragmented buffers, optionally using a better compression algorithm, to reduce the amount of disk space used and improve linear read performance.
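As an illustration of the merge step only (a hypothetical sketch, not Sidewinder's actual code; the optional re-compression with a better codec is elided), the fragmented, partially filled buffers of one time bucket could be copied into a single compact buffer so that reads become one sequential scan:

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical sketch of the buffer-merge step of compaction.
public class BucketCompactor {

    // Copies the remaining bytes of each fragment into one compact buffer.
    public static ByteBuffer merge(List<ByteBuffer> fragments) {
        int total = 0;
        for (ByteBuffer b : fragments) {
            total += b.remaining();
        }
        ByteBuffer compact = ByteBuffer.allocate(total);
        for (ByteBuffer b : fragments) {
            // duplicate() leaves the source buffer's position untouched
            compact.put(b.duplicate());
        }
        compact.flip(); // make the merged buffer readable from the start
        return compact;
    }
}
```

In a real compaction pass the merged bytes would then be re-encoded with the chosen compression algorithm before the old fragments are released.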
A coordinator implementation (with Atomix) was recently released to the master branch. It needs to be extended to support measurement clustering. This implementation will involve:
Summary:
Add support for Spark DataFrames
Description:
Add support for Spark DataFrames
The time unit dropdown for the Grafana plugin doesn't work; the time unit is hard-coded to seconds.
Summary:
Garbage collector in new DiskStorageEngine is broken.
Description:
New Disk Storage Engine garbage collector only clears pointers from memory but doesn't actually delete files from disk.
The disk storage engine uses too much heap for the number of objects created. This caps the number of data points that can be stored, since the number of TimeSeriesBucket objects is limited by heap size.
Summary:
There's currently no official benchmark harness for Sidewinder; please add one.
Description:
The benchmark harness should provide a standard set of tools to run read/write performance benchmarks so users can validate that their hardware and software configuration is suitable for their use case.
Summary:
Refactor Grafana Plugin
Description:
To publish the Grafana plugin to the plugin repository, it needs to be moved to a separate project. grafana/grafana-plugin-repository#144
Summary:
Support for multiple data directories
Description:
The disk storage engine currently supports only a single data directory via the data.dir configuration. Add support for sharding data across multiple disk drives to remove any I/O bottlenecks.
Summary:
Support for API Authentication (basic)
Description:
Provide support for authentication on the REST API to allow / deny access to the databases.
Sidewinder needs an extensive test suite to ensure data points are not dropped and data corruption does not occur even in the face of aggressive failures.
Summary:
Make time bucket constant for series configurable & persist metadata for disk storage
Description:
The time bucket is currently a constant set to 4096 seconds, which makes it difficult to store historical data due to the max open file limit. Until #31 addresses this problem, as a stop-gap measure we need this to be configurable via external configuration and, if possible, on a database-by-database basis using a REST API.
Internal CPU and memory metrics for Sidewinder stop after a few minutes, indicating either a deadlock or a data drop somewhere.
Build a simple Netty-based server for Sidewinder that accepts the Graphite protocol (TCP & UDP) and acts as a proxy, forwarding data to Sidewinder over the GRPC protocol.
Netty for both HTTP and binary ingestion has performance issues, including:
Summary:
Autocomplete in Grafana measurement selector is not working due to an NPE.
Description:
Autocomplete in Grafana measurement selector is not working due to an NPE.
Create a naive master slave clustering system.
A master-slave clustering system provides data replication and sharding for queries. In this model, the master node receives all writes, which are then replicated to one or more slave nodes. The slave nodes can be used for read-only operations like queries, allowing read scaling. Because this initial implementation is naive, the master won't automatically fail over using a leader election process. If the master is not functional, a slave must be manually promoted by applying the configuration and restarting.
Tags are currently not cleaned up even though their related time series may have been garbage collected. An automated way to clean up tags is needed.
Add archival for garbage-collected time series:
Disk Archival
HDFS Archival
S3 Archival
Use cases require an auto-correlate feature in Grafana so that series related to the queried series can automatically be pulled from the database for time series correlation.
The Graphite schema for the decoder is not correct.
The Graphite decoder should treat the last key as the value field name, the second-to-last key as the measurement, and the rest of the keys as tags.
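A minimal sketch of that mapping (hypothetical class and method names, not Sidewinder's actual decoder), assuming dotted Graphite keys such as `dc1.web01.cpu.usage`:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed Graphite key mapping: for
// "dc1.web01.cpu.usage", "usage" becomes the value field name, "cpu"
// the measurement, and the remaining tokens ("dc1", "web01") the tags.
public class GraphiteKeyDecoder {

    public static final class Decoded {
        public final String measurement;
        public final String valueField;
        public final List<String> tags;

        Decoded(String measurement, String valueField, List<String> tags) {
            this.measurement = measurement;
            this.valueField = valueField;
            this.tags = tags;
        }
    }

    // Returns null when the key has fewer than two tokens and can't be mapped.
    public static Decoded decode(String key) {
        String[] parts = key.split("\\.");
        if (parts.length < 2) {
            return null;
        }
        String valueField = parts[parts.length - 1];      // last key
        String measurement = parts[parts.length - 2];     // second-to-last key
        List<String> tags = new ArrayList<>(
                Arrays.asList(parts).subList(0, parts.length - 2)); // the rest
        return new Decoded(measurement, valueField, tags);
    }
}
```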
Need instructions / documentation on how to configure REST API for SSL encryption. Can SSL be supported for GRPC API?
Clustering for Sidewinder should ideally be verified and tested with Jepsen: https://jepsen.io/
Redesign storage engine to support hundreds of thousands of independent time series.
The current Sidewinder disk storage engine stores one unique time series bucket per file. The size of the bucket is configurable; however, this still imposes a restriction on how many unique time series a given server can hold, since the number of open files is limited.
While commit dc4d448 does try to avoid the max open files issue by closing files as soon as the MappedByteBuffer is created, this only pushes the envelope so far and the fundamental issue remains unresolved.
The LRU-based design proposal I created earlier can only mitigate the issue when there aren't many concurrent writes across time series. When there are, it would cause a lot of cache evictions, leading to frequent cache swapping and degraded performance.
Proposal
The new storage engine design proposes to decouple the compression and persistence responsibilities and combine multiple series into one file while keeping the concept of time series buckets. The whole design is based on a memory allocator that grants buffers to series buckets on request; these buffers are slices of a memory-mapped file segment. Once the file reaches a certain size, a new file is created and the existing file is closed. This redesign refactors many components in the StorageEngine while preserving the interface as much as possible, so there is minimal impact on the writer and reader components of the database. Additional testing is added as well to help improve the reliability of the system.
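The allocator idea can be sketched as follows (a simplified assumption about the design, not the actual implementation): one memory-mapped file segment is carved into fixed-size slices handed out to series buckets, and the rollover to a fresh file when the segment fills is left as a stub.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of a memory-mapped buffer allocator: grants
// fixed-size ByteBuffer slices of one mapped file segment on request.
public class MmapBufferAllocator {

    private final int sliceSize;
    private final MappedByteBuffer segment;
    private int offset; // next free position in the current segment

    public MmapBufferAllocator(Path file, int segmentSize, int sliceSize)
            throws IOException {
        this.sliceSize = sliceSize;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // the channel can be closed immediately; the mapping stays valid,
            // which also keeps the open-file count low
            this.segment = ch.map(FileChannel.MapMode.READ_WRITE, 0, segmentSize);
        }
    }

    // Returns a slice for a series bucket, or null when the segment is full
    // (the full design would roll over to a freshly mapped file here).
    public synchronized ByteBuffer allocate() {
        if (offset + sliceSize > segment.capacity()) {
            return null;
        }
        segment.position(offset);
        ByteBuffer slice = segment.slice();
        slice.limit(sliceSize);
        offset += sliceSize;
        return slice;
    }
}
```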
Limit maximum number of open files.
Create a Least Recently Used (LRU) based eviction system to automatically close data files that are not being written to or read from. Operating systems have a limit on the maximum number of open files; if a user / system exceeds it, an unrecoverable exception is thrown unless files are closed. The LRU-based module will prevent this exception by proactively keeping the process under the limit. This feature is especially helpful for series storing historical data.
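One minimal way to express such an eviction policy (a sketch under the assumption that file handles can be closed via a callback, not Sidewinder's actual module) is an access-ordered LinkedHashMap:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of LRU-based file handle eviction: once the number
// of tracked handles exceeds the cap, the least recently accessed handle
// is closed via the supplied callback.
public class LruFileTracker<T> {

    private final LinkedHashMap<String, T> open;

    public LruFileTracker(int maxOpen, Consumer<T> closer) {
        // accessOrder=true makes iteration order least-recently-used first
        this.open = new LinkedHashMap<String, T>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, T> eldest) {
                if (size() > maxOpen) {
                    closer.accept(eldest.getValue()); // close the evicted handle
                    return true;
                }
                return false;
            }
        };
    }

    // Record an access (read or write) to the file at the given path.
    public synchronized void access(String path, T handle) {
        open.put(path, handle);
    }

    public synchronized int size() {
        return open.size();
    }
}
```

In practice the callback would close a RandomAccessFile or FileChannel; reopening on the next access would be handled by the caller.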
Validate that the Ambari stack works for deployments, fix any issues, and provide documentation on how to use it.
CentOS
Ubuntu
Provide init and installation scripts for Ubuntu and CentOS
Disk storage engine doesn't garbage collect time series when they expire.
Compaction causes data corruption.
The GRPC API currently has no authentication layer. This will be requested by users trying to run Sidewinder in secure environments.
Summary:
Add caching for queries
Description:
TBD
The cluster WAL doesn't write compressed data at the moment; compression reduces the size of the WAL and will reduce the disk space required to operate it.
This feature should support a pluggable compression algorithm for WAL byte compression. Essentially, it should compress and uncompress the byte[] payload that is written to and read from the WAL.
Note: remote operations shouldn't cause decompression of the data.
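The pluggable codec could take a shape like the following (a hypothetical interface, not Sidewinder's actual API, with GZIP purely as an example algorithm): any implementation must round-trip the byte[] payload, and because the payload stays an opaque byte[], remote operations can ship the compressed bytes as-is without decompressing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical pluggable WAL codec: implementations must round-trip
// the byte[] payload written to and read from the WAL.
public interface WalCompressor {

    byte[] compress(byte[] payload) throws IOException;

    byte[] uncompress(byte[] payload) throws IOException;

    // Example implementation using the JDK's built-in GZIP streams.
    WalCompressor GZIP = new WalCompressor() {
        @Override
        public byte[] compress(byte[] payload) throws IOException {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
                gz.write(payload);
            }
            return bos.toByteArray();
        }

        @Override
        public byte[] uncompress(byte[] payload) throws IOException {
            GZIPInputStream gz =
                    new GZIPInputStream(new ByteArrayInputStream(payload));
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) > 0) {
                bos.write(buf, 0, n);
            }
            return bos.toByteArray();
        }
    };
}
```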
Sidewinder has so far been a single-instance database with a placeholder for clustering. Without clustering, linear scaling of this TSDB is difficult and has to be manually orchestrated.
Cluster should provide:
Dynamic addition and removal of nodes
This feature should allow Sidewinder instances to be added and removed on the fly with zero downtime, allowing only for partially degraded performance during node addition and removal due to intensive replication / data copying in the background.
Data replication
Each unique time series should be replicable to multiple nodes to ensure fault tolerance, i.e. availability (the A in CAP); the number of nodes a time series is replicated to should be determined by the global, database-level replication policy. At the very least, the database must provide a cluster-level replication policy, with optional per-database replication granularity.
Data sharding
Sharding provides linear scaling, which is especially important for a high-throughput database like Sidewinder. Sharding should ideally be performed with a variant of consistent hashing to ensure availability and partition tolerance (AP in CAP).
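A consistent hashing router could be sketched as follows (an illustrative assumption, not Sidewinder's actual sharding code): each node is placed at several virtual points on a hash ring, a series key maps to the first node clockwise from its hash, and adding or removing a node only remaps the keys adjacent to it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.TreeMap;

// Hypothetical consistent-hash ring for routing series keys to nodes.
public class ConsistentHashRing {

    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    public ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    public void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    public void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(node + "#" + i));
        }
    }

    // First node clockwise from the key's hash, wrapping around the ring.
    public String nodeFor(String seriesKey) {
        if (ring.isEmpty()) {
            return null;
        }
        Long point = ring.ceilingKey(hash(seriesKey));
        return ring.get(point != null ? point : ring.firstKey());
    }

    // First 8 bytes of an MD5 digest give a well-distributed ring position.
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xff);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```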
Request balancing
The database must not require clients to be "smart", i.e. clients should not need to be aware of the complexity of Sidewinder's clustering internals, so that multiple types of clients can easily be plugged in to ingest data. Thus, the instances themselves should be able to proxy client requests to the appropriate nodes; this way all machines can be hit evenly by clients, and the HTTP interface in particular can be fronted by a round-robin load balancer.
Summary:
Add linearizability tests for Storage Engine.
Description:
Sidewinder, like any other database, supports concurrent access, for both reads and writes. It's important for the database to guarantee linearizability to ensure there are no concurrency bugs.
Tests should cover the following:
When the PTR file is resized in the new memory-mapped pointer file module, the file gets corrupted; this causes measurement recovery to fail after Sidewinder is shut down.
Grafana queries don't support regular expressions, which prevents more concise queries from being expressed for the value field name and measurement name.
Create a basic framework that can be expanded to provide sophisticated clustering for Sidewinder. As a part of this feature, create a clustering project and provide the following capabilities:
Node discovery mechanism
Node discovery allows machines to automatically discover nodes in a cluster, simplifying system maintenance via a seed machine. Multiple implementations should be provided so end users can choose between manual and automated discovery. The Atomix project is selected as the clustering / discovery engine since it provides Raft-based consensus and leader election in an embedded setup, which can be extended for sharding and scaling features in the future.
Cluster RPC mechanism
This provides a simple framework for nodes to communicate with each other. GRPC is currently selected for this purpose; it uses protocol specs written in Protobuf (see next section) and HTTP as the actual transport layer. GRPC was chosen for the simplicity of its transport mechanism and its ability to support authentication and authorization at the protocol layer without extensive effort. GRPC doesn't provide very high performance compared to other transports, but the saved engineering effort and the amortized time cost when using batch calls justify its current use. A future enhancement would be to evaluate the performance differences between GRPC and other transports to see whether changing the system is worthwhile.
Internode protocols
Internode protocols are a more efficient mechanism for nodes to communicate with each other compared to the client facing interface. Internode protocols are written in Google Protobuf to allow simple definition of protocols that can later be translated to provide clients in other languages as well.
Basic data routing system
The basic routing system provides a way for data points to be routed to the multiple machines that are members of the Sidewinder cluster. The reason for supporting this is to allow simple, lightweight clients to leverage a Sidewinder cluster. The routing engine will allow nodes to proxy client requests to the appropriate machines in the cluster.
Summary:
Basic GRPC Server for Ingestion and Queries based on protocol already established
Description:
GRPC is the binary protocol standard in Sidewinder for clustering, so it will be helpful to extend it to support binary protocol writes from clients. Queries should be supported via this interface as well.
Support freeform query syntax for Grafana and the REST APIs to enable templated dashboards.
Summary:
Create a connector to pull data into Apache Spark for Analytics
Description:
Sidewinder's data should be accessible to analytical tools like Spark for performing ML or batch analytics.
Tag key-value pair separation has slowed the database by about 20%. Additionally, Sidewinder's memory utilization needs to be reduced so that heap is freed up to scale and store a larger number of series.
Add logging configuration to the RPM, init.d scripts, etc.
Summary:
Add SQL Support with JDBC Driver
Description:
Note: This functionality was temporarily disabled