Distributed "massively parallel" SQL query engine

Home Page: http://eventql.io/

License: Other

Makefile 0.96% Shell 0.10% HTML 11.23% M4 0.52% C++ 85.38% Protocol Buffer 0.47% JavaScript 0.31% C 0.75% CMake 0.08% Ruby 0.05% Rust 0.01% Python 0.13%
database sql timeseries mpp cpp cpp11 analytics columnar-storage streaming eventql

eventql's Introduction

EventQL

EventQL is a distributed, columnar database built for large-scale data collection and analytics workloads. It can handle a large volume of streaming writes and runs super-fast SQL and MapReduce queries.

More information: Documentation, Download, Architecture, Getting Started

Features

This is a quick run-through of EventQL's key features to get you excited. For more detailed information on these topics and their caveats you are kindly referred to the documentation.

  • Automatic partitioning. Tables are transparently split into partitions using a primary key and distributed among many machines. You don't have to configure the number of shards upfront. Just insert your data and EventQL handles the rest.

  • Idempotent writes. Supports primary-key based INSERT, UPSERT and DELETE operations. You can use the UPSERT operation for easy exactly-once ingestion from streaming sources.

  • Compact, columnar storage. The columnar storage engine allows EventQL to drastically reduce its I/O footprint and execute analytical queries orders of magnitude faster than row-oriented systems.

  • Standard SQL support. (Almost) complete SQL 2009 support. (It does JOINs!) Queries are also automatically parallelized and executed across many machines.

  • Scales to petabytes. EventQL distributes all table partitions and queries among a number of equally privileged servers. Given enough machines you can store and query thousands of terabytes of data in a single table.

  • Streaming, low-latency operations. You don't have to batch-load data into EventQL - it can handle large volumes of streaming insert and update operations. All mutations are immediately visible and minimal SQL query latency is ~0.1ms.

  • Timeseries and relational data. The automatic partitioning supports timeseries as well as relational and key value data, as long as there is a good primary key. The storage engine also supports REPEATED and RECORD types so arbitrary JSON objects can be inserted into rows.

  • HTTP API. The HTTP API allows you to use query results in any application and easily send data from any application or device. EventQL also supports a native TCP-based protocol.

  • Fast range scans. Table partitions in EventQL are ordered and have a defined keyrange, so you can perform efficient range scans on parts of the keyspace.

  • Hardware efficient. EventQL is implemented in modern C++ and tries to achieve maximal performance on commodity hardware by using vectorized execution and SSE instructions.

  • Highly Available. The shared-nothing architecture of EventQL is highly fault tolerant. A cluster consists of many, equally privileged nodes and has no single point of failure.

  • Self-contained. You can set up a new cluster in minutes. The EventQL server ships as a single binary and has no external dependencies except ZooKeeper or a similar coordination service.
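
To make the partitioning, idempotent-write and range-scan features above more concrete, here is a minimal single-process sketch in Python. It is a toy model, not EventQL's implementation: the class name `RangePartitionedTable`, the split threshold and the split-in-half policy are all assumptions made for illustration.

```python
import bisect

class RangePartitionedTable:
    """Toy model: partitions are ordered by key range and split when they grow."""

    def __init__(self, max_rows_per_partition=4):
        self.max_rows = max_rows_per_partition  # hypothetical split threshold
        # Each partition is a sorted list of (key, row) pairs; the partitions
        # themselves are kept in key order.
        self.partitions = [[]]

    def _find_partition(self, key):
        # Every partition after the first starts at its smallest key.
        starts = [p[0][0] for p in self.partitions[1:]]
        return bisect.bisect_right(starts, key)

    def upsert(self, key, row):
        idx = self._find_partition(key)
        part = self.partitions[idx]
        pos = bisect.bisect_left([k for k, _ in part], key)
        if pos < len(part) and part[pos][0] == key:
            part[pos] = (key, row)  # same primary key: overwrite, so replays are harmless
        else:
            part.insert(pos, (key, row))
        if len(part) > self.max_rows:
            # Split the oversized partition in half; no shard count is configured upfront.
            mid = len(part) // 2
            self.partitions[idx:idx + 1] = [part[:mid], part[mid:]]

    def scan(self, lo, hi):
        """Yield rows with lo <= key < hi, starting at the first overlapping partition."""
        for part in self.partitions[self._find_partition(lo):]:
            for k, row in part:
                if k >= hi:
                    return
                if k >= lo:
                    yield row
```

Replaying the same `upsert` twice leaves the table unchanged, which is the property that makes exactly-once ingestion from a retrying stream straightforward. Because partitions are ordered, `scan` can skip every partition that lies entirely below `lo` and stop as soon as it sees a key at or above `hi`.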

Use Cases

Here are a few example scenarios that are particularly well suited to EventQL's design:

  • Storage and analysis of streaming event, timeseries or relational data
  • High volume event and sensor data logging
  • Joining and correlating timeseries data with relational tables

Non-goals

Note that EventQL is built around specific design choices that make it an excellent fit for real-time data analytics processing (OLAP) tasks, but also mean it's not well suited for most transactional (OLTP) workloads.

Build

Before we can start we need to install some build dependencies. Currently you need a modern C++ compiler, libz, autotools and Python (for spidermonkey/mozbuild).

# Ubuntu
$ apt-get install clang make automake autoconf libtool zlib1g-dev

# OSX
$ brew install automake autoconf libtool

To build EventQL from a distribution tarball:

$ ./configure
$ make
$ sudo make install

To build EventQL from a git checkout:

$ git clone git@github.com:eventql/eventql.git
$ cd eventql
$ ./autogen.sh
$ ./configure
$ make V=1
$ src/evql -h

To run the full (world) test suite:

$ make test

To run the quick (smoke) test suite:

$ make smoketest

eventql's People

Contributors

andoriyu, asmuth, asraelite, bartleusink, christianparpart, dolfly, earvinkayonga, jyrkiput, lauraschlimmer, pippijn, pvmsikrsna, simplepi


eventql's Issues

add --use-system-<lib> to configure script

the option should disable building of the bundled dependency and use the system-installed libraries instead.

currently the following libraries are bundled:

  • pcre
  • lmdb
  • linenoise
  • protobuf
  • spidermonkey (this will be a tricky one as the API changes a lot between releases. might skip it for now)
  • simdcomp (I don't think any distribution has a package for this one, so let's skip it for now)

implement unique table ids

  • add a table_id field to the TableConfig
  • default the field to the current table name
  • for new tables, initialize the field with a random string
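
The three steps above could look roughly like the following Python sketch. The real TableConfig is not shown in this issue, so the dataclass, the helper names `effective_table_id` and `new_table_config`, and the 16-byte hex id are assumptions for illustration only.

```python
import secrets
from dataclasses import dataclass
from typing import Optional

@dataclass
class TableConfig:
    # Hypothetical stand-in for the real TableConfig message.
    table_name: str
    table_id: Optional[str] = None  # missing in configs written before this change

def effective_table_id(config: TableConfig) -> str:
    """Existing tables without an id default to their current table name."""
    return config.table_id if config.table_id else config.table_name

def new_table_config(table_name: str) -> TableConfig:
    """New tables are initialized with a random id (length is an assumption)."""
    return TableConfig(table_name=table_name, table_id=secrets.token_hex(16))
```

With this shape, two tables created under the same name at different times end up with different ids, while tables that predate the field keep resolving to their name.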

chart/draw statement improvements

  • LINE/AREACHART -> force order by x column // order by x column in chart (!)
  • enforce series column existence if more than one select statement

misc QueryPlan{,Builder} interface cleanups

  • rename QueryPlanBuilder -> qtree_builder
  • QueryTree container ds w/ ownership of QTreeNodes
  • qtree interface -> rename statement to query
  • qtree->scheduler: pass queryfuture

improved /api/v1/sql interface

accept the following input encodings:

  • application/json -> normal JSON api request, reads a json message from the HTTP POST body that contains the sql query
  • application/sql -> read the SQL query text directly from the HTTP POST body
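
A minimal sketch of the content-type dispatch described above, in Python. The function name `extract_sql` and the JSON field name `query` are assumptions; the issue only says the JSON message "contains the sql query" without naming the field.

```python
import json

def extract_sql(content_type: str, body: bytes) -> str:
    """Return the SQL text from an HTTP POST body, based on its Content-Type."""
    # Ignore parameters such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/sql":
        # The body is the query text itself.
        return body.decode("utf-8")
    if media_type == "application/json":
        message = json.loads(body.decode("utf-8"))
        return message["query"]  # assumed field name
    raise ValueError(f"unsupported content type: {content_type!r}")
```

For example, `extract_sql("application/sql", b"SELECT 1;")` and `extract_sql("application/json", b'{"query": "SELECT 1;"}')` both yield the same query string.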

pipelined aggregation cleanups

  • move pipelined aggregation code/classes to server/sql/pipelined_aggregate.{h,c}
  • add proper PipelinedAggregationNode (vs. GroupByNode)

include in-memory cstables in lsm table scans

  • skipped column
  • use bitpacked for rlevel/dlevel column types
  • fix nextrepetitionlevel
  • proper page sizes
  • double, bitpacked columns
  • switch boolean default encoding back to bitpacked in tableschema

TableCursor::nextEncoded fastpath

add a fast path to tablecursor/binarycode that allows us to pass binary data from one table expression to an upstream expression without going through the SValue encoding/decoding code.

  • add an interface that allows the provider of a table cursor to return the next row as an encoded buffer instead of an svalue instance
  • add another interface that allows the table cursor consumer to request the next rows as an encoded buffer instead of an svalue reference
  • ensure we don't do any unnecessary coding in between
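
The idea can be illustrated with a small Python sketch. None of these names come from the EventQL code base: `EncodedSource`, `next_encoded`, `next_row` and `forward_upstream` are hypothetical, and JSON bytes stand in for the real binary row encoding.

```python
import json

class EncodedSource:
    """Table cursor provider that already holds its rows in encoded form."""

    def __init__(self, encoded_rows):
        self._rows = iter(encoded_rows)
        self.decode_calls = 0  # counts how often the slow path ran

    def next_encoded(self):
        """Fast path: return the next row's buffer untouched, or None at the end."""
        return next(self._rows, None)

    def next_row(self):
        """Slow path: decode the buffer into individual values."""
        buf = self.next_encoded()
        if buf is None:
            return None
        self.decode_calls += 1
        return json.loads(buf)

def forward_upstream(cursor, sink):
    """A consumer that only passes rows along asks for encoded buffers."""
    while (buf := cursor.next_encoded()) is not None:
        sink.append(buf)
```

A consumer that needs to inspect values still calls `next_row`; one that merely forwards rows upstream never triggers a decode, which is the unnecessary coding step the last bullet wants to avoid.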

implement batch mode (-B) in evql

  • in batch mode, we don't buffer the query results in the evql client but output rows as they are returned from the database
  • in batch mode, we don't print any progress information
  • should behave similar to mysql -B, returns the rows in a CSV like format
  • TBD: how to handle escaping/encoding and exact csv format
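
As a rough sketch of the streaming behaviour, in Python: the function name `write_batch`, the tab separator and the `NULL` marker are assumptions modelled on `mysql -B`. Since escaping is still TBD above, this sketch deliberately does no escaping at all.

```python
import io

def write_batch(rows, out, sep="\t"):
    """Write each row as soon as it arrives: no result buffering, no progress output."""
    count = 0
    for row in rows:
        out.write(sep.join("NULL" if v is None else str(v) for v in row))
        out.write("\n")
        count += 1
    return count

# Usage: rows may come from any iterator, e.g. one that reads from the server.
out = io.StringIO()
written = write_batch(iter([(1, "a"), (2, None)]), out)
```

Because `rows` is consumed lazily, memory use stays constant regardless of result size, which is the main difference from the interactive mode that buffers results before printing.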

finalize initial documentation

  • proper page titles (<title>) [also on index]
  • document create database in cluster setup
  • check&fix all links
  • ensure all emails live (hello,authors,sales@), all forms work
  • provide download links
  • cloud.eventql.io on new domain
  • docu analy.
  • favicon

implement table deletion

depends on: implement unique table ids (#13)

  • add a field deleted to the tableconfig
  • add an api call that sets the field to true (DeleteTable)
  • ignore deleted tables in scans/show tables/etc
  • implement re-create after delete (we need to overwrite the deleted table with a new table config but keep the version)

open questions: how/when is the actual data removed from disk?
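
The bullet points above can be sketched as a tombstone flag on the table config. This Python model is an illustration under assumptions: `TableRegistry` and its methods are hypothetical, and "keep the version" is read here as continuing the old config's version sequence rather than restarting at 1. Removing the data from disk is left out, since that is the open question.

```python
from dataclasses import dataclass

@dataclass
class TableConfig:
    # Hypothetical stand-in for the real table config.
    table_name: str
    version: int = 1
    deleted: bool = False

class TableRegistry:
    def __init__(self):
        self._tables = {}

    def create_table(self, name):
        old = self._tables.get(name)
        if old is not None and not old.deleted:
            raise ValueError(f"table exists: {name}")
        # Re-create after delete: overwrite the tombstoned config but
        # continue its version sequence (one reading of "keep the version").
        version = old.version + 1 if old is not None else 1
        self._tables[name] = TableConfig(name, version=version)
        return self._tables[name]

    def delete_table(self, name):
        cfg = self._tables.get(name)
        if cfg is None or cfg.deleted:
            raise KeyError(name)
        cfg.deleted = True  # tombstone only; the config entry stays
        cfg.version += 1

    def show_tables(self):
        # Deleted tables are ignored in listings (and, by extension, in scans).
        return sorted(n for n, c in self._tables.items() if not c.deleted)
```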
