The "Learn Druid" repository contains many resources to help you learn and apply Apache Druid.
It contains:
- Jupyter Notebooks that guide you through query, ingestion, and data management with Apache Druid.
- A Docker Compose file to get you up and running with a learning lab.
To use the "Learn Druid" Docker Compose environment, you need:

- Git or GitHub Desktop.
- Docker Desktop with Docker Compose.
- A machine with at least 6 GiB of RAM.

More resources are better: the notebooks have been tested with 6 CPUs, 8 GB of RAM, and 1 GB of swap available to Docker.
To get started quickly:

- Clone this repository locally, if you have not already done so:

      git clone https://github.com/implydata/learn-druid

- Navigate to the directory:

      cd learn-druid

  To refresh your local copy with the latest notebooks:

      git restore .
      git pull
- Launch the "Learn Druid" Docker environment:

      docker compose --profile druid-jupyter up -d

  The first time you launch the environment, it can take a while to start all the services.
- Navigate to Jupyter Lab in your browser. From there, you can read the introduction or use Jupyter Lab to browse the notebooks folder.
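The notebooks walk you through queries step by step. As a minimal sketch of the kind of call they make, the snippet below builds a request for Druid's SQL API (`/druid/v2/sql`). The router address `localhost:8888` is an assumption for a typical local setup; adjust it to wherever your Druid router is exposed.

```python
import json
from urllib.request import Request, urlopen

# Assumed address for the Druid router; adjust to match your environment.
DRUID_SQL_ENDPOINT = "http://localhost:8888/druid/v2/sql"

def build_sql_request(query: str) -> Request:
    """Build a POST request for Druid's SQL API."""
    body = json.dumps({"query": query}).encode("utf-8")
    return Request(
        DRUID_SQL_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
    )

# List the tables Druid knows about via the INFORMATION_SCHEMA.
req = build_sql_request("SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES")

# With the druid-jupyter environment running, you could execute it:
# with urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```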
The "Learn Druid" Docker Compose file includes the following services:

- Jupyter Lab: an interactive environment to run Jupyter Notebooks. The Jupyter image used in the environment contains Python along with all the supporting libraries you need to run the notebooks. Jupyter Lab is exposed at:
- Apache Kafka: a streaming service used as a data source for Druid.
- Imply Data Generator: a tool to generate sample data for Druid. It can produce either batch or streaming data.
- Apache Druid: the currently released version of Apache Druid by default. You can use the web console to monitor ingestion tasks, compare query results, and more. To learn about the Druid web console, see Web console. The web console is exposed at:
You can use the following Docker Compose profiles to start various combinations of the components based upon your specific needs.
Individual notebooks may prescribe a specific profile that you need to use.
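Profiles work by tagging each service in the Compose file with the profile names it belongs to; `docker compose --profile X up` starts only the services tagged with X. The fragment below is a hypothetical sketch of the mechanism, not the repository's actual service definitions, and the image names are placeholders:

```yaml
# Hypothetical sketch of how Compose profiles gate services.
services:
  jupyter:
    image: example/jupyter        # placeholder image name
    profiles: ["jupyter", "druid-jupyter", "all-services"]
  druid:
    image: example/druid          # placeholder image name
    profiles: ["druid-jupyter", "all-services"]
  kafka:
    image: example/kafka          # placeholder image name
    profiles: ["all-services"]
```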
Use the jupyter profile when you want to run the notebooks against an existing Apache Druid database. Set the DRUID_HOST environment variable to the Apache Druid host address.
To start Jupyter only:
DRUID_HOST=[host address] docker compose --profile jupyter up -d
For example, if Druid is running on the local machine:
DRUID_HOST=host.docker.internal docker compose --profile jupyter up -d
To stop Jupyter:
docker compose --profile jupyter down
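Before running the notebooks against an external Druid, it can help to confirm that the host you pass via DRUID_HOST is reachable. This sketch reads the variable the same way the compose example sets it and builds a URL for Druid's /status/health endpoint; the router port 8888 is an assumption, so change it if your deployment differs.

```python
import os

# DRUID_HOST as set in the compose example; the default mirrors the
# local-machine case from the example above.
host = os.environ.get("DRUID_HOST", "host.docker.internal")

# Assumed router port; adjust for your deployment.
health_url = f"http://{host}:8888/status/health"
print(health_url)

# With Druid running, /status/health returns true when the service is healthy:
# from urllib.request import urlopen
# print(urlopen(health_url).read().decode("utf-8"))
```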
Use the druid-jupyter profile when you only need to query data and do batch ingestion.
To start Jupyter and Druid:
docker compose --profile druid-jupyter up -d
To stop Jupyter and Druid:
docker compose --profile druid-jupyter down
To start all services, including Kafka and the data generator:
docker compose --profile all-services up -d
To stop all services:
docker compose --profile all-services down
For feedback and help, start a discussion on the Discussions board or reach out in the docs and training channel in the Apache Druid Slack.