josesaribeiro / docker-airflow

This project forked from wmorin/docker-airflow-1

Docker Apache Airflow

License: Apache License 2.0

docker-airflow

This repository contains the Dockerfile for apache-airflow, used for Docker's automated build and published to the public Docker Hub Registry.

Information

Installation

Pull the image from the Docker repository.

docker pull puckel/docker-airflow

Build

Optionally, install extra Airflow packages and/or Python dependencies at build time:

docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" -t puckel/docker-airflow .
docker build --rm --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t puckel/docker-airflow .

Or combine both:

docker build --rm --build-arg AIRFLOW_DEPS="datadog,dask" --build-arg PYTHON_DEPS="flask_oauthlib>=0.9" -t puckel/docker-airflow .

Don't forget to update the airflow images in the docker-compose files to puckel/docker-airflow:latest.
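
For scripted builds, the two build arguments can be assembled programmatically. The following is a minimal Python sketch, not part of this repository: the helper name `build_command` is hypothetical, and it only reproduces the `docker build` flags shown above.

```python
import shlex


def build_command(airflow_deps="", python_deps="", tag="puckel/docker-airflow"):
    """Return the `docker build` argv for the given optional dependencies.

    Hypothetical helper: both arguments are passed through unchanged as the
    AIRFLOW_DEPS and PYTHON_DEPS build arguments.
    """
    cmd = ["docker", "build", "--rm"]
    if airflow_deps:
        # Airflow extras, e.g. "datadog,dask"
        cmd += ["--build-arg", f"AIRFLOW_DEPS={airflow_deps}"]
    if python_deps:
        # Extra Python requirement, e.g. "flask_oauthlib>=0.9"
        cmd += ["--build-arg", f"PYTHON_DEPS={python_deps}"]
    cmd += ["-t", tag, "."]
    return cmd


if __name__ == "__main__":
    argv = build_command("datadog,dask", "flask_oauthlib>=0.9")
    # Print a shell-quoted command line; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

Building the command as a list rather than a string keeps values such as `flask_oauthlib>=0.9` from being interpreted by the shell as a redirection.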

Usage

By default, docker-airflow runs Airflow with SequentialExecutor:

docker run -d -p 8080:8080 puckel/docker-airflow webserver

If you want to run another executor, use the other docker-compose.yml files provided in this repository.

For LocalExecutor:

docker-compose -f docker-compose-LocalExecutor.yml up -d

For CeleryExecutor:

docker-compose -f docker-compose-CeleryExecutor.yml up -d

NB: Example DAGs are not loaded by default (LOAD_EX=n). To load them, set the following environment variable:

LOAD_EX=y

docker run -d -p 8080:8080 -e LOAD_EX=y puckel/docker-airflow

To use Ad hoc query, make sure you have configured the connection: go to Admin -> Connections, edit "postgres_default", and set these values (equivalent to the values in airflow.cfg/docker-compose*.yml):

  • Host : postgres
  • Schema : airflow
  • Login : airflow
  • Password : airflow

For encrypted connection passwords (with the Local or Celery executor), all containers must use the same fernet_key. By default, docker-airflow generates the fernet_key at startup, so you have to set an environment variable in the docker-compose file (e.g. docker-compose-LocalExecutor.yml) to share the same key across containers. To generate a fernet_key:

docker run puckel/docker-airflow python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)"
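
If the `cryptography` package is not at hand, a key in the same format can be produced with the Python standard library alone. This is a sketch that assumes the documented Fernet key format (32 random bytes, URL-safe base64-encoded); the function name `generate_fernet_key` is hypothetical.

```python
import base64
import os


def generate_fernet_key() -> str:
    """Return 32 random bytes, URL-safe base64-encoded, as text."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode()


if __name__ == "__main__":
    # Use the printed value as the shared key for every Airflow container.
    print(generate_fernet_key())
```

Generate the key once and reuse it; a fresh key per container would leave each container unable to decrypt passwords stored by the others.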

Configuring Airflow

Any Airflow configuration value can be set from an environment variable, which takes precedence over the value in airflow.cfg.

As a general rule, the environment variable is named AIRFLOW__<section>__<key>. For example, AIRFLOW__CORE__SQL_ALCHEMY_CONN sets the sql_alchemy_conn config option in the [core] section.
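
The naming rule can be sketched as a small Python helper. The function name `airflow_env_var` is hypothetical and only illustrates the pattern: upper-cased section and key joined by double underscores.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name that overrides `key` in `[section]`."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


if __name__ == "__main__":
    # prints AIRFLOW__CORE__SQL_ALCHEMY_CONN
    print(airflow_env_var("core", "sql_alchemy_conn"))
```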

Check out the Airflow documentation for more details.

You can also define connections via environment variables by prefixing them with AIRFLOW_CONN_. For example, AIRFLOW_CONN_POSTGRES_MASTER=postgres://user:password@localhost:5432/master defines a connection called "postgres_master". The value is parsed as a URI. This works for hooks etc., but the connection won't show up in the "Ad-hoc Query" section unless an (empty) connection is also created in the DB.
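
To see how such a URI breaks down into connection fields, here is a rough illustration using Python's standard `urllib.parse`. It is not Airflow's own parser; the function name `describe_connection` and the returned field names are assumptions chosen to mirror the Admin -> Connections form.

```python
from urllib.parse import urlsplit

PREFIX = "AIRFLOW_CONN_"


def describe_connection(env_name: str, uri: str) -> dict:
    """Split an AIRFLOW_CONN_* variable into connection-style fields."""
    if not env_name.startswith(PREFIX):
        raise ValueError(f"{env_name!r} does not start with {PREFIX}")
    parts = urlsplit(uri)
    return {
        # The connection id is the lower-cased remainder of the variable name.
        "conn_id": env_name[len(PREFIX):].lower(),
        "conn_type": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "login": parts.username,
        "password": parts.password,
        # The URI path (without the leading slash) plays the role of Schema.
        "schema": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    fields = describe_connection(
        "AIRFLOW_CONN_POSTGRES_MASTER",
        "postgres://user:password@localhost:5432/master",
    )
    for name, value in fields.items():
        print(f"{name}: {value}")
```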

Custom Airflow plugins

Airflow allows custom user-created plugins, which are typically found in the ${AIRFLOW_HOME}/plugins folder. See the Airflow documentation on plugins for details.

To incorporate plugins into your Docker container:

  • Create a plugins/ folder containing your custom plugins.
  • Mount the folder as a volume by doing either of the following:
    • Include the folder as a volume on the command line: -v $(pwd)/plugins/:/usr/local/airflow/plugins
    • Use docker-compose-LocalExecutor.yml or docker-compose-CeleryExecutor.yml, which support adding the plugins folder as a volume

Install custom Python packages

  • Create a file "requirements.txt" listing the desired Python modules
  • Mount this file as a volume: -v $(pwd)/requirements.txt:/requirements.txt (or add it as a volume in the docker-compose file)
  • The entrypoint.sh script executes the pip install command (with the --user option)
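
The plugins folder and requirements.txt mounts described above can be combined in a single `docker run` invocation. A minimal Python sketch follows; the helper name `run_command` is hypothetical, and the container-side paths are the ones given in this README.

```python
import os
import shlex


def run_command(plugins_dir=None, requirements_file=None,
                image="puckel/docker-airflow"):
    """Return a `docker run` argv with the optional volume mounts added."""
    cmd = ["docker", "run", "-d", "-p", "8080:8080"]
    if plugins_dir:
        # Bind mounts need an absolute host path.
        host = os.path.abspath(plugins_dir)
        cmd += ["-v", f"{host}:/usr/local/airflow/plugins"]
    if requirements_file:
        host = os.path.abspath(requirements_file)
        cmd += ["-v", f"{host}:/requirements.txt"]
    cmd += [image, "webserver"]
    return cmd


if __name__ == "__main__":
    argv = run_command("plugins", "requirements.txt")
    # Print the command line; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```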

UI Links

Scale the number of workers

Easy scaling using docker-compose:

docker-compose -f docker-compose-CeleryExecutor.yml scale worker=5

This can be used to scale to a multi-node setup using Docker Swarm.

Running other airflow commands

If you want to run other airflow sub-commands, such as list_dags or clear, you can do so like this:

docker run --rm -ti puckel/docker-airflow airflow list_dags

or with your docker-compose setup like this:

docker-compose -f docker-compose-CeleryExecutor.yml run --rm webserver airflow list_dags

You can also use this to run a bash shell or any other command in the same environment that airflow would be run in:

docker run --rm -ti puckel/docker-airflow bash
docker run --rm -ti puckel/docker-airflow ipython

Wanna help?

Fork, improve and PR. ;-)
