apache / airflow-on-k8s-operator

Airflow on Kubernetes Operator

Home Page: https://airflow.apache.org/

License: Apache License 2.0

Dockerfile 1.26% Makefile 3.29% Go 95.45%

airflow-on-k8s-operator's Introduction

Airflow On K8S Operator

Community

Project Status

Alpha

The Airflow Operator is still under active development and has not been extensively tested in production environments. Backward compatibility of the APIs is not guaranteed for alpha releases.

Prerequisites

  • Kubernetes version 1.9 or later.
  • Airflow 1.9 (1.10.1+ for the k8s executor).
  • Redis 4.0.x (for celery operator).
  • MySQL 5.7.

Get Started

One Click Deployment from Google Cloud Marketplace to your GKE cluster

Get started quickly with the Airflow Operator using the Quick Start Guide

For more information check the Design and detailed User Guide

Airflow Operator Overview

Airflow Operator is a custom Kubernetes operator that makes it easy to deploy and manage Apache Airflow on Kubernetes. Apache Airflow is a platform to programmatically author, schedule and monitor workflows. Using the Airflow Operator, an Airflow cluster is split into 2 parts represented by the AirflowBase and AirflowCluster custom resources. The Airflow Operator performs these jobs:

  • Creates and manages the necessary Kubernetes resources for an Airflow deployment.
  • Updates the corresponding Kubernetes resources when the AirflowBase or AirflowCluster specification changes.
  • Restores managed Kubernetes resources that are deleted.
  • Supports creation of Airflow schedulers with different Executors.
  • Supports sharing of the AirflowBase across multiple AirflowClusters.
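
The jobs listed above amount to a reconcile loop: compare the resources that the spec expects with the ones observed in the cluster, then create what is missing and update what has drifted. The following is a minimal sketch of that comparison, assuming simplified hypothetical types (`Resource`, `plan`); it is not the operator's actual reconciler code.

```go
package main

import (
	"fmt"
	"sort"
)

// Resource is a hypothetical, simplified stand-in for a managed Kubernetes object.
type Resource struct {
	Name string
	Spec string // serialized desired state, used only for comparison
}

// plan compares expected resources against observed ones and returns the
// names to create (missing or deleted) and the names to update (spec drifted).
func plan(expected, observed []Resource) (create, update []string) {
	seen := make(map[string]string, len(observed))
	for _, o := range observed {
		seen[o.Name] = o.Spec
	}
	for _, e := range expected {
		spec, ok := seen[e.Name]
		switch {
		case !ok:
			create = append(create, e.Name)
		case spec != e.Spec:
			update = append(update, e.Name)
		}
	}
	sort.Strings(create)
	sort.Strings(update)
	return create, update
}

func main() {
	// Hypothetical resource names, for illustration only.
	expected := []Resource{{"mysql", "v2"}, {"scheduler", "v1"}, {"ui", "v1"}}
	observed := []Resource{{"mysql", "v1"}, {"ui", "v1"}}
	create, update := plan(expected, observed)
	fmt.Println("create:", create, "update:", update)
}
```

A resource that was deleted by hand simply shows up as missing on the next pass, which is how "restore" falls out of the same comparison.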

Check out the Design

Airflow Cluster

Development

Refer to the Design and Development Guide.

History

This repo has been donated to the Apache Software Foundation. It was originally developed in the GoogleCloud repo.

airflow-on-k8s-operator's People

Contributors

aijamalnk, barney-s, kaxil, turbaszek

airflow-on-k8s-operator's Issues

Test and E2E test are failing

I was trying to run make docker-build, but because the tests are failing, the Docker build fails too.

Since the tests are not working, I am not able to build the Docker container and test the operator.

make e2e-test is failing

make e2e-test

Your active configuration is: [rahul]
kubectl get namespace airflowop-system || kubectl create namespace airflowop-system
Error from server (NotFound): namespaces "airflowop-system" not found
namespace/airflowop-system created
go test -v -timeout 20m test/e2e/base/base_test.go --namespace airflowop-system
=== RUN   Test
Running Suite: AirflowBase Suite
================================
Random Seed: 1583520410
Will run 2 of 2 specs

STEP: creating a new AirflowBase: 
• Failure [0.289 seconds]
AirflowBase controller tests
/Users/rahul/airflow-on-k8s-operator/test/e2e/base/base_test.go:70
  creating a AirflowBase with mysql [It]
  /Users/rahul/airflow-on-k8s-operator/test/e2e/base/base_test.go:76

  failed to create CR : AirflowBase.airflow.apache.org "" is invalid: metadata.name: Required value: name or generateName is required
  Unexpected error:
      <*errors.StatusError | 0xc00023e1e0>: {
          ErrStatus: {
              TypeMeta: {Kind: "", APIVersion: ""},
              ListMeta: {
                  SelfLink: "",
                  ResourceVersion: "",
                  Continue: "",
                  RemainingItemCount: nil,
              },
              Status: "Failure",
              Message: "AirflowBase.airflow.apache.org \"\" is invalid: metadata.name: Required value: name or generateName is required",
              Reason: "Invalid",
              Details: {
                  Name: "",
                  Group: "airflow.apache.org",
                  Kind: "AirflowBase",
                  UID: "",
                  Causes: [
                      {
                          Type: "FieldValueRequired",
                          Message: "Required value: name or generateName is required",
                          Field: "metadata.name",
                      },
                  ],
                  RetryAfterSeconds: 0,
              },
              Code: 422,
          },
      }
      AirflowBase.airflow.apache.org "" is invalid: metadata.name: Required value: name or generateName is required
  occurred

  /Users/rahul/airflow-on-k8s-operator/vendor/sigs.k8s.io/controller-reconciler/pkg/test/framework.go:178
------------------------------
STEP: creating a new AirflowBase: 
• Failure [0.094 seconds]
AirflowBase controller tests
/Users/rahul/airflow-on-k8s-operator/test/e2e/base/base_test.go:70
  creating a AirflowBase with postgres [It]
  /Users/rahul/airflow-on-k8s-operator/test/e2e/base/base_test.go:88

  failed to create CR : AirflowBase.airflow.apache.org "" is invalid: metadata.name: Required value: name or generateName is required
  Unexpected error:
      <*errors.StatusError | 0xc00023ebe0>: {
          ErrStatus: {
              TypeMeta: {Kind: "", APIVersion: ""},
              ListMeta: {
                  SelfLink: "",
                  ResourceVersion: "",
                  Continue: "",
                  RemainingItemCount: nil,
              },
              Status: "Failure",
              Message: "AirflowBase.airflow.apache.org \"\" is invalid: metadata.name: Required value: name or generateName is required",
              Reason: "Invalid",
              Details: {
                  Name: "",
                  Group: "airflow.apache.org",
                  Kind: "AirflowBase",
                  UID: "",
                  Causes: [
                      {
                          Type: "FieldValueRequired",
                          Message: "Required value: name or generateName is required",
                          Field: "metadata.name",
                      },
                  ],
                  RetryAfterSeconds: 0,
              },
              Code: 422,
          },
      }
      AirflowBase.airflow.apache.org "" is invalid: metadata.name: Required value: name or generateName is required
  occurred

  /Users/rahul/airflow-on-k8s-operator/vendor/sigs.k8s.io/controller-reconciler/pkg/test/framework.go:178
------------------------------
Failure [0.000 seconds]
[AfterSuite] AfterSuite 
/Users/rahul/airflow-on-k8s-operator/test/e2e/base/base_test.go:56

  failed to delete CR : resource name may not be empty
  Unexpected error:
      <*errors.errorString | 0xc000261260>: {
          s: "resource name may not be empty",
      }
      resource name may not be empty
  occurred

  /Users/rahul/airflow-on-k8s-operator/vendor/sigs.k8s.io/controller-reconciler/pkg/test/framework.go:132
------------------------------


Summarizing 2 Failures:

[Fail] AirflowBase controller tests [It] creating a AirflowBase with mysql 
/Users/rahul/airflow-on-k8s-operator/vendor/sigs.k8s.io/controller-reconciler/pkg/test/framework.go:178

[Fail] AirflowBase controller tests [It] creating a AirflowBase with postgres 
/Users/rahul/airflow-on-k8s-operator/vendor/sigs.k8s.io/controller-reconciler/pkg/test/framework.go:178

Ran 2 of 2 Specs in 4.759 seconds
FAIL! -- 0 Passed | 2 Failed | 0 Pending | 0 Skipped
--- FAIL: Test (4.76s)
FAIL
FAIL    command-line-arguments  5.610s
FAIL
make: *** [e2e-test] Error 1
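
Both failures in the log above come from the same API server validation: the AirflowBase object reached the server with an empty metadata.name and no generateName. The log does not show why the name was empty, so the following is only a sketch of a pre-flight check a test could run before submitting the object. `ObjectMeta` here is a hypothetical trimmed-down struct, not the real Kubernetes type, and "mc-base" is a made-up name.

```go
package main

import (
	"errors"
	"fmt"
)

// ObjectMeta is a hypothetical, trimmed-down copy of the two metadata fields
// that the validation message refers to.
type ObjectMeta struct {
	Name         string
	GenerateName string
}

// validateName mirrors the rule quoted in the error: one of name or
// generateName must be set before the object is submitted.
func validateName(m ObjectMeta) error {
	if m.Name == "" && m.GenerateName == "" {
		return errors.New("metadata.name: Required value: name or generateName is required")
	}
	return nil
}

func main() {
	fmt.Println(validateName(ObjectMeta{}))                // reports the missing name
	fmt.Println(validateName(ObjectMeta{Name: "mc-base"})) // passes
}
```

Failing early like this would at least point at the loading step rather than at the create call.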

Ignoring the tests, I tried to build the container image and got the following error:

Sending build context to Docker daemon  82.96MB
Step 1/14 : FROM golang:1.13 as builder
1.13: Pulling from library/golang
50e431f79093: Pull complete 
dd8c6d374ea5: Pull complete 
c85513200d84: Pull complete 
55769680e827: Pull complete 
15357f5e50c4: Pull complete 
e2d9b328fba5: Pull complete 
f8e0159fc852: Pull complete 
Digest: sha256:43f859b58af8c84c8aef288809204cfbd7cb88dbd4b0cf473dd4fb86693403ad
Status: Downloaded newer image for golang:1.13
 ---> 3a7408f53f79
Step 2/14 : WORKDIR /workspace
 ---> Running in 0cef2decd14f
Removing intermediate container 0cef2decd14f
 ---> 0db260c9e127
Step 3/14 : COPY go.mod go.mod
 ---> 326f43338f87
Step 4/14 : COPY go.sum go.sum
 ---> b5ddeaf773e0
Step 5/14 : RUN go mod download
 ---> Running in 6b1a0d9e30b0
go: sigs.k8s.io/[email protected]: parsing vendor/sigs.k8s.io/controller-reconciler/go.mod: open /workspace/vendor/sigs.k8s.io/controller-reconciler/go.mod: no such file or directory
The command '/bin/sh -c go mod download' returned a non-zero code: 1
make: *** [docker-build] Error 1
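
The go mod download error above suggests that go.mod resolves sigs.k8s.io/controller-reconciler from a local path under vendor/, and that this directory has not been copied into the image when step 5 runs. Assuming that is the cause (the go.mod itself is not shown here), a small helper like the one below can list replace directives that point at local paths, so the Dockerfile can copy them before downloading modules. The sample go.mod content in `main` is an assumption made for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// localReplaces returns the targets of replace directives in go.mod content
// that point at relative filesystem paths ("./" or "../") rather than at
// module versions. It handles both single-line and block-form directives.
func localReplaces(gomod string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(gomod))
	for sc.Scan() {
		line := strings.TrimPrefix(strings.TrimSpace(sc.Text()), "replace ")
		parts := strings.Split(line, "=>")
		if len(parts) != 2 {
			continue
		}
		fields := strings.Fields(parts[1])
		if len(fields) == 0 {
			continue
		}
		target := fields[0]
		if strings.HasPrefix(target, "./") || strings.HasPrefix(target, "../") {
			out = append(out, target)
		}
	}
	return out
}

func main() {
	// Assumed go.mod content; the real file may differ.
	gomod := `module example.com/demo

replace sigs.k8s.io/controller-reconciler => ./vendor/sigs.k8s.io/controller-reconciler
`
	fmt.Println(localReplaces(gomod))
}
```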

make deploy doesn't work

➜ make deploy
(unset)
/Users/tomaszurbaszek/go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
cd config/manager && kustomize edit set image controller=gcr.io/polidea-airflow/airflow-operator:a3c3869
panic: runtime error: index out of range [-1]

goroutine 1 [running]:
sigs.k8s.io/kustomize/kustomize/v3/internal/commands/kustfile.(*kustomizationFile).parseCommentedFields(0xc0004b6ac0, 0xc00013c000, 0x32d, 0x380, 0x0, 0x0)
	/private/tmp/kustomize-20200117-60956-18qh1a4/kustomize/internal/commands/kustfile/kustomizationfile.go:200 +0x7b9
sigs.k8s.io/kustomize/kustomize/v3/internal/commands/kustfile.(*kustomizationFile).Read(0xc0004b6ac0, 0x6278b80, 0xc0004b6ac0, 0x0)
	/private/tmp/kustomize-20200117-60956-18qh1a4/kustomize/internal/commands/kustfile/kustomizationfile.go:158 +0x2b0
sigs.k8s.io/kustomize/kustomize/v3/internal/commands/edit/set.(*setImageOptions).RunSetImage(0xc0004ba518, 0x55ef820, 0x6278b80, 0x1, 0x0)
	/private/tmp/kustomize-20200117-60956-18qh1a4/kustomize/internal/commands/edit/set/setimage.go:119 +0x76
sigs.k8s.io/kustomize/kustomize/v3/internal/commands/edit/set.newCmdSetImage.func1(0xc0004daa00, 0xc0003591e0, 0x1, 0x1, 0x0, 0x0)
	/private/tmp/kustomize-20200117-60956-18qh1a4/kustomize/internal/commands/edit/set/setimage.go:82 +0xab
github.com/spf13/cobra.(*Command).execute(0xc0004daa00, 0xc0003591b0, 0x1, 0x1, 0xc0004daa00, 0xc0003591b0)
	/private/tmp/kustomize-20200117-60956-18qh1a4/.brew_home/go/pkg/mod/github.com/spf13/[email protected]/command.go:826 +0x460
github.com/spf13/cobra.(*Command).ExecuteC(0xc000383b80, 0x9, 0x55d2100, 0xc0003e9460)
	/private/tmp/kustomize-20200117-60956-18qh1a4/.brew_home/go/pkg/mod/github.com/spf13/[email protected]/command.go:914 +0x2fb
github.com/spf13/cobra.(*Command).Execute(...)
	/private/tmp/kustomize-20200117-60956-18qh1a4/.brew_home/go/pkg/mod/github.com/spf13/[email protected]/command.go:864
main.main()
	/private/tmp/kustomize-20200117-60956-18qh1a4/kustomize/main.go:23 +0x71
make: *** [deploy] Error 2

PDB prevents node draining

The PDBs have minAvailable set to 100%. As per the docs, this means:

you are requiring zero voluntary evictions. When you set zero voluntary evictions for a workload object such as ReplicaSet, then you cannot successfully drain a Node running one of those Pods. If you try to drain a Node where an unevictable Pod is running, the drain never completes.

I don't think this is desirable!
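
To make the arithmetic concrete, here is a minimal sketch of how a percentage minAvailable translates into allowed voluntary evictions. It assumes the percentage is rounded up to a whole number of pods; the function name and parameters are hypothetical and this is not the Kubernetes disruption controller's code.

```go
package main

import "fmt"

// allowedDisruptions estimates how many pods may be voluntarily evicted under
// a PodDisruptionBudget whose minAvailable is a percentage. The percentage is
// rounded up to a whole number of pods (an assumption of this sketch).
func allowedDisruptions(expectedPods, healthyPods, minAvailablePercent int) int {
	desiredHealthy := (expectedPods*minAvailablePercent + 99) / 100 // ceiling division
	allowed := healthyPods - desiredHealthy
	if allowed < 0 {
		return 0
	}
	return allowed
}

func main() {
	fmt.Println(allowedDisruptions(3, 3, 100)) // nothing may be evicted, so a drain blocks
	fmt.Println(allowedDisruptions(3, 3, 50))  // one pod may be evicted at a time
}
```

With minAvailable at 100% the result is zero for any replica count, which is why a node drain never completes.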

gcr.io image ImagePullBackOff

Hi,

While following the quickstart (or development.md) and deploying the CR, there's an ImagePullBackOff error (Back-off pulling image "gcr.io/airflow-operator/airflow:1.10.2") on an AWS KOPS cluster, with the operator running locally & also on the cluster.

Just FYI there were a couple of other errors too prior to the above, such as:
(1)
RBAC related (Required value: resource rules must supply at least one api group) due to apiGroup not specified under Rules for Role and ClusterRole in the YAMLs.
This was fixed by including the apiGroups under "rules" for these.
(2)
Object 'Kind' is missing (for all the YAMLs under template folder due to "---" at the start).
This was fixed by deleting the line with "---".

With above fixes it was able to proceed further and "make run" (operator running locally for instance) is fine and the mysql-celery base CR is also created successfully.
But while trying to create the mysql-celery cluster CR, the pods fail with ImagePullBackOff.

Also just a simple "docker pull gcr.io/airflow-operator/airflow:1.10.2" too results in:
Error response from daemon: Get https://gcr.io/v2/airflow-operator/airflow/manifests/1.10.2: unknown: Project 'project:airflow-operator' not found or deleted.

Trying the latest version (i.e. without specifying 1.10.2) fails too.
Has the image path (or version) changed?

Thanks.
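
On point (2) above: a leading "---" yields an empty first document if the stream is split naively on separators, and an empty document has no Kind. How the operator actually parses its templates is not shown in this issue, so the following is only a sketch of a tolerant splitter that drops empty documents instead of requiring the separator to be deleted from the files.

```go
package main

import (
	"fmt"
	"strings"
)

// splitYAMLDocs splits a multi-document YAML stream on "---" separator lines
// and drops documents that contain only whitespace, such as the empty one
// produced by a leading "---".
func splitYAMLDocs(stream string) []string {
	var docs []string
	var cur []string
	flush := func() {
		doc := strings.TrimSpace(strings.Join(cur, "\n"))
		if doc != "" {
			docs = append(docs, doc)
		}
		cur = nil
	}
	for _, line := range strings.Split(stream, "\n") {
		if strings.TrimRight(line, " \t\r") == "---" {
			flush()
			continue
		}
		cur = append(cur, line)
	}
	flush()
	return docs
}

func main() {
	stream := "---\nkind: Role\n---\nkind: Secret\n"
	for i, d := range splitYAMLDocs(stream) {
		fmt.Printf("doc %d: %q\n", i, d)
	}
}
```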

Potential import collision: import path should be "sigs.k8s.io/application", not "github.com/kubernetes-sigs/application".

Background

The application has already renamed its import path from "github.com/kubernetes-sigs/application" to "sigs.k8s.io/application" in version v0.8.1.

But apache/airflow-on-k8s-operator still used the old path:
https://github.com/apache/airflow-on-k8s-operator/blob/master/go.mod#L8

github.com/kubernetes-sigs/application v0.8.1 

If you import the application through the old path "github.com/kubernetes-sigs/application", it is very easy to pull the same code in a second time under the new path, because the application's own Go source files use import statements under "sigs.k8s.io/application".
https://github.com/kubernetes-sigs/application/blob/v0.8.1/controllers/application_controller.go#L24

package controllers
import (
	…
	appv1beta1 "sigs.k8s.io/application/api/v1beta1"
)
…

"sigs.k8s.io/application" and "github.com/kubernetes-sigs/application" are the same repo, but Go treats the two import paths as distinct packages that work in isolation from each other, which brings potential risks and problems.

So why not get rid of the old import path "github.com/kubernetes-sigs/application" and use "sigs.k8s.io/application" instead?

Solution

Replace all the old import paths, change "github.com/kubernetes-sigs/application" to "sigs.k8s.io/application".
Where did you import it: https://github.com/apache/airflow-on-k8s-operator/search?q=kubernetes-sigs%2Fapplication&unscoped_q=kubernetes-sigs%2Fapplication
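
As a rough aid for the replacement, the sketch below uses the standard go/parser package to list imports in a source file that still start with the old path. The function name `oldImports` and the sample source in `main` are hypothetical; in particular, the first import path in the sample is made up for illustration and is not taken from this repository.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

const oldPath = "github.com/kubernetes-sigs/application"

// oldImports parses only the import declarations of a Go source file and
// returns every import path that still starts with the old module path.
func oldImports(src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "src.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var found []string
	for _, imp := range f.Imports {
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		if path == oldPath || strings.HasPrefix(path, oldPath+"/") {
			found = append(found, path)
		}
	}
	return found, nil
}

func main() {
	// Hypothetical source file mixing the old and the new import path.
	src := `package controllers

import (
	app "github.com/kubernetes-sigs/application/pkg/apis/app/v1beta1"
	appv1beta1 "sigs.k8s.io/application/api/v1beta1"
)
`
	found, err := oldImports(src)
	fmt.Println(found, err)
}
```

The require line in go.mod has to change along with the source imports, otherwise both paths remain in the build.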

Issue while installing Airflow on a Kubernetes 1.19 node

error: error validating "airflow.yaml": error validating data: ValidationError(Deployment.spec): missing required field "selector" in io.k8s.api.apps.v1.DeploymentSpec; if you choose to ignore these errors, turn validation off with --validate=false

Can you please help me resolve this issue?
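
The validation error says the Deployment in airflow.yaml has no spec.selector, which apps/v1 requires; the selector must also match the labels on the pod template. The manifest is not shown here, so the following is only a sketch of that rule using hypothetical simplified types, and it covers matchLabels only, not matchExpressions.

```go
package main

import (
	"errors"
	"fmt"
)

// DeploymentSpec is a hypothetical, simplified view of the fields involved.
type DeploymentSpec struct {
	SelectorMatchLabels map[string]string // spec.selector.matchLabels
	TemplateLabels      map[string]string // spec.template.metadata.labels
}

// checkSelector reports a missing selector, or a selector label that the
// pod template does not carry.
func checkSelector(s DeploymentSpec) error {
	if len(s.SelectorMatchLabels) == 0 {
		return errors.New(`missing required field "selector"`)
	}
	for k, v := range s.SelectorMatchLabels {
		if got, ok := s.TemplateLabels[k]; !ok || got != v {
			return fmt.Errorf("selector label %s=%s not found in template labels", k, v)
		}
	}
	return nil
}

func main() {
	labels := map[string]string{"app": "airflow"} // hypothetical label
	fmt.Println(checkSelector(DeploymentSpec{TemplateLabels: labels}))
	fmt.Println(checkSelector(DeploymentSpec{SelectorMatchLabels: labels, TemplateLabels: labels}))
}
```

In the manifest itself, the fix is to add a spec.selector.matchLabels block whose labels repeat those under spec.template.metadata.labels.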

Unable to deploy a AirflowCluster

I've followed the quick start guide and eventually I've been able to deploy the operator by modifying some kustomize resources manually (the image) and the role.

I've then tried to deploy a postgres airflow cluster using the hack/postgres-celery/base.yml and hack/postgres-k8s/cluster.yml but without success. The operator keeps throwing the following error:

2021/01/27 00:05:26 *v1alpha1.AirflowBase/default/pc-base(cmpnt:*controllers.Postgres)  { reconciling component
E0127 00:05:26.163775       1 genericreconciler.go:53] Failed: [*v1alpha1.AirflowBase/default/pc-base] gathering expected resources. open templates/svc.yaml: no such file or directory
2021/01/27 00:05:26 *v1alpha1.AirflowBase/default/pc-base(cmpnt:*controllers.Postgres)  } reconciling component

2021/01/27 00:06:00 *v1alpha1.AirflowCluster/default/pk-cluster(cmpnt:*controllers.UI)  } reconciling component
E0127 00:06:00.985761       1 genericreconciler.go:116] error reconciling *v1alpha1.AirflowCluster/default/pk-cluster. open templates/secret.yaml: no such file or directory

I'm not sure why, but it looks like the controller is trying to fetch the files from the templates directory, which does not exist.
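
The paths in the log (templates/svc.yaml, templates/secret.yaml) are relative, so they are resolved against the working directory of the operator process. A hedged sketch of a startup check that would surface this earlier is shown below; `missingTemplates` is a hypothetical helper, not part of the operator, and the file names are the two that appear in the log.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// missingTemplates returns the template files that cannot be found under dir.
func missingTemplates(dir string, names []string) []string {
	var missing []string
	for _, n := range names {
		if _, err := os.Stat(filepath.Join(dir, n)); err != nil {
			missing = append(missing, n)
		}
	}
	return missing
}

func main() {
	// "templates" is resolved relative to the process working directory,
	// as the relative paths in the error messages suggest.
	names := []string{"svc.yaml", "secret.yaml"}
	if m := missingTemplates("templates", names); len(m) > 0 {
		fmt.Println("missing template files:", m)
	}
}
```

If the check fails inside the container, the likely candidates are a templates directory that was not copied into the image or a working directory that differs from the one used with make run.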

GenericReconciler generates ownerReferences incorrectly

Hi, we (@VedantMahabaleshwarkar and I) are trying to use the operator on OpenShift 4 and getting errors which led us to GenericReconciler being the source:

statefulsets.apps "pc-base-postgres" is forbidden: cannot set blockOwnerDeletion
      in this case because cannot find RESTMapping for APIVersion airflow.apache.org/v1alpha1
      Kind *v1alpha1.AirflowBase: no matches for kind "*v1alpha1.AirflowBase" in version
      "airflow.apache.org/v1alpha1", secrets "pc-base-sql" is forbidden: cannot set
      blockOwnerDeletion in this case because cannot find RESTMapping for APIVersion
      airflow.apache.org/v1alpha1 Kind *v1alpha1.AirflowBase: no matches for kind
      "*v1alpha1.AirflowBase" in version "airflow.apache.org/v1alpha1"

Basically, the error explains that there is no Kind with the name *v1alpha1.AirflowBase, which makes sense, because the Kind name is AirflowBase.

I was able to work around this by either turning off blockOwnerDeletion (which disables k8s garbage collection, no good) or by changing the Kind name creation - see vpavlin@274c347

Now, I don't think this is a good solution since it is changing a dependency of the operator code, but at the same time, the actual GenericReconciler code is gone from GitHub (or I cannot find it).

It would be probably good to get rid of it entirely - is there a plan for this?

Also, see this other commit: vpavlin@881058b

Those are the changes we had to make to get things actually running at all - it seems like current master is broken - are you aware of these issues? Are we missing something?
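
The Kind value in the error, *v1alpha1.AirflowBase, looks like a Go type string rather than a Kubernetes Kind. Assuming the Kind is being derived from the Go type of the owner object (the GenericReconciler source is not available here to confirm this), the difference can be shown with the reflect package. `AirflowBase` below is a hypothetical empty stand-in for the real custom resource type.

```go
package main

import (
	"fmt"
	"reflect"
)

// AirflowBase is a hypothetical empty stand-in for the real custom resource type.
type AirflowBase struct{}

// typeString returns the Go type string, pointer and package qualifier
// included. Used as a Kind, it gives the sort of value seen in the error.
func typeString(obj interface{}) string {
	return reflect.TypeOf(obj).String()
}

// kindOf strips pointers and the package qualifier, leaving the bare type
// name that an ownerReference Kind is expected to carry.
func kindOf(obj interface{}) string {
	t := reflect.TypeOf(obj)
	for t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	return t.Name()
}

func main() {
	fmt.Println(typeString(&AirflowBase{})) // *main.AirflowBase
	fmt.Println(kindOf(&AirflowBase{}))     // AirflowBase
}
```

Deriving the Kind from the registered scheme would be more robust than reflection, but the sketch is enough to show where the leading "*v1alpha1." comes from.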
