jgraham0325 / databricks-bundles-ci-example

Example of using Databricks Asset Bundles with CI/CD in Azure DevOps

License: Other

Jupyter Notebook 52.72% Python 47.28%

databricks-bundles-ci-example's Introduction

bundle-examples

This repository provides Databricks Asset Bundles examples, with added CI/CD pipelines using Azure DevOps.

For more details, see the READMEs in each subfolder, e.g. default_python README

To learn more, see:

CI/CD Process flow:

CI/CD Process Flow diagram

Source Diagram

CI pipeline definitions are in: default_python/azure_devops_pipelines/

Description:

  1. The engineer creates a new branch using their IDE of choice, or the Databricks Repos UI
  2. The engineer makes code changes, runs unit tests locally and runs integration tests against the dev workspace
  3. Once the feature is complete, the engineer creates a Pull Request in Azure DevOps Repos
  4. Azure DevOps Pipelines automatically triggers a run of the Pull Request CI pipeline. This runs the tests and deploys the Databricks Asset Bundle (DAB) to the development environment
  5. Once all automated checks have completed and the Pull Request has been approved, the engineer completes the Pull Request to merge the code into the main branch
  6. Azure DevOps Pipelines automatically triggers a run of the Staging CI pipeline. This tests the code again, but against the Staging environment, which likely has more realistic, production-like data and config. It then deploys the DAB to that environment
  7. (Optional) When the engineer wants to release the code to the production environment, they create a new Pull Request to merge all the code from the main branch into the release branch
  8. Once the Pull Request has been reviewed, it is completed and the code is merged into the release branch
  9. Azure DevOps Pipelines automatically triggers the Prod CI pipeline, which deploys the DAB to production. The deployed bundle typically includes a scheduled trigger so that the job runs at a given time
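
The flow above boils down to a mapping from the triggering event and branch to a deployment environment. The following is a minimal sketch of that mapping, not the logic in the repository's pipeline files: the event labels (`pull_request`, `merge`), the target names (`dev`, `staging`, `prod`) and both function names are assumptions made for illustration. The `databricks bundle deploy -t <target>` command line is the standard Databricks CLI form, but the actual pipelines may invoke it differently.

```python
from typing import List, Optional


def bundle_target(event: str, branch: str) -> Optional[str]:
    """Return the environment a CI run would deploy to, or None if no deploy applies.

    Event labels and target names are hypothetical; adjust them to match
    the targets defined in your own bundle configuration.
    """
    if event == "pull_request":
        # Steps 3-4: any Pull Request is tested and deployed to development.
        return "dev"
    if event == "merge" and branch == "main":
        # Steps 5-6: merging into main triggers the Staging CI pipeline.
        return "staging"
    if event == "merge" and branch == "release":
        # Steps 7-9: merging into release triggers the Prod CI pipeline.
        return "prod"
    return None


def deploy_command(target: str) -> List[str]:
    """Build (but do not run) the CLI invocation that deploys the bundle."""
    return ["databricks", "bundle", "deploy", "-t", target]


if __name__ == "__main__":
    target = bundle_target("merge", "main")
    if target is not None:
        print(" ".join(deploy_command(target)))
```

Keeping the mapping in one small function makes it easy to see that a merge into any other branch deploys nothing.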

Creating a service principal to use with the CI process

  1. Create a service principal in Microsoft Entra ID, or directly in Databricks if you don't have SCIM set up. See https://learn.microsoft.com/en-us/azure/databricks/admin/users-groups/service-principals

Alternatively, you can use a Databricks personal access token instead. If you do, change the environment variables in the pipeline files and in the Azure DevOps variable groups accordingly.
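
As a rough illustration of the two authentication options, the sketch below inspects a set of environment variables and reports which method they would select. `DATABRICKS_CLIENT_ID` and `DATABRICKS_CLIENT_SECRET` are the variable names this README uses for the service principal. `DATABRICKS_TOKEN` is an assumption here: it is the name the Databricks CLI conventionally reads for a personal access token, but this README does not specify it. The function name and return labels are hypothetical.

```python
import os
from typing import Mapping


def auth_method(env: Mapping[str, str]) -> str:
    """Report which credentials are present in the given environment mapping."""
    if env.get("DATABRICKS_CLIENT_ID") and env.get("DATABRICKS_CLIENT_SECRET"):
        # Both halves of the service principal credential are set.
        return "service-principal"
    if env.get("DATABRICKS_TOKEN"):  # assumed variable name for a personal access token
        return "personal-access-token"
    return "none"


if __name__ == "__main__":
    # Check the variables of the current process, e.g. inside a pipeline step.
    print(auth_method(os.environ))
```

A check like this can run as an early pipeline step so that a missing credential fails fast, before any tests or deployments start.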

Setting up environment specific variables/secrets

  1. Go to Azure Pipelines
  2. Click 'Library'
  3. Create new variable group
  4. Name it dev-variable-group
  5. Add the following variables:
    • BUNDLE_VAR_notifications_email_address : Optional email address to use for failure notifications
    • DATABRICKS_CLIENT_ID : Service principal client ID used to authenticate with Databricks
    • DATABRICKS_CLIENT_SECRET : Service principal secret used to authenticate with Databricks. Mark this variable as secret so that its value is not displayed in the UI
    • DATABRICKS_CLUSTER_ID : Used by DBConnect to run automated tests against Databricks interactive cluster
    • DATABRICKS_HOST : Databricks host used by CLI and tests, e.g. https://demo-workspace.cloud.databricks.com/
  6. Clone this variable group for staging and prod, naming the clones staging-variable-group and prod-variable-group, and change the values accordingly.
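
To make the variable group convention concrete, here is a small sketch that derives the group name for an environment and lists any required variables that are missing. The variable names come from the list above; treating everything except the notification address as required is an assumption, and the helper names are hypothetical.

```python
from typing import List, Mapping

# Names taken from the variable list above. Which ones are strictly
# required is an assumption; only the email address is described as optional.
REQUIRED_VARIABLES = (
    "DATABRICKS_CLIENT_ID",
    "DATABRICKS_CLIENT_SECRET",
    "DATABRICKS_CLUSTER_ID",
    "DATABRICKS_HOST",
)
OPTIONAL_VARIABLES = ("BUNDLE_VAR_notifications_email_address",)


def variable_group_name(environment: str) -> str:
    """Return the variable group name for 'dev', 'staging' or 'prod'."""
    return f"{environment}-variable-group"


def missing_variables(values: Mapping[str, str]) -> List[str]:
    """List required variables that are absent or empty in the given mapping."""
    return [name for name in REQUIRED_VARIABLES if not values.get(name)]


if __name__ == "__main__":
    example = {"DATABRICKS_HOST": "https://demo-workspace.cloud.databricks.com/"}
    print(variable_group_name("staging"))
    print(missing_variables(example))
```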

Setting up Azure Pipelines

  1. Go to Azure Pipelines
  2. Click 'New Pipeline'
  3. Select Azure Repos Git and then this Git repo
  4. Select Existing Azure Pipelines YAML file
  5. Select main branch and default_python/azure_devops_pipelines/azure_pipeline_pull_request.yml
  6. Run pipeline
  7. Repeat steps for the staging and production CI pipelines
