manekl / run-vertexai-notebook

This project forked from google-github-actions/run-vertexai-notebook

GitHub Action for running Google Cloud Vertex AI notebooks asynchronously.

License: Apache License 2.0

run-vertexai-notebook GitHub Action

GitHub composite action to trigger asynchronous execution of a Jupyter Notebook via Google Cloud Vertex AI.

The typical SDLC for a Jupyter Notebook includes source control of the notebook file without its output cells. Storing notebooks this way is a best practice because it prevents committing potentially sensitive data. A downside is that code reviewers cannot see the output while reviewing and may not be able to accurately gauge the impact of a change.

The main purpose of this action is to provide a secure way to execute a notebook, store the output (outside of source control), and serve it to a reviewer with proper access controls.

This action relies on the notebook execution functionality of Google Cloud's Vertex AI to execute the notebook and store the executed notebook, with its output cells, in Google Cloud Storage. Access to the output is controlled by Google Cloud Storage ACLs.

NOTE: Notebooks executed by this action will fall under the notebook executor requirements defined by Vertex AI.

This action provisions cloud resources with associated costs, so it is recommended that you control its usage by:

  • Limiting the triggers of this action: e.g. on pull request with a specific label

  • Limiting the set of notebooks that it executes via the allowlist parameter

  • Managing the size of the Vertex AI infrastructure via the vertex_machine_type parameter
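The three controls above can be combined in a single workflow. The following is a minimal sketch rather than the project's own example: the label name `run-notebooks` is a hypothetical choice, the `changes` job that produces the allowlist is assumed to be defined elsewhere in the workflow, and the authentication step is omitted for brevity.

```yaml
on:
  pull_request:
    # Re-evaluate when a label is added or new commits are pushed.
    types: [labeled, synchronize]

jobs:
  notebook-review:
    # Run only when the pull request carries the (hypothetical) label.
    if: contains(github.event.pull_request.labels.*.name, 'run-notebooks')
    needs: changes
    runs-on: ubuntu-latest
    steps:
      # Authentication to Google Cloud is assumed to happen in an earlier step.
      - id: notebook-review
        uses: google-github-actions/run-vertexai-notebook@v0
        with:
          gcs_source_bucket: '${{ env.GCS_SOURCE }}'
          gcs_output_bucket: '${{ env.GCS_OUTPUT }}'
          # Only the notebooks listed here are executed.
          allowlist: '${{ needs.changes.outputs.notebooks_files }}'
          # Stated explicitly so the machine size is visible in review.
          vertex_machine_type: 'n1-standard-4'
```

Gating on a label keeps execution opt-in per pull request, which is one way to keep costs predictable.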

Prerequisites

This action requires Google Cloud credentials to execute gcloud commands. See setup-gcloud for details.

Setup

  1. Create a new Google Cloud Project (or select an existing project) and enable the Vertex AI APIs.

  2. Create or reuse a GitHub repository for the example workflow:

    1. Create a repository.

    2. Move into the repository directory:

      $ cd <repo>
      
    3. Copy the example into the repository:

      $ cp -r <path_to>/notebook-review-action/examples/notebook-review/ .
      
  3. Create a GCS bucket if one does not already exist.

  4. Create a Google Cloud service account if one does not already exist.

  5. Add the following Cloud IAM roles to your service account:

    • roles/aiplatform.user - allows running jobs in Vertex AI
    • roles/storage.objectWriter - allows writing notebook files to object storage

    Note: These permissions are overly broad to favor a quick start. They do not represent best practices around the Principle of Least Privilege. To properly restrict access, you should create a custom IAM role with the most restrictive permissions.

  6. Set up authentication to Google Cloud using workload identity federation with the above service account.
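For step 6, the workflow job itself must be allowed to request a GitHub OIDC token, because workload identity federation exchanges that token for Google Cloud credentials. A minimal sketch of the job-level settings follows; the job name is illustrative, the checkout version is an assumption, and the provider and service account values are placeholders to be replaced with your own.

```yaml
jobs:
  notebook-review:
    runs-on: ubuntu-latest
    permissions:
      contents: read    # needed to check out the repository
      id-token: write   # needed to request the OIDC token used by the auth step
    steps:
      - uses: actions/checkout@v4
      - id: auth
        uses: google-github-actions/auth@v0
        with:
          # Placeholders: substitute your own provider and service account.
          workload_identity_provider: 'projects/123456789/locations/global/workloadIdentityPools/my-pool/providers/my-provider'
          service_account: 'my-service-account@my-project.iam.gserviceaccount.com'
```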

Usage

jobs:
  notebook-review:
    name: Notebook Review
    needs: changes
    runs-on: ubuntu-latest

    steps:
    - id: 'auth'
      name: 'Authenticate to Google Cloud'
      uses: 'google-github-actions/auth@v0'
      with:
        workload_identity_provider: 'projects/123456789/locations/global/workloadIdentityPools/my-pool/providers/my-provider'
        # Placeholder service account email; substitute your own.
        service_account: 'my-service-account@my-project.iam.gserviceaccount.com'

    - id: notebook-review
      uses: google-github-actions/run-vertexai-notebook@v0
      with:
        gcs_source_bucket: '${{ env.GCS_SOURCE }}'
        gcs_output_bucket: '${{ env.GCS_OUTPUT }}'
        allowlist: '${{ needs.changes.outputs.notebooks_files }}'

See a more complete example in examples.
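The usage snippet depends on a `changes` job that exposes a `notebooks_files` output. One way to produce it is with dorny/paths-filter, sketched below under some assumptions: the filter name `notebooks`, the step id `filter`, the glob pattern, and the action versions are illustrative choices, not taken from this project.

```yaml
jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      # Comma-separated list of matching files, consumed as the allowlist.
      notebooks_files: ${{ steps.filter.outputs.notebooks_files }}
    steps:
      - uses: actions/checkout@v4
      - id: filter
        uses: dorny/paths-filter@v2
        with:
          # Emit the matched files as a CSV list in the <filter>_files output.
          list-files: csv
          filters: |
            notebooks:
              - added|modified: '**/*.ipynb'
```

Restricting the filter to added or modified notebooks avoids passing deleted files to the action.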

Inputs

  • gcs_source_bucket - (Required) Google Cloud Storage bucket to store notebooks to be run by Vertex AI. e.g. mygcp-bucket-0001/nbr/source. This bucket was created during setup above.

  • gcs_output_bucket - (Required) Google Cloud Storage bucket to store the results of the notebooks executed by Vertex AI. e.g. mygcp-bucket-0001/nbr/output. This bucket was created during setup above.

    Note: It is recommended that the source and output values share the same bucket and use a path structure to separate source from output.

  • region - (Optional) Google Cloud region to execute Vertex AI jobs in. Defaults to us-central1.

  • vertex_machine_type - (Optional) Machine type to use for Vertex AI job execution. Defaults to an n1-standard-4 machine shape.

  • allowlist - (Required) Comma-separated list of notebook files to run on Vertex AI. e.g. mynotebook.ipynb,somedir/another_notebook.ipynb. It is expected that this is the output from an action like dorny/paths-filter.
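As an illustration of how the inputs fit together, the sketch below sets every input explicitly. The bucket paths reuse the example values from the list above and follow the shared-bucket layout recommended in the note; the optional inputs are set to their documented defaults. The `env` names match the usage snippet but are otherwise an arbitrary choice.

```yaml
env:
  # One bucket, with separate paths for source and executed notebooks.
  GCS_SOURCE: 'mygcp-bucket-0001/nbr/source'
  GCS_OUTPUT: 'mygcp-bucket-0001/nbr/output'

jobs:
  notebook-review:
    needs: changes
    runs-on: ubuntu-latest
    steps:
      # Authentication to Google Cloud is assumed to happen in an earlier step.
      - id: notebook-review
        uses: google-github-actions/run-vertexai-notebook@v0
        with:
          gcs_source_bucket: '${{ env.GCS_SOURCE }}'
          gcs_output_bucket: '${{ env.GCS_OUTPUT }}'
          allowlist: '${{ needs.changes.outputs.notebooks_files }}'
          region: 'us-central1'                  # documented default
          vertex_machine_type: 'n1-standard-4'   # documented default
```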
