This project forked from googlecloudplatform/notebooks-blueprint-security

License: Apache License 2.0

notebooks-blueprint-security's Introduction

AI Platform Notebook Security Blueprint: Protecting PII Data

This repository provides an opinionated, secure way to set up AI Platform Notebooks using Terraform.

This is not an officially supported Google product.

Reference Architecture

The resources that this module will create are:

  • One AI Platform Notebook per Notebook user
  • Service Account for Notebooks
  • An HSM key used for Customer Managed Encryption Keys (CMEK) in each Notebook
  • Custom Role to restrict exporting data
  • Google Cloud Storage bucket with bootstrap code for Notebooks
  • Org Policies at the folder that the trusted-data project is in
    • constraints/gcp.resourceLocations
    • constraints/iam.disableServiceAccountCreation
    • constraints/iam.disableServiceAccountKeyCreation
    • constraints/iam.automaticIamGrantsForDefaultServiceAccounts
    • constraints/compute.requireOsLogin
    • constraints/compute.restrictProtocolForwardingCreationForTypes
    • constraints/compute.restrictSharedVpcSubnetworks
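
As a rough illustration of the folder-level Org Policies listed above, the sketch below uses the Google provider's google_folder_organization_policy resource for one boolean constraint and one list constraint. It is not the module's actual code: the folder ID folders/123456789012, the resource labels, and the chosen location values are hypothetical placeholders.

```hcl
# Hypothetical folder ID; replace with the folder that holds the trusted-data project.
locals {
  trusted_folder = "folders/123456789012"
}

# Boolean constraint: require OS Login on VMs under the folder.
resource "google_folder_organization_policy" "require_os_login" {
  folder     = local.trusted_folder
  constraint = "constraints/compute.requireOsLogin"

  boolean_policy {
    enforced = true
  }
}

# List constraint: only allow resources in the listed location groups.
resource "google_folder_organization_policy" "resource_locations" {
  folder     = local.trusted_folder
  constraint = "constraints/gcp.resourceLocations"

  list_policy {
    allow {
      values = ["in:us-locations", "in:eu-locations"]
    }
  }
}
```

The other constraints in the list follow one of these two shapes, depending on whether the constraint is a boolean or a list constraint.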

Assumptions

  • You have the Project and network configuration for where you want to deploy your trusted environment.
  • You have the appropriate IAM permissions to configure project resources (see service account roles).
  • You have an IAM Group and a list of identities that are allowed to access the trusted environment.
  • You are familiar with your organization's security best practices and policies. Learn about Google Cloud security foundation best practices by reading the security foundation blueprint.

Prerequisites

Prepare your admin workstation

You can use Cloud Shell, a local machine, or a VM as your admin workstation.

Tools for Cloud Shell as your Admin workstation

Tools for a local workstation as your Admin workstation

Installation instructions for Tools for your environment

Install Cloud SDK

This is preinstalled if you are using Cloud Shell.

The Google Cloud SDK is used to interact with your GCP resources. Installation instructions for multiple platforms are available online.

Install Terraform

Terraform is used to automate the manipulation of cloud infrastructure. Its installation instructions are also available online. When configuring Terraform for use with Google Cloud, create a service account as detailed in Getting started with the Google provider.

Authentication

After installing the Cloud SDK, run gcloud init to set up the gcloud CLI. When prompted, choose the correct region and zone.

gcloud init

Ensure you are using the correct project. Replace my-project-name with the name of your project:

gcloud config set project my-project-name

Compatibility

This module is meant for use with Terraform 0.12. Learn how to upgrade to the required version.
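
One way to make the version requirement explicit in your own root configuration is a required_version constraint. This is a minimal sketch, and the exact constraint string is an assumption; the module's own constraint may differ.

```hcl
terraform {
  # Accept any 0.12.x release (>= 0.12.0, < 0.13.0).
  # Assumed constraint for illustration; check the module's own versions file.
  required_version = "~> 0.12.0"
}
```

With this block in place, Terraform refuses to run under an incompatible CLI version instead of failing later with less obvious errors.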

Usage

Basic usage of this module is as follows:

module "notebooks_blueprint_security" {
  source  = "GoogleCloudPlatform/notebooks-blueprint-security/google"

  vpc_perimeter_regions           = ["US", "DE"]
  vpc_perimeter_policy_name       = "higher_trust_perimeter_policy"
  vpc_perimeter_ip_subnetworks    = ["NETWORK_CIDR"]  # allowed to access VPC-SC perimeters
  zone                            = "us-central1-a"
  resource_locations              = ["in:us-locations", "in:eu-locations"]
  notebook_key_name               = "trusted-data-key"
  dataset_id                      = "sample_ds_for_notebooks"
  notebook_name_prefix            = "trusted-sample"
  bootstrap_notebooks_bucket_name = "notebook_bootstrap"
  default_policy_id               = "12345678"  # likely org id
  project_trusted_analytics       = "trusted-analytics"
  project_trusted_data            = "trusted-data"
  project_trusted_kms             = "trusted-kms"
  trusted_private_network         = "projects/<shared-restricted-prj>/global/networks/<your_vpc>"
  trusted_private_subnet          = "projects/<shared-restricted-prj>/regions/<region>/subnetworks/<your_subnets_for_notebooks>"
  caip_users                      = ["[email protected]", "[email protected]"]
  confid_users                    = ["group:[email protected]", "group:[email protected]"]
  trusted_scientists              = ["user:[email protected]", "user:[email protected]"]
}

  1. Create a tfvars file with the required inputs (see the Inputs section below)

  2. Run terraform init to get the plugins.

  3. Run terraform plan -var-file="YOUR_FILE.tfvars" to see the infrastructure plan. Note: Replace YOUR_FILE with the name of your tfvars file from the first step.

  4. Run terraform apply -var-file="YOUR_FILE.tfvars" to apply the infrastructure build. Note: Replace YOUR_FILE with the name of your tfvars file from the first step.

  5. Access your AI Platform Notebook

    • Establish an SSH tunnel from your device to your AI Platform Notebook.
    • In your browser, visit http://localhost:8080 to access your AI Platform Notebook.

When you query BigQuery from the Notebook, be sure to specify your PROJECT_ID, DATASET, and TABLE below, which should match your terraform.tfvars file.

%%bigquery
SELECT
  *
FROM `PROJECT_ID.DATASET.TABLE`
LIMIT 10

  6. Run terraform destroy -var-file="YOUR_FILE.tfvars" to destroy the built infrastructure. Note: Replace YOUR_FILE with the name of your tfvars file from the first step.
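
For the tfvars file created in the first step of the list above, a starting point might look like the following sketch. The variable names come from the usage example and the Inputs section; every value is a placeholder, and the email addresses and project IDs under example.com, project_id1, and project_id2 are hypothetical.

```hcl
# YOUR_FILE.tfvars (sketch; replace every value with your own)
vpc_perimeter_regions           = ["US", "DE"]
vpc_perimeter_policy_name       = "higher_trust_perimeter_policy"
vpc_perimeter_ip_subnetworks    = ["NETWORK_CIDR"] # allowed to access VPC-SC perimeters
zone                            = "us-central1-a"
resource_locations              = ["in:us-locations", "in:eu-locations"]
notebook_key_name               = "trusted-data-key"
dataset_id                      = "sample_ds_for_notebooks"
notebook_name_prefix            = "trusted-sample"
bootstrap_notebooks_bucket_name = "notebook_bootstrap"
default_policy_id               = "12345678" # likely your organization ID
project_trusted_analytics       = "trusted-analytics"
project_trusted_data            = "trusted-data"
project_trusted_kms             = "trusted-kms"
trusted_private_network         = "projects/<shared-restricted-prj>/global/networks/<your_vpc>"
trusted_private_subnet          = "projects/<shared-restricted-prj>/regions/<region>/subnetworks/<your_subnets_for_notebooks>"

# Listed as required in the Inputs section; uses the "under:" constraint format.
vpc_subnets_projects_allowed = ["under:projects/project_id1", "under:projects/project_id2"]

# Hypothetical identities for illustration only.
caip_users         = ["alice@example.com", "bob@example.com"]
confid_users       = ["group:confidential-users@example.com"]
trusted_scientists = ["user:alice@example.com", "user:bob@example.com"]
```

Keeping these values in one file means the same -var-file argument can be reused for the plan, apply, and destroy steps.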

Adding identities to groups

  1. You may need to add service accounts to the appropriate IAM high-trust data scientist group.

# Change the values below to your specific values
gcloud identity groups memberships add --group-email [email protected] --member-email sa-p-notebook-compute@<proj>.iam.gserviceaccount.com

Accessing Notebooks

Use SSH to access your notebook. Notebooks have no external IP, and users should not impersonate the Notebook service account. To learn how to open an SSH tunnel and launch JupyterLab, read the SSH to access JupyterLab article.

Functional examples are included in the examples directory.

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| trusted_private_network | The URI of the private network where you want your Notebooks. This would be the restricted_network_self_link from the foundational security blueprint Terraform. | string | "" | yes |
| trusted_private_subnet | The URI of the private subnet where you want your Notebooks. This would be the restricted_subnets_self_link from the foundational security blueprint Terraform. | string | "" | yes |
| default_policy_id | The parent of this AccessPolicy in the Cloud Resource Hierarchy. As of now, only organization IDs are accepted as parent. | string | "" | yes |
| vpc_perimeter_policy_name | The perimeter policy's name. | string | "" | yes |
| vpc_perimeter_ip_subnetworks | IP subnets allowed to access the higher trust perimeters. | list(string) | [] | yes |
| vpc_perimeter_regions | Two-letter identifiers for regions allowed for VPC access. Each must be a valid ISO 3166-1 alpha-2 code. | list(string) | [] | yes |
| project_trusted_analytics | Project that holds Notebooks. | string | "" | yes |
| project_trusted_data | Project that holds data used by Notebooks. | string | "" | yes |
| project_trusted_kms | Project that holds KMS keys used to protect PII data for Notebooks. | string | "" | yes |
| resource_locations | Regions where resources can be provisioned. | list(string) | [] | yes |
| vpc_subnets_projects_allowed | List of projects with allowed VPC subnets for the Notebooks, defined in the "under:" constraint format (e.g. ["under:projects/project_id1", "under:projects/project_id2"]). | list(string) | [] | yes |
| notebook_key_name | Name to use when creating the KMS/HSM key that protects PII data. | string | "" | yes |
| caip_users | The list of users that need an AI Platform Notebook (list of emails). | list(string) | [] | yes |
| trusted_scientists | The list of trusted scientists (in the form of user:[email protected]). | list(string) | [] | yes |
| confid_users | The list of groups with privileged users that can access PII data (ex: [email protected]). | list(string) | [] | yes |
| dataset_id | BigQuery dataset ID with PII data that scientists need access to. | string | "" | yes |
| notebook_name_prefix | Prefix used in provisioning Notebooks in the higher trust boundary. | string | "trusted-sample" | no |

Outputs

Name Description
none none

Requirements

These sections describe requirements for using this module.

Software

The following dependencies must be available:

Service Account

A service account with the following roles must be used to provision the resources of this module:

Organization Level

  • Access Context Manager Policy Admin: roles/accesscontextmanager.policyAdmin
  • Organization Policy Admin: roles/orgpolicy.policyAdmin
  • Security Admin: roles/iam.securityAdmin
  • Service Usage Consumer: roles/serviceusage.serviceUsageConsumer

Restricted Shared VPC Project (created in blueprint foundation)

  • Network Admin: roles/compute.networkAdmin

Analytics Project

  • Service Account Creator: roles/iam.serviceAccountCreator
  • Cloud KMS Admin: roles/cloudkms.admin
  • Compute Admin: roles/compute.admin
  • BigQuery Job User: roles/bigquery.jobUser
  • BigQuery User: roles/bigquery.user
  • Notebooks Runner: roles/notebooks.runner
  • Service Account User: roles/iam.serviceAccountUser
  • Service Usage Admin: roles/serviceusage.serviceUsageAdmin

Data Project

  • BigQuery Job User: roles/bigquery.jobUser
  • BigQuery User: roles/bigquery.user
  • Role Administrator: roles/iam.roleAdmin
  • Storage Admin: roles/storage.admin

KMS Project

  • Cloud KMS Admin: roles/cloudkms.admin

The Project Factory module and the IAM module may be used in combination to provision a service account with the necessary roles applied.
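
If you prefer plain provider resources over the Project Factory and IAM modules, the role lists above can be granted with google_project_iam_member. The following is a minimal sketch for the Data and KMS projects only; the deployer service account email is a hypothetical placeholder, the project IDs are taken from the usage example, and for_each on resources assumes Terraform 0.12.6 or later.

```hcl
locals {
  # Hypothetical deployer service account; replace with your own.
  deployer_member = "serviceAccount:terraform-deployer@my-seed-project.iam.gserviceaccount.com"

  # Roles listed under "Data Project" above.
  data_project_roles = [
    "roles/bigquery.jobUser",
    "roles/bigquery.user",
    "roles/iam.roleAdmin",
    "roles/storage.admin",
  ]
}

# One non-authoritative binding per role on the data project.
resource "google_project_iam_member" "data_project" {
  for_each = toset(local.data_project_roles)

  project = "trusted-data"
  role    = each.value
  member  = local.deployer_member
}

# The KMS project needs a single role.
resource "google_project_iam_member" "kms_project" {
  project = "trusted-kms"
  role    = "roles/cloudkms.admin"
  member  = local.deployer_member
}
```

google_project_iam_member adds one member to a role without touching other members, which is usually the safer choice when other teams manage IAM on the same projects.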

Enable APIs

In order to operate with the Service Account you must activate the following APIs on the project where analytics and Notebooks reside:

  • Access Context Manager API: accesscontextmanager.googleapis.com
  • BigQuery API: bigquery.googleapis.com
  • Compute Engine API: compute.googleapis.com
  • Identity and Access Management (IAM) API: iam.googleapis.com
  • Key Management Service (KMS) API: cloudkms.googleapis.com
  • Notebooks (AI Platform) API: notebooks.googleapis.com
  • Google Cloud Storage API: storage.googleapis.com
  • Resource Manager API: cloudresourcemanager.googleapis.com
  • IAM Service Account Credentials API: iamcredentials.googleapis.com

In order to operate with the Service Account you must activate the following APIs on the project where your KMS/HSM keys reside:

  • Google Cloud Storage API: storage.googleapis.com
  • Key Management Service (KMS) API: cloudkms.googleapis.com
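
The API lists above can also be enabled from Terraform rather than by hand. This sketch uses the Google provider's google_project_service resource; the project IDs are the placeholders from the usage example, and for_each on resources assumes Terraform 0.12.6 or later.

```hcl
locals {
  # APIs for the project where analytics and Notebooks reside.
  analytics_project_apis = [
    "accesscontextmanager.googleapis.com",
    "bigquery.googleapis.com",
    "compute.googleapis.com",
    "iam.googleapis.com",
    "cloudkms.googleapis.com",
    "notebooks.googleapis.com",
    "storage.googleapis.com",
    "cloudresourcemanager.googleapis.com",
    "iamcredentials.googleapis.com",
  ]

  # APIs for the project where the KMS/HSM keys reside.
  kms_project_apis = [
    "storage.googleapis.com",
    "cloudkms.googleapis.com",
  ]
}

resource "google_project_service" "analytics" {
  for_each = toset(local.analytics_project_apis)

  project = "trusted-analytics"
  service = each.value

  # Leave the API enabled if this resource is destroyed.
  disable_on_destroy = false
}

resource "google_project_service" "kms" {
  for_each = toset(local.kms_project_apis)

  project = "trusted-kms"
  service = each.value

  disable_on_destroy = false
}
```

Setting disable_on_destroy to false is a design choice for shared projects: destroying this configuration will not switch off APIs that other workloads may rely on.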

Resource Hierarchy

Within your Org's prod environment, create a folder to hold your trusted projects and centrally manage your policies for Notebooks that use PII data. Note: the fldr-prod folder is created by the foundation blueprint. Create folders by using the project factory.

fldr-prod
└── fldr-trusted
    ├── trusted-data
    ├── trusted-analytics
    └── trusted-kms
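
The text above recommends the project factory for creating folders. As a simpler illustration of the same hierarchy, the sketch below creates only the fldr-trusted folder with the plain google_folder resource; the parent folder ID is a hypothetical placeholder for fldr-prod, and the three trusted projects are not created here.

```hcl
# Creates fldr-trusted under the existing fldr-prod folder.
resource "google_folder" "trusted" {
  display_name = "fldr-trusted"

  # Hypothetical ID of fldr-prod; replace with the ID from your foundation deployment.
  parent = "folders/123456789012"
}
```

The trusted-data, trusted-analytics, and trusted-kms projects would then be created inside this folder, where the folder-level Org Policies apply to all three.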

Contributing

Refer to the contribution guidelines for information on contributing to this module.

notebooks-blueprint-security's People

Contributors

erlanderlo, m-mayran
