databrickslabs / databricks-sync

An experimental tool to synchronize a source Databricks deployment with a target Databricks deployment.

License: Other

Dockerfile 0.35% Makefile 0.10% Python 94.03% HCL 3.00% Shell 1.41% Smarty 0.98% Jinja 0.14%

databricks-sync's People

Contributors

bazzazzadeh, dependabot[bot], dillon-bostwick, fartzy, itaiw, nfx, oliverjg, pohlposition, r7l208, stikkireddy, tomarv2

databricks-sync's Issues

Sync Import failing on User-Scim

We are running databricks-sync import against workspace export files generated by dbx-sync. Regardless of whether IDENTITY has been successfully imported or not (or whether Identity is called in conjunction with pools/policies or separately), when either INSTANCE_POOL or CLUSTER_POLICY is imported, we run into an error about a reference to the undeclared ‘databricks_user.databricks_scim_users’, stemming from the permissions.tf.json files of each imported pool (line 6) / policy (line 7). Message: “A managed resource “databricks_user” “databricks_scim_users” has not been declared in the root module.”

YAML config should be capable of filtering clusters and jobs by specific criteria

Rather than syncing everything, let's make it possible to list the required objects by name, ID, and/or association.

My immediate requirement is to migrate all jobs associated with a particular named interactive cluster, for example. These will also depend on other objects (users/groups and cluster policies), but that may be a taller order. For now, let's get something like this working at a minimum:

jobs:
  by:
    existing_cluster_ids:
      - "id...."
      - "id...."
cluster:
  by:
    name:
      - "...."
    id:
      - "id...."
      - "id...."

Taken from this Slack thread w/ @stikkireddy.

unable to run import or export properly due to terraform --version error

I have the following versions installed:

Terraform 0.14.9
tfenv 2.2.0

2021-03-29 23:30:12 [INFO] command: terraform --version
2021-03-29 23:30:13 [ERROR] cat: /root/.tfenv/version: No such file or directory
2021-03-29 23:30:13 [ERROR] Version could not be resolved (set by /root/.tfenv/version or tfenv use <version>)
Traceback (most recent call last):
  File "/usr/local/bin/databricks-sync", line 8, in <module>
    sys.exit(cli())

actual path to the terraform version: /root/.tfenv/versions/0.14.9
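
A likely workaround, assuming tfenv is the only Terraform on the PATH: the error message itself points at the fix. tfenv resolves the active version from /root/.tfenv/version, and running tfenv use 0.14.9 (the command form quoted in the error) pins the already-installed 0.14.9 build and creates that file, after which terraform --version should resolve normally.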

adjust instance pool schema

These errors occur during validation after exporting an instance pool:

2021-06-18 16:28:10 [INFO] ╷
2021-06-18 16:28:10 [INFO] │ Error: Extraneous JSON object property
2021-06-18 16:28:10 [INFO] │ 
2021-06-18 16:28:10 [INFO] │   on databricks_instance_pool_0708_074059_tube1_pool_1qXgzzuQ.tf.json line 19, in resource.databricks_instance_pool.databricks_instance_pool_0708_074059_tube1_pool_1qXgzzuQ.dynamic[1].disk_spec.content:
2021-06-18 16:28:10 [INFO] │   19:                                 "azure_disk_volume_type": null,
2021-06-18 16:28:10 [INFO] │ 
2021-06-18 16:28:10 [INFO] │ No argument or block type is named "azure_disk_volume_type".
2021-06-18 16:28:10 [INFO] ╵
2021-06-18 16:28:10 [INFO] ╷
2021-06-18 16:28:10 [INFO] │ Error: Extraneous JSON object property
2021-06-18 16:28:10 [INFO] │ 
2021-06-18 16:28:10 [INFO] │   on databricks_instance_pool_0708_074059_tube1_pool_1qXgzzuQ.tf.json line 22, in resource.databricks_instance_pool.databricks_instance_pool_0708_074059_tube1_pool_1qXgzzuQ.dynamic[1].disk_spec.content:
2021-06-18 16:28:10 [INFO] │   22:                                 "ebs_volume_type": "${var.GENERAL_PURPOSE_SSD}"
2021-06-18 16:28:10 [INFO] │ 
2021-06-18 16:28:10 [INFO] │ No argument or block type is named "ebs_volume_type".
2021-06-18 16:28:10 [INFO] ╵

The following patch on the generated output resolves the error:

diff --git a/exports/instance_pool/databricks_instance_pool_0708_074059_tube1_pool_1qXgzzuQ.tf.json b/exports/instance_pool/databricks_instance_pool_0708_074059_tube1_pool_1qXgzzuQ.tf.json
index a9af535cc..6a97d219d 100644
--- a/exports/instance_pool/databricks_instance_pool_0708_074059_tube1_pool_1qXgzzuQ.tf.json
+++ b/exports/instance_pool/databricks_instance_pool_0708_074059_tube1_pool_1qXgzzuQ.tf.json
@@ -16,10 +16,12 @@
                     {
                         "disk_spec": {
                             "content": {
-                                "azure_disk_volume_type": null,
                                 "disk_count": 3,
                                 "disk_size": 100,
-                                "ebs_volume_type": "${var.GENERAL_PURPOSE_SSD}"
+                                "disk_type": {
+                                    "azure_disk_volume_type": null,
+                                    "ebs_volume_type": "${var.GENERAL_PURPOSE_SSD}"
+                                }
                             },
                             "for_each": "${[1]}"
                         }

If a DBFS folder is empty, the tool fails to skip the object

Error example:

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 568, in run_until_complete
    return future.result()
  File "/Users/itaiweiss/PycharmProjects/demoMLFlow/venv/lib/python3.7/site-packages/databricks_sync/sdk/pipeline.py", line 87, in trigger
    async for item in self.generate():
  File "/Users/itaiweiss/PycharmProjects/demoMLFlow/venv/lib/python3.7/site-packages/databricks_sync/sdk/pipeline.py", line 91, in generate
    async for item in self._generate():
  File "/Users/itaiweiss/PycharmProjects/demoMLFlow/venv/lib/python3.7/site-packages/databricks_sync/sdk/generators/dbfs.py", line 129, in _generate
    for_each_var_id_name_pairs=dbfs_files_id_name_pairs)
  File "/Users/itaiweiss/PycharmProjects/demoMLFlow/venv/lib/python3.7/site-packages/databricks_sync/sdk/generators/dbfs.py", line 112, in __make_dbfs_file_data
    dbfs_data.upsert_local_variable(self.DBFS_FOREACH_VAR, dbfs_file_data)
AttributeError: 'NoneType' object has no attribute 'upsert_local_variable'
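
A minimal sketch of the kind of guard that would avoid this, following the names in the traceback above (the surrounding code and the log variable are assumptions, not the repo's actual code):

# Hypothetical guard inside __make_dbfs_file_data in dbfs.py:
# dbfs_data can be None when the DBFS folder is empty, so skip the
# object instead of dereferencing None.
if dbfs_data is None:
    log.warning("DBFS folder is empty, skipping object")
    return
dbfs_data.upsert_local_variable(self.DBFS_FOREACH_VAR, dbfs_file_data)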

import command w/o apply should not shut down the clusters

A user can run a dry-run import (--plan only, without --apply).
In this case, we should not shut down the clusters on the target workspace.

Potentially add:
  • A flag to control this behavior
  • Shut down only the new clusters we imported, and only if we imported clusters
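
In the meantime, a dry run is expressed by passing --plan and omitting --apply (compare the full Jenkins invocation further below, which passes both); the ask here is that such a run leave target clusters untouched. If a flag is added, it could look like --skip-cluster-shutdown (name purely hypothetical, not an existing option).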

Error: "name 'datetime' is not defined" thrown in terraform.py

This Python file has no import statement for the datetime module.

https://github.com/databrickslabs/databricks-sync/blob/master/databricks_sync/sdk/terraform.py#L157

As a result, this exception was thrown.

Traceback (most recent call last):
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/bin/databricks-sync", line 11, in <module>
    load_entry_point('databricks-sync==1.0.0', 'console_scripts', 'databricks-sync')()
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/click-8.0.1-py3.7.egg/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/click-8.0.1-py3.7.egg/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/click-8.0.1-py3.7.egg/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/click-8.0.1-py3.7.egg/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/click-8.0.1-py3.7.egg/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/databricks_cli-0.11.0-py3.7.egg/databricks_cli/configure/config.py", line 55, in decorator
    return function(*args, **kwargs)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/databricks_sync-1.0.0-py3.7.egg/databricks_sync/cmds/config.py", line 178, in modify_user_agent
    return function(*args, **kwargs)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/databricks_sync-1.0.0-py3.7.egg/databricks_sync/cmds/config.py", line 161, in decorator
    return function(*args, **kwargs)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/click-8.0.1-py3.7.egg/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/databricks_sync-1.0.0-py3.7.egg/databricks_sync/cmds/apply.py", line 53, in import_cli
    te.execute()
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/databricks_sync-1.0.0-py3.7.egg/databricks_sync/sdk/sync/import_.py", line 28, in wrapper
    resp = func(self_, stage_path=stage_path, **kwargs)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/databricks_sync-1.0.0-py3.7.egg/databricks_sync/sdk/sync/import_.py", line 55, in wrapper
    resp = func(self_, repo_path=repo_path, **kwargs)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/databricks_sync-1.0.0-py3.7.egg/databricks_sync/sdk/sync/import_.py", line 167, in execute
    state_file_abs_path=state_loc)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/databricks_sync-1.0.0-py3.7.egg/databricks_sync/sdk/terraform.py", line 166, in apply
    backup_path = self.__get_backup_path(state_file_abs_path)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-2c7138dd-643a-42c3-8ef2-cccf54ea51fc/lib/python3.7/site-packages/databricks_sync-1.0.0-py3.7.egg/databricks_sync/sdk/terraform.py", line 157, in __get_backup_path
    now_str = datetime.datetime.utcnow().strftime("%Y_%m_%d_%H_%M_%S_%f")
NameError: name 'datetime' is not defined
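
The fix is the one the traceback points at: __get_backup_path calls datetime.datetime.utcnow() but the module is never imported, so terraform.py simply needs the missing statement at the top of the file:

import datetime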

More detailed yaml validation file

When running export, the YAML config file is not thoroughly validated: as long as the file is valid YAML it passes, even if it contains fields that are not intended to be there.
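
A minimal sketch of a stricter check, assuming PyYAML is available (the allowed key set below is illustrative, not the tool's actual schema): reject unknown top-level fields instead of accepting anything that parses.

import yaml

# Illustrative allow-list; the real schema would come from the tool itself.
ALLOWED_KEYS = {"jobs", "cluster", "instance_pool", "cluster_policy", "identity"}

def validate_config(path):
    with open(path) as f:
        config = yaml.safe_load(f) or {}
    unknown = set(config) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"Unknown fields in export config: {sorted(unknown)}")
    return config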

[Docs]: CICD example

@itaiw @stikkireddy: I noticed under Quickstart -> Next Steps it says CICD. Is there a reference example, or how can we use it in CI/CD? If I can get some pointers, I am willing to do a PR of a working CI/CD example.

Import Not Working

Hello,

I keep receiving the following error when trying to use import:

2022-02-22 11:15:05 [INFO] ╷
2022-02-22 11:15:05 [INFO] │ Error: Invalid index
2022-02-22 11:15:05 [INFO] │ 
2022-02-22 11:15:05 [INFO] │   on databricks_group_admins_members.tf.json line 6, in locals.databricks_group_members_databricks_group_admins_members_for_each_var.service-principal-0e409824-d286-49ff-aaf5-5eb65156a5ab:
2022-02-22 11:15:05 [INFO] │    6:                 "member_id": "${\"0e409824-d286-49ff-aaf5-5eb65156a5ab\" == var.ME_USERNAME ? \"something temp will be skipped\" : databricks_user.databricks_scim_users[\"0e409824-d286-49ff-aaf5-5eb65156a5ab\"].id}"
2022-02-22 11:15:05 [INFO] │     ├────────────────
2022-02-22 11:15:05 [INFO] │     │ databricks_user.databricks_scim_users is object with 6 attributes
2022-02-22 11:15:05 [INFO] │ 
2022-02-22 11:15:05 [INFO] │ The given key does not identify an element in this collection value.

Here is the error when I try to import policies only:

╷
2022-02-22 11:16:51 [INFO] │ Error: Reference to undeclared resource
2022-02-22 11:16:51 [INFO] │ 
2022-02-22 11:16:51 [INFO] │   on databricks_cluster_policy_C96203490C00012A_permissions.tf.json line 8, in resource.databricks_permissions.databricks_cluster_policy_C96203490C00012A_permissions.depends_on:
2022-02-22 11:16:51 [INFO] │    8:                     "databricks_user.databricks_scim_users",
2022-02-22 11:16:51 [INFO] │ 
2022-02-22 11:16:51 [INFO] │ A managed resource "databricks_user" "databricks_scim_users" has not been
2022-02-22 11:16:51 [INFO] │ declared in the root module.

databricks-sync import is failing consistently

Databricks sync has been failing consistently with a "value out of range" error, triggered from Jenkins. Any help here, please?

databricks-sync -v debug import --profile targetProfile --artifact-dir /home/jenkins/workspace/sandbytes-processing-dbx-workspace-syncup-prod@2/backend/ --backend-file /home/jenkins/workspace/sandbytes-processing-dbx-workspace-syncup-prod@2/backend/backendfile.json -l /home/jenkins/workspace/sandbytes-processing-dbx-workspace-syncup-prod@2/git-repo/ --plan --apply

2022-03-29 12:52:54 [INFO] ===USING LOCAL GIT DIRECTORY: /home/jenkins/workspace/sandbytes-processing-dbx-workspace-syncup-prod@2/git-repo/===
2022-03-29 12:52:54 [INFO] USING HOST: https://adb-xxxxxx.xx.azuredatabricks.net
2022-03-29 12:52:54 [INFO] Setting debug flags on.
2022-03-29 12:52:55 [INFO] Main TF File: {
.....
......
.....
2022-03-29 12:53:00 [INFO] databricks_dbfs_file.databricks_dbfs_files["/dbfs/certs/xxxxcert.zip"]: Refreshing state... [id=/dbfs/certs/xxxxx.zip]
2022-03-29 12:53:00 [INFO] databricks_instance_pool.databricks_instance_pool_0504_215013_boots354_pool_xxx: Refreshing state... [id=0324-045724-blend21-pool-xxxxx]
2022-03-29 12:53:00 [INFO] databricks_global_init_script.databricks_global_init_scripts["87A27C5F0BFF6A49"]: Refreshing state... [id=E6A6BAB2624D1984]
2022-03-29 12:53:00 [INFO] databricks_global_init_script.databricks_global_init_scripts["B991E778CE439611"]: Refreshing state... [id=B0FBF63CD0D3446A]
2022-03-29 12:53:00 [INFO] databricks_global_init_script.databricks_global_init_scripts["CFCAEB86A3CCC17F"]: Refreshing state... [id=750B7EA7A9458674]
2022-03-29 12:53:00 [INFO] databricks_global_init_script.databricks_global_init_scripts["E8E159F5874B5B1F"]: Refreshing state... [id=9E69055E76148828]
2022-03-29 12:53:00 [INFO] databricks_permissions.databricks_instance_pool_0504_213753_gamey331_pool_xxxxxx_permissions: Refreshing state... [id=/instance-pools/0324-045724-wader20-pool-xxxxx]
2022-03-29 12:53:00 [INFO] databricks_permissions.databricks_instance_pool_0504_214408_mufti352_pool_xxxxx_permissions: Refreshing state... [id=/instance-pools/0324-045724-duck19-pool-xxxx]
2022-03-29 12:53:00 [INFO] databricks_permissions.databricks_instance_pool_0504_214902_bless353_pool_xxxx_permissions: Refreshing state... [id=/instance-pools/0324-045724-clear22-pool-xxxxx]
2022-03-29 12:53:01 [INFO] databricks_permissions.databricks_instance_pool_0504_215013_boots354_pool_xxxxx_permissions: Refreshing state... [id=/instance-pools/0324-045724-blend21-pool-xxxx]
2022-03-29 12:53:01 [INFO] databricks_cluster.databricks_cluster_1124_081110_xxxxx: Refreshing state... [id=0324-045728-xxxxxx]
2022-03-29 12:53:01 [INFO] databricks_cluster.databricks_cluster_0514_130401_cxxxx: Refreshing state... [id=0324-045728-xxxxx]
2022-03-29 12:53:02 [INFO] databricks_job.databricks_job_6794856: Refreshing state... [id=504899988646013]
2022-03-29 12:53:02 [INFO] databricks_job.databricks_job_6745600: Refreshing state... [id=630883479491131]
2022-03-29 12:53:02 [INFO] databricks_job.databricks_job_6844849: Refreshing state... [id=337966858036044]
2022-03-29 12:53:02 [INFO] databricks_job.databricks_job_6718960: Refreshing state... [id=1100824357679556]
2022-03-29 12:53:02 [INFO] databricks_job.databricks_job_6745998: Refreshing state... [id=405210453678418]
2022-03-29 12:53:02 [INFO] databricks_job.databricks_job_6844157: Refreshing state... [id=722810208591835]
2022-03-29 12:53:02 [INFO] databricks_permissions.databricks_cluster_1124_081110_xxxxx_permissions: Refreshing state... [id=/clusters/0324-045728-xxxxx]
2022-03-29 12:53:02 [INFO] databricks_permissions.databricks_cluster_0514_130401_cxxxxx_permissions: Refreshing state... [id=/clusters/0324-045728-xxxxxx]
2022-03-29 12:53:02 [INFO] Error: strconv.ParseInt: parsing "504899988646013": value out of range
2022-03-29 12:53:02 [INFO] Error: strconv.ParseInt: parsing "722810208591835": value out of range
2022-03-29 12:53:02 [INFO] Error: strconv.ParseInt: parsing "1100824357679556": value out of range
2022-03-29 12:53:02 [INFO] Error: strconv.ParseInt: parsing "842600655414946": value out of range
2022-03-29 12:53:02 [INFO] Error: strconv.ParseInt: parsing "337966858036044": value out of range
2022-03-29 12:53:02 [INFO] Error: strconv.ParseInt: parsing "630883479491131": value out of range
2022-03-29 12:53:02 [INFO] Error: strconv.ParseInt: parsing "405210453678418": value out of range
2022-03-29 12:53:02 [INFO] Error: strconv.ParseInt: parsing "1073209794217287": value out of range
Traceback (most recent call last):
  File "/usr/local/bin/databricks-sync", line 11, in <module>
    load_entry_point('databricks-sync==1.0.0', 'console_scripts', 'databricks-sync')()
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/databricks_cli/configure/config.py", line 55, in decorator
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/databricks_sync/cmds/config.py", line 178, in modify_user_agent
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/databricks_sync/cmds/config.py", line 161, in decorator
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/databricks_sync/cmds/apply.py", line 53, in import_cli
    te.execute()
  File "/usr/local/lib/python3.7/site-packages/databricks_sync/sdk/sync/import_.py", line 28, in wrapper
    resp = func(self_, stage_path=stage_path, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/databricks_sync/sdk/sync/import_.py", line 55, in wrapper
    resp = func(self_, repo_path=repo_path, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/databricks_sync/sdk/sync/import_.py", line 160, in execute
    state_file_abs_path=state_loc)
  File "/usr/local/lib/python3.7/site-packages/databricks_sync/sdk/terraform.py", line 154, in plan
    return self._cmd(plan_cmd)
  File "/usr/local/lib/python3.7/site-packages/databricks_sync/sdk/terraform.py", line 104, in _cmd
    ret_code, ' '.join(cmds), out=out, err=out)
databricks_sync.sdk.terraform.TerraformCommandError: Command 'terraform plan -lock=false -out /home/jenkins/workspace/sandbytes-processing-dbx-workspace-syncup-prod@2/backend/plan.out -input=false' returned non-zero exit status 1.
[Pipeline] }
[Pipeline] // script
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
ERROR: script returned exit code 1
Finished: FAILURE
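
For what it is worth, every value in those errors overflows a signed 32-bit integer while fitting comfortably in 64 bits, which suggests the job IDs are being parsed with a 32-bit size (or by a 32-bit build) somewhere along the way. A quick check:

>>> max_int32 = 2**31 - 1
>>> max_int32
2147483647
>>> 504899988646013 > max_int32   # the first failing job ID
True
>>> 504899988646013 < 2**63 - 1   # but it fits in int64
True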

Migrate Secret Scopes only for AKV scenarios

Create a filter and an optional field in the yaml for this.

Along the lines of:

"only_akv_scopes"

Scope mode: should we pull the scope and its ACLs, or the scope only?

Currently this requires changes that are pending on the Azure Databricks product to support service principals or Databricks PAT tokens for attaching AKV to a Databricks workspace.
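
A possible shape for the config, following the style of the jobs/cluster filters above (the field name comes from this issue; the surrounding structure is hypothetical):

secret_scope:
  by:
    only_akv_scopes: true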

Test DR mapping for instance pools, jobs, and clusters

  • Create a new test in an integration test
  • Export Pools, Clusters, and Jobs with DR switch
  • Import the objects twice, once with var.PASSIVE_MODE == false and once with var.PASSIVE_MODE == true, then compare the results:
  1. Verify that pools have a variable for min_idle
  2. Verify that jobs have a variable for schedule
  3. Verify that clusters have variables for node and driver types

Issue with files ending with _override.tf.json

Files ending with _override trigger Terraform's special override-file behavior: https://www.terraform.io/docs/language/files/override.html

Make sure that files do not end with _override.tf.json

2021-05-25 11:26:59 [INFO] command: terraform validate
2021-05-25 11:27:05 [INFO] ╷
2021-05-25 11:27:05 [INFO] │ Error: Missing resource to override
2021-05-25 11:27:05 [INFO] │
2021-05-25 11:27:05 [INFO] │ on 1_override.tf.json line 4, in resource.databricks_notebook:
2021-05-25 11:27:05 [INFO] │ 4: "_override": {
2021-05-25 11:27:05 [INFO] │
2021-05-25 11:27:05 [INFO] │ There is no databricks_notebook resource named
2021-05-25 11:27:05 [INFO] │ "_override".
2021-05-25 11:27:05 [INFO] │ An override file can only override a resource block defined in a primary
2021-05-25 11:27:05 [INFO] │ configuration file.
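
A minimal sketch of the kind of guard the export writer could apply (function name hypothetical): mangle any generated filename that would otherwise end in _override.tf.json, so Terraform does not treat it as an override file.

# Hypothetical helper for the export writer. Terraform gives special
# meaning to files ending in _override.tf.json, so break the suffix.
def safe_tf_json_name(name: str) -> str:
    if name.endswith("_override.tf.json"):
        return name[: -len("_override.tf.json")] + "_override_.tf.json"
    return name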

Add important validations before running job

Add a validation step to check:

  • Token is valid
  • DBFS path exists
  • Notebook path exists
  • For any other resource, list the objects and log.warn if the list is empty
  • Check whether the user is an admin: if identity is listed then log.error, else log.warn

Return fail if there are any log.error entries, otherwise pass (a rough sketch follows below).
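
A rough sketch of that flow (the individual checks are stubs; only the warn/error accounting is the point):

import logging

log = logging.getLogger("databricks-sync")

def run_validations(checks):
    # checks: iterable of (name, passed, severity) tuples produced by the
    # token/DBFS/notebook/identity checks described above.
    had_error = False
    for name, passed, severity in checks:
        if passed:
            continue
        if severity == "error":
            log.error("validation failed: %s", name)
            had_error = True
        else:
            log.warning("validation warning: %s", name)
    return not had_error  # fail if any log.error was emitted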

Add process logging

  • reading config file --Info
  • checking out from git --Info
  • testing connection --Info
  • Stat object fetch --Info
  • fetching objects --Debug
  • Stat object process --Info
  • process objects --Debug
  • Stat object write --Info
  • writing the file --Debug
  • commit to git --Info
  • print git diff --Info

Error exporting identity

Hello,

I receive the following error when exporting identity:

Traceback (most recent call last):
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/sync/export.py", line 74, in export
    exp.run()
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/pipeline.py", line 595, in run
    self.__generate_all()
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/pipeline.py", line 584, in __generate_all
    loop.run_until_complete(groups)
  File "/usr/local/Cellar/[email protected]/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/pipeline.py", line 87, in trigger
    async for item in self.generate():
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/pipeline.py", line 92, in generate
    async for item in self._generate():
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/generators/identity.py", line 477, in _generate
    service_principals_data[id_] = self.get_service_principal_dict(service_principal)
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/generators/identity.py", line 412, in get_service_principal_dict
    ServicePrincipalSchema.DISPLAY_NAME: sp["displayName"],
KeyError: 'displayName'

If I comment out line 412 in identity.py, the export succeeds.
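
Rather than deleting the line, a tolerant lookup would keep the export running when a service principal has no displayName (a sketch against the line quoted in the traceback; falling back to applicationId is an assumption, not the repo's fix):

# sp.get() avoids the KeyError when displayName is absent from the
# SCIM service principal payload; the fallback value is an assumption.
ServicePrincipalSchema.DISPLAY_NAME: sp.get("displayName", sp.get("applicationId", "")),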
