cloudera-labs / cloudera-deploy

A general purpose framework for automating Cloudera Products

License: Apache License 2.0

HCL 80.52% Jinja 19.48%
ansible cdp cdp-public-cloud cdp-private-cloud

cloudera-deploy's People

Contributors

anisf, asdaraujo, bbreak, chaffelson, clevesque, cmperro, curtishoward, fletchjeff, jimright, rch, rjendoubi, stevemar, tmgstevens, vsellappa, willdyson, wmudge

cloudera-deploy's Issues

[Feature Req] Tool/Script to generate "cluster.yml" configs from existing exported CDP cluster template json files

Am I missing a simpler strategy for preparing the new "cluster.yml" file required by this playbook (to be configured in the definition_path) from an exported CDP (7.1.x / private-cloud) cluster template?
Would be great to get input from the Cloudera folks :)
Alternatively, or in addition, it would be very helpful if a much more advanced "cluster.yml" example were added to the repo (e.g. to deploy a 3-master-node HA cluster, as was nicely done in the former HDP repo: https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/group_vars/example-hdp-ha-3-masters-with-ranger-atlas )

If I learn here in the community that there's indeed no such existing script, I'm happy to write and contribute something myself.

Minimal features:

  • create the mapping of the contained services to the host groups
  • create the mapping of all found "configs" elements to key/value pairs in the "cluster.yml" service's "dict" element
    • (later) nice to have: an option to skip any config values that are/were just defaults from the beginning
  • handle the "refName" values in the source template JSON
  • many other things that are required to make it usable/work?

Script Input / Output examples

  • Input file, just a small extract (from an exported cluster template)
    • I can give more info later on how to export this, for people not familiar with it.
{
  "cdhVersion" : "7.1.4",
  "displayName" : "Basic Cluster",
  "cmVersion" : "7.1.4",
  "repositories" : [ ... ],
  "products" : [ {
    "version" : "7.1.4-1.cdh7.1.4.p0.6300266",
    "product" : "CDH"
  } ],
  "services" : [ {
    "refName" : "zookeeper",
    "serviceType" : "ZOOKEEPER",
    "serviceConfigs" : [ {
      "name" : "zookeeper_datadir_autocreate",
      "value" : "true"
    } ],
    "roleConfigGroups" : [ {
      "refName" : "zookeeper-SERVER-BASE",
      "roleType" : "SERVER",
      "configs" : [ {
        "name" : "zk_server_log_dir",
        "value" : "/var/log/zookeeper"
      }, {
        "name" : "dataDir",
        "variable" : "zookeeper-SERVER-BASE-dataDir"
      }, {
        "name" : "dataLogDir",
        "variable" : "zookeeper-SERVER-BASE-dataLogDir"
      } ],
      "base" : true
    } ]
  }, 
...

  } ],
  "hostTemplates" : [ {
    "refName" : "HostTemplate-0-from-eval-cdp-public[1-3].internal.cloudapp.net",
    "cardinality" : 3,
    "roleConfigGroupsRefNames" : [ "hdfs-DATANODE-BASE", "spark_on_yarn-GATEWAY-BASE", "yarn-NODEMANAGER-BASE" ]
  }, {
    "refName" : "HostTemplate-1-from-eval-cdp-public0.internal.cloudapp.net",
    "cardinality" : 1,
    "roleConfigGroupsRefNames" : [ "hdfs-NAMENODE-BASE", "hdfs-SECONDARYNAMENODE-BASE", "spark_on_yarn-GATEWAY-BASE", "spark_on_yarn-SPARK_YARN_HISTORY_SERVER-BASE", "yarn-JOBHISTORY-BASE", "yarn-RESOURCEMANAGER-BASE", "zookeeper-SERVER-BASE" ]
  } ],
...

Output file, following the format of cluster.yml, for ex: roles/cloudera_deploy/defaults/basic_cluster.yml

clusters:
  - name: Basic Cluster
    services: [HDFS, YARN, ZOOKEEPER]
    repositories:
      - https://archive.cloudera.com/cdh7/7.1.4.0/parcels/
    configs:
      ZOOKEEPER:
        SERVICEWIDE:
          zookeeper_datadir_autocreate: true
          zk_server_log_dir: "/var/log/zookeeper"
     
      HDFS:
        DATANODE:
          dfs_data_dir_list: /dfs/dn
        NAMENODE:
          dfs_name_dir_list: /dfs/nn
...
    host_templates:
      Master1:
        HDFS: [NAMENODE, SECONDARYNAMENODE]
        YARN: [RESOURCEMANAGER, JOBHISTORY]
        ZOOKEEPER: [SERVER]
      Workers:
        HDFS: [DATANODE]
        YARN: [NODEMANAGER]
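As a starting point for discussion, here is a minimal sketch of such a converter. All function and key names are my own guesses at the cluster.yml layout shown above; resolving "refName" variables and skipping default values are deliberately left out.

```python
def template_to_cluster(template: dict) -> dict:
    """Convert an exported CM cluster template (parsed JSON) into a
    cluster.yml-style structure: services, service-wide configs,
    role-group configs, and host templates."""
    cluster = {
        "name": template.get("displayName", "Cluster"),
        "services": [],
        "configs": {},
        "host_templates": {},
    }
    # refName -> (SERVICE_TYPE, ROLE_TYPE), used to resolve host templates
    role_group_index = {}

    for service in template.get("services", []):
        stype = service["serviceType"]
        cluster["services"].append(stype)
        configs = cluster["configs"].setdefault(stype, {})
        # service-wide configs
        for cfg in service.get("serviceConfigs", []):
            if "value" in cfg:
                configs.setdefault("SERVICEWIDE", {})[cfg["name"]] = cfg["value"]
        # per-role-group configs; entries carrying only a "variable" are
        # placeholders that would need separate refName resolution, so skip them
        for group in service.get("roleConfigGroups", []):
            role_group_index[group["refName"]] = (stype, group["roleType"])
            for cfg in group.get("configs", []):
                if "value" in cfg:
                    configs.setdefault(group["roleType"], {})[cfg["name"]] = cfg["value"]

    for i, ht in enumerate(template.get("hostTemplates", [])):
        mapping = {}
        for ref in ht.get("roleConfigGroupsRefNames", []):
            stype, role = role_group_index.get(ref, (None, None))
            if stype:
                mapping.setdefault(stype, []).append(role)
        name = ht.get("refName") or f"host_template_{i}"
        cluster["host_templates"][name] = mapping

    return {"clusters": [cluster]}
```

Feeding it the result of `json.load()` on the exported template and dumping with `yaml.safe_dump` would produce the skeleton above; friendly host-group names (Master1, Workers) and the "variable" placeholders would still need manual attention.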

Log Levels :

Feature Request

It would be nice to have different log levels, and for the logging to be more verbose by default versus what's output when the Ansible scripts actually run.
This would make it easier to check whether a specific step is executing or has stopped, which is especially important during creation of VMs that take longer to finish.

Running playbook in DEBUG fails on localhost ssh connection

I tried to run the playbook in DEBUG mode and found that a local ssh connection failed:
[email protected]: Permission denied (publickey,password)

It turned out that in cloudera-deploy, ~/.ssh is mounted under /home/runner/.ssh and HOME is set to /home/runner, but in debug mode some bits still depend on /root/.ssh/.
Copying the content of /home/runner/.ssh/ to /root/.ssh/ solved these issues.

TASK [cloudera.cluster.repometa : Download parcel manifest information url={{ repository | regex_replace('/?$','') + '/manifest.json' }}, status_code=200, body_format=json, retu
rn_content=True, url_username={{ parcel_repo_username | default(omit) }}, url_password={{ parcel_repo_password | default(omit) }}] ***
task path: /opt/cldr-runner/collections/ansible_collections/cloudera/cluster/roles/deployment/repometa/tasks/parcels.yml:17
Monday 31 May 2021  07:32:09 +0000 (0:00:00.111)       0:00:24.509 ************
 11741 1622446329.37654: sending task start callback
 11741 1622446329.37659: entering _queue_task() for localhost/uri
...
<127.0.0.1> SSH: EXEC ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbas
ed,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o ControlPath=/home/runner/.ansible/cp/21f0e6a9ae 127.0.0.1 '/bin/sh -c '"'"'echo ~root && sleep 0'"'"''
 12412 1622446329.49261: stderr chunk (state=2):
>>>OpenSSH_8.0p1, OpenSSL 1.1.1g FIPS  21 Apr 2020
...
 12412 1622446345.11344: stderr chunk (state=3):
>>>debug3: authmethod_lookup publickey
debug3: remaining preferred: ,gssapi-keyex,hostbased,publickey
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Trying private key: /root/.ssh/id_rsa
debug3: no such identity: /root/.ssh/id_rsa: No such file or directory
...
[email protected]: Permission denied (publickey,password).

"unknown flag: --mount" in quickstart.sh

As the pre-req to run cloudera-deploy, I have installed the default docker version on CentOS 7.9 as follows:
yum install docker

That's the version that was installed:

# docker --version
Docker version 1.13.1, build 7d71120/1.13.1

Now, while trying to run quickstart.sh - I got the following error:

./quickstart.sh
Checking if Docker is running...
Docker OK
Trying to pull repository ghcr.io/cloudera-labs/cldr-runner ...
full-latest: Pulling from ghcr.io/cloudera-labs/cldr-runner
Digest: sha256:01504a335c7fe1c29ba695ca996b464be28925a545f7f2b5bb1c1624e145e208
Status: Image is up to date for ghcr.io/cloudera-labs/cldr-runner:full-latest
Ensuring default credential paths are available in calling using profile for mounting to execution environment
Ensure Default profile is present
Custom Cloudera Collection path not found
Mounting /home/ansible to container as Project Directory /runner/project
Creating Container cloudera-deploy from image ghcr.io/cloudera-labs/cldr-runner:full-latest
Checking OS
SSH authentication for container taken from /tmp/ssh-bAkDNAJcUgfV/agent.9389
Creating new execution container named 'cloudera-deploy'
unknown flag: --mount
See 'docker run --help'.

The above Docker version does not recognise the "--mount" flag.
It looks like "--mount" was introduced only in Docker 17.05:
https://docs.docker.com/engine/release-notes/17.05/#client
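A possible pre-flight check quickstart.sh could perform before using --mount might look like the sketch below (the 17.05 threshold comes from the release notes above; `sort -V` is GNU coreutils, and the comparison helper is my own):

```shell
#!/bin/sh
# Hypothetical version guard for quickstart.sh: `--mount` on `docker run`
# first appeared in Docker 17.05, so refuse anything older.
version_ge() {
    # true if $1 >= $2 in version-sort order
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Example: the stock CentOS 7 package (1.13.1) fails the check.
if version_ge "1.13.1" "17.05"; then
    echo "1.13.1 supports --mount"
else
    echo "Docker 1.13.1 is too old for --mount; need >= 17.05"
fi
```

In the real script the version string would come from `docker version --format '{{.Client.Version}}'`, and the else branch would exit with an error instead of echoing.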

SSH_AUTH_SOCK not set on Windows

When running the quickstart.sh script on an Ubuntu WSL session on Windows, the SSH_AUTH_SOCK variable isn't set as the ssh-agent isn't started by default.

$ ./quickstart.sh
Checking if Docker is running...
Docker OK
full-latest: Pulling from cloudera-labs/cldr-runner
Digest: sha256:15442500076f42918fd82f5f94cf0aaf4564aa235bd66b47edb2ec052e099e59
Status: Image is up to date for ghcr.io/cloudera-labs/cldr-runner:full-latest
ghcr.io/cloudera-labs/cldr-runner:full-latest
Ensuring default credential paths are available in calling using profile for mounting to execution environment
Ensure Default profile is present
Custom Cloudera Collection path not found
Mounting /mnt/c/Users/jeff/tmp to container as Project Directory /runner/project
Creating Container cloudera-deploy from image ghcr.io/cloudera-labs/cldr-runner:full-latest
Checking OS
SSH_AUTH_SOCK is empty or not set, unable to proceed. Exiting

One possible fix would be to add a line to the script that checks for and starts the ssh-agent. But running ssh-agent directly doesn't set the SSH_AUTH_SOCK variable, as the command needs to be wrapped in an eval.

Would it be possible to add something like the following to the quickstart.sh script?

if pgrep -x "ssh-agent" >/dev/null
then
    echo "ssh-agent is running"
else
    echo "ssh-agent stopped"
    eval `ssh-agent -s` 
fi

I tried adding it to the start of the quickstart.sh script and it worked fine.

Reduce User Interaction :

The Ansible script creates and connects to newly created EC2 instances, and while connecting to each VM it asks the SSH trust question.
We should have a flag to automate this. A standard -y to continue uninterrupted would allow creation and deployment without constant user interaction.
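The trust question is SSH host-key checking; rather than a new flag, one possible approach is the standard Ansible setting for disabling it (this is stock Ansible configuration, not something the repo currently exposes):

```
# ansible.cfg
[defaults]
host_key_checking = False
```

or, equivalently, exporting ANSIBLE_HOST_KEY_CHECKING=False before running the playbook. Whether to accept the security trade-off of skipping host-key verification is a separate question.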

Broken Link

The document link in the cloudera deploy instruction page, in the location:

If you do not have CDP user click here , produces a broken link : This is the link used

CDP private teardown - asks for credentials

I was able to deploy CDP private without any credentials; only the CDP license was used.
I am trying to tear down our deployed cluster via tags -t teardown,all. However, it fails with this missing-credentials error.

TASK [cloudera.exe.runtime : Refresh Environment Info with Descendants] ****************************************************************************************************
task path: /opt/cldr-runner/collections/ansible_collections/cloudera/exe/roles/runtime/tasks/initialize_teardown.yml:17
Friday 11 November 2022  13:39:06 +0000 (0:00:00.069)       0:00:08.557 *******
fatal: [localhost]: FAILED! => {"changed": false, "error": "{'base_error': NoCredentialsError('Unable to locate CDP credentials: No credentials found anywhere in chain. The shared credentials file should be stored at /home/runner/.cdp/credentials.'), 'ext_traceback': ['  File \"/root/.ansible/tmp/ansible-tmp-1668173946.776787-24441-170028905131803/AnsiballZ_env_info.py\", line 102, in <module>\\n    _ansiballz_main()\\n', '  File \"/root/.ansible/tmp/ansible-tmp-1668173946.776787-24441-170028905131803/AnsiballZ_env_info.py\", line 94, in _ansiballz_main\\n    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\\n', '  File \"/root/.ansible/tmp/ansible-tmp-1668173946.776787-24441-170028905131803/AnsiballZ_env_info.py\", line 40, in invoke_module\\n    runpy.run_module(mod_name=\\'ansible_collections.cloudera.cloud.plugins.modules.env_info\\', init_globals=None, run_name=\\'__main__\\', alter_sys=True)\\n', '  File \"/usr/lib64/python3.8/runpy.py\", line 207, in run_module\\n    return _run_module_code(code, init_globals, run_name, mod_spec)\\n', '  File \"/usr/lib64/python3.8/runpy.py\", line 97, in _run_module_code\\n    _run_code(code, mod_globals, init_globals,\\n', '  File \"/usr/lib64/python3.8/runpy.py\", line 87, in _run_code\\n    exec(code, run_globals)\\n', '  File \"/tmp/ansible_cloudera.cloud.env_info_payload_51viniow/ansible_cloudera.cloud.env_info_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env_info.py\", line 471, in <module>\\n', '  File \"/tmp/ansible_cloudera.cloud.env_info_payload_51viniow/ansible_cloudera.cloud.env_info_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env_info.py\", line 461, in main\\n', '  File \"/tmp/ansible_cloudera.cloud.env_info_payload_51viniow/ansible_cloudera.cloud.env_info_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env_info.py\", line 424, in __init__\\n', '  File 
\"/tmp/ansible_cloudera.cloud.env_info_payload_51viniow/ansible_cloudera.cloud.env_info_payload.zip/ansible_collections/cloudera/cloud/plugins/module_utils/cdp_common.py\", line 42, in _impl\\n    result = f(self, *args, **kwargs)\\n', '  File \"/tmp/ansible_cloudera.cloud.env_info_payload_51viniow/ansible_cloudera.cloud.env_info_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env_info.py\", line 429, in process\\n', '  File \"/usr/local/lib/python3.8/site-packages/cdpy/environments.py\", line 55, in describe_environment\\n    resp = self.sdk.call(\\n', '  File \"/usr/local/lib/python3.8/site-packages/cdpy/common.py\", line 594, in call\\n    parsed_err = CdpError(err)\\n'], 'error_code': None, 'violations': None, 'message': None, 'status_code': None, 'rc': None, 'service': None, 'operation': None, 'request_id': None}", "msg": "None", "violations": null}
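The error shows the teardown path unconditionally querying the CDP control plane, which needs credentials even though the deploy did not. For reference, the credentials file the message points to (~/.cdp/credentials, as used by the CDP CLI/SDK) looks roughly like this, with placeholder values:

```
[default]
cdp_access_key_id = <your-access-key-id>
cdp_private_key = <your-private-key>
```

For a purely private-cloud deployment, though, the real fix would presumably be for the teardown tags to skip the CDP environment refresh entirely.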

quickstart.sh: unintentional command execution?

quickstart.sh includes the following checks:

[...]

if [ -n "${CLDR_PYTHON_PATH}" ]; then
  echo "Path to custom Python sourcecode supplied as ${CLDR_PYTHON_PATH}, setting as System PYTHONPATH"
  PYTHONPATH="${CLDR_PYTHON_PATH}"
else
  echo "'CLDR_PYTHON_PATH' is not set, skipping setup of PYTHONPATH in execution container"
fi

echo "Checking if ssh-agent is running..."
if pgrep -x "ssh-agent" >/dev/null
then
    echo "ssh-agent OK"
else
    echo "ssh-agent is stopped, please start it by running: eval `ssh-agent -s` "
    #eval `ssh-agent -s`
fi

echo "Checking OS"
if [ ! -f "/run/host-services/ssh-auth.sock" ];
then
   if [ -n "${SSH_AUTH_SOCK}" ];
   then
        SSH_AUTH_SOCK=${SSH_AUTH_SOCK}
   else
    echo "ERROR: SSH_AUTH_SOCK is empty or not set, unable to proceed. Exiting"
    exit 1
   fi
else
    SSH_AUTH_SOCK=${SSH_AUTH_SOCK}
fi

[...]

The first check looks for an env var being set, and explicitly skips some setup if that env var is not found, i.e. that setting is explicitly optional. Fine.

The third check looks for an env var being set, and explicitly exits with an error if it's not present. Also fine.

The second check is to see if ssh-agent is running. If it is not running, the code looks like it intends to print a helpful error message for the user, to empower the user to manually do something before coming back to try this script again. There are a couple of issues here:

  1. If ssh-agent is required, shouldn't the else block here explicitly exit 1, like the next check for $SSH_AUTH_SOCK does? (Otherwise, the intention would be clearer if it stated explicitly that it's carrying on regardless, like the first check does, but I don't think that's the idea here.)
  2. (main issue) Because the backticks aren't escaped, printing the error message actually runs the ssh-agent -s command. I'm pretty sure that's not what's intended, since in that case the actual message to the user doesn't make sense. Also, judging by the fact that the next line is a commented-out version of the same command, it seems running it automatically was considered but decided against. Again, this is where I think an explicit exit 1 should go instead.

If my read is right, happy to submit a PR?
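For illustration, an escaped version of the check (a sketch; whether to exit here is the maintainers' call, so it is only marked in a comment):

```shell
#!/bin/sh
# Sketch of the corrected check: with the backticks escaped, the shell
# prints the command as text instead of executing ssh-agent.
echo "Checking if ssh-agent is running..."
if pgrep -x "ssh-agent" >/dev/null
then
    echo "ssh-agent OK"
else
    echo "ssh-agent is stopped, please start it by running: eval \`ssh-agent -s\`"
    # an explicit `exit 1` here would match the SSH_AUTH_SOCK check below
fi
```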

Centos7-init fails in various ways in some circumstances

pip3 install ansible

  • Fails when Ansible is already installed on the system as a package
  • Should also be pinned to >=2.10.0,<=2.11

tee -a ansible.cfg << EOF

  • if the script is run multiple times, this will be continuously concatenated to the file

inventory=inventory
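The repeated-concatenation problem can be avoided by guarding the append, for example (a hypothetical rewrite; the guard line is illustrative):

```shell
#!/bin/sh
# Idempotent alternative to a bare `tee -a`: only append the snippet if it
# is not already present, so re-running the init script does not duplicate it.
CFG="${CFG:-ansible.cfg}"
if ! grep -qx 'inventory=inventory' "$CFG" 2>/dev/null; then
    tee -a "$CFG" <<'EOF'
inventory=inventory
EOF
fi
```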

Facts distribution does not work with inline vaulted variables

Error message:
The full traceback is:

Traceback (most recent call last):
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/executor/task_executor.py", line 585, in _execute
    self._task.post_validate(templar=templar)
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/playbook/task.py", line 307, in post_validate
    super(Task, self).post_validate(templar)
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/playbook/base.py", line 431, in post_validate
    value = templar.template(getattr(self, name))
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/template/__init__.py", line 840, in template
    disable_lookups=disable_lookups,
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/template/__init__.py", line 795, in template
    disable_lookups=disable_lookups,
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/template/__init__.py", line 1057, in do_template
    res = j2_concat(rf)
  File "<template>", line 14, in root
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/template/__init__.py", line 255, in wrapper
    ret = func(*args, **kwargs)
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/plugins/filter/core.py", line 209, in from_yaml
    return yaml.safe_load(data)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/__init__.py", line 162, in safe_load
    return load(stream, SafeLoader)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 51, in get_single_data
    return self.construct_document(node)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 60, in construct_document
    for dummy in generator:
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 413, in construct_yaml_map
    value = self.construct_mapping(node)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 218, in construct_mapping
    return super().construct_mapping(node, deep=deep)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 143, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 100, in construct_object
    data = constructor(self, node)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 429, in construct_undefined
    node.start_mark)
yaml.constructor.ConstructorError: could not determine a constructor for the tag '!vault'
  in "<unicode string>", line 395, column 26:
    krb5_kdc_admin_password: !vault |

Deployment fails with dynamic inventory and absolute definition paths

There appears to be a bug where, if you are using dynamic inventory and an absolute path to your definition, it fails to use the default inventory path.
e.g. ansible-playbook /opt/cloudera-deploy/main.yml -e "definition_path=/opt/cloudera-deploy/examples/c5secure" -t infra,full_cluster

Workaround is to pass in the correct inventory path with -i /runner/inventory
e.g.
ansible-playbook /opt/cloudera-deploy/main.yml -e "definition_path=/opt/cloudera-deploy/examples/c5secure" -i /runner/inventory -t infra,full_cluster

Improve definition file loading

Context :

I began to have a huge, unreadable definition file, so I wanted to use Ansible variables.
I discovered that the definition file is loaded as a simple file and parsed as-is as YAML. That means variables won't be interpreted (i.e. host_templates: "{{ cdp_host_templates }}" will be interpreted as the string "{{ cdp_host_templates }}").

My definition file looks like this:

clusters:

- name: "{{ cdp_cluster_name }}"
  type: base
  services: "{{ cdp_services }}"
  databases: "{{ cdp_databases }}"
  configs: "{{ fresh_install_configs }}"
  host_templates: "{{ cdp_host_templates }}"
...

Issue :

Using variables in the definition file raises an error due to a check done on the host_templates here.

The error message is: Unable to host template {{ host_template }} in the cluster definition

This check tries to find a host template named 'xxx' in host_templates: "{{ host_templates }}" and fails because the value is interpreted as a string...
The task responsible for this is here.

Solution :

Use include_vars instead of lookup(file...)
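A hypothetical version of the suggested change (include_vars is a real Ansible builtin; the file path and variable name here are illustrative):

```yaml
# Load the definition with include_vars so Jinja2 expressions inside it are
# templated when the variables are used, instead of staying literal strings.
- name: Load cluster definition
  ansible.builtin.include_vars:
    file: "{{ definition_path }}/definition.yml"
    name: cluster_definition
```

Unlike lookup('file', ...) piped through from_yaml, variables loaded this way stay lazy, so references like "{{ cdp_host_templates }}" resolve at use time.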

ec2 instance types check fails if aws cli not set to json

The call that checks which EC2 instance types are available relies on the AWS CLI, as there isn't an Ansible collection call for it. The task that parses the output assumes it will be JSON, and it fails if the user has set a non-JSON output format in their AWS profile.
The failing task is cloudera.exe.infrastructure: Check required AWS EC2 Instance Types.

Cannot change configuration parameter on deployed cluster

After deployment of the CDP private cluster, I changed definition.yml, for example HDFS failed volumes tolerated, like:

configs:
      HDFS:
        DATANODE:
          dfs_datanode_failed_volumes_tolerated: 3

when running the playbook again, no parameters are changed.

Can't detect the required Python library cryptography (>= 1.2.3)

While testing the deployment in the internal (cloudcat) and external environment - ran into the following issue:

TASK [cloudera.cluster.ca_server : Generate root private key] ****************************************************************************************
Friday 28 January 2022 00:32:45 +0000 (0:00:00.854) 0:08:16.774 ********
fatal: [cla-tt-2a-mas1.clatest.telstraglobal.net]: FAILED! => {"changed": false, "msg": "Can't detect the required Python library cryptography (>= 1.2.3)"}

Check inside of the Docker:

cldr full-v1.5.3 #> pip show cryptography
Name: cryptography
Version: 3.3.2
Summary: cryptography is a package which provides cryptographic recipes and primitives to Python developers.
Home-page: https://github.com/pyca/cryptography
Author: The cryptography developers
Author-email: [email protected]
License: BSD or Apache License, Version 2.0
Location: /usr/local/lib64/python3.8/site-packages
Requires: cffi, six
Required-by: adal, ansible-base, azure-cli-core, azure-identity, azure-keyvault, azure-storage, msal, openstacksdk, paramiko, pyOpenSSL, pypsrp, pyspnego, requests-credssp, requests-ntlm

Password Requirements : CDP

CDP password requirements are not checked / enforced upfront.

Quickstart depends on the CDP CLI for creating a CDP environment, which requires a specific password standard. This is not enforced or checked upfront, so deployment fails later if a non-conforming password is set.

The CDP password requirements should be checked upfront, before cluster creation.

Trying to install CDP private cloud, but still need AWS credentials

Tried to install CDP private cloud. I updated the inventory file to include all my hosts, but during the deployment it still tries to connect to AWS and complains it does not have the appropriate credentials:

TASK [cloudera_deploy : Get AWS Account Info] ****************************************************************************************************************
Thursday 20 May 2021 22:45:31 +0000 (0:00:00.036) 0:00:38.058 **********
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: botocore.exceptions.NoCredentialsError: Unable to locate credentials
fatal: [localhost]: FAILED! => {"boto3_version": "1.17.66", "botocore_version": "1.20.66", "changed": false, "msg": "Failed to retrieve caller identity: Unable to locate credentials"}

How do I tell it that this is a private-cloud installation that has nothing to do with AWS?

EZMode Documentation :

We need better, more prescriptive steps that can be copied and pasted without having to read extensively.

Broken link

The link to the CDP CLI in the README page is broken. This is the text:
"Visit the CDP CLI User Guide for further details regarding credential management." The link to the user guide is broken.

SSH_AUTH_SOCK across OS :

Make the SSH_AUTH_SOCK implementation OS-agnostic.

The current implementation of SSH_AUTH_SOCK is OS-specific, specifically OSX-specific.
The line below hard-codes a target path that works only on OSX:

--mount type=bind,src=$SSH_AUTH_SOCK,target=/run/host-services/ssh-auth.sock \

Change this to make it work for all OSes, specifically Linux.
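One possible shape for the fix, sketched below under the assumption that on macOS Docker Desktop exposes the host agent at the magic /run/host-services/ssh-auth.sock path, while on Linux the host's own $SSH_AUTH_SOCK socket can be bind-mounted directly (the fallback path is a placeholder):

```shell
#!/bin/sh
# Hypothetical OS-aware choice of bind-mount source; the container-side
# target and SSH_AUTH_SOCK value stay the same on every host OS.
case "$(uname -s)" in
    Darwin) MOUNT_SRC="/run/host-services/ssh-auth.sock" ;;
    *)      MOUNT_SRC="${SSH_AUTH_SOCK:-/tmp/ssh-agent.sock}" ;;
esac
MOUNT_ARG="type=bind,src=${MOUNT_SRC},target=/run/host-services/ssh-auth.sock"
echo "$MOUNT_ARG"
# the real script would then pass:
#   docker run --mount "$MOUNT_ARG" -e SSH_AUTH_SOCK=/run/host-services/ssh-auth.sock ...
```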

Cannot run ansible-navigator using Python3.8 on osx

Tested on macOS:

ansible-navigator using a Python 3.8 env is unable to parse ansible-navigator.yml files such as those in our public-cloud/aws/ examples.

When used with a Python 3.11 env, it was able to use the ansible-navigator.yml settings file.

I did not test any other versions of Python.

How to set proxy in the definition.yml ?

Hello, I am deploying the CDP private Basic Cluster via the definition.yml file.
I have managed to add Kerberos by adding the following code as a parameter in the basic cluster:

 security:
        kerberos: true

I am looking for a way of setting proxy parameters in definition.yml.
I think it should look like this:

configs:
    parcel_proxy_port: 1234
    parcel_proxy_server: my_beautiful_proxy.com

But in the configs section of the cluster it expects a service name; I need to somehow specify that it's CM, I suppose...

I have tried to add it under mgmt as well; it does not recognize those options.

keytool error: java.lang.Exception: Public keys in reply and keystore don't match

Trying to deploy CDP private cluster with kerberos, ranger and autotls.

playbook execution command:

ansible-playbook /runner/project/cloudera-deploy/main.yml -e "definition_path=/runner/project/cloudera-deploy/examples/sandbox" -e "profile=/home/runner/.config/cloudera-deploy/profiles/default" -t default_cluster,kerberos,tls  -i "/runner/project/cloudera-deploy/examples/sandbox/inventory_static.ini" --flush-cache

After execution, playbook fails on the task:

TASK [cloudera.cluster.tls_install_certs : Install signed certificate reply into keystore] ***
task path: /opt/cldr-runner/collections/ansible_collections/cloudera/cluster/roles/security/tls_install_certs/tasks/main.yml:126

with error below (on each node)

fatal: [node1.domain.com]: FAILED! => {"changed": false, "cmd": "/usr/bin/keytool -importcert -alias \"node1.domain.com\" -file \"/opt/cloudera/security/pki/node1.domain.com.pem\" -keystore \"/opt/cloudera/security/pki/node1.domain.com.jks\" -storepass \"changeme\" -trustcacerts -noprompt\n", "delta": "0:00:00.247693", "end": "2023-01-09 13:27:30.366003", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2023-01-09 13:27:30.118310", "stderr": "", "stderr_lines": [], "stdout": "keytool error: java.lang.Exception: Public keys in reply and keystore don't match", "stdout_lines": ["keytool error: java.lang.Exception: Public keys in reply and keystore don't match"]}

Any idea why this is happening?

I have tried to import certs manually via

/usr/bin/keytool -importcert -alias node1.domain.com -file /opt/cloudera/security/pki/node1.domain.com.pem -keystore /opt/cloudera/security/pki/node1.domain.com.jks -trustcacerts -noprompt

And the cert was added successfully...

Unqualified properties causing execution to fail

It seems that commit 0526f52 introduced some unqualified variables that are causing the execution to fail with the following error:

TASK [cloudera_deploy : Check Supplied terraform_base_dir variable] ************
task path: /runner/project/cloudera-deploy/roles/cloudera_deploy/tasks/init.yml:232

fatal: [localhost]: FAILED! => {
    "msg": "The conditional check 'infra_deployment_engine == 'terraform'' failed. The error was: error while evaluating conditional (infra_deployment_engine == 'terraform'): 'infra_deployment_engine' is undefined\n\nThe error appears to be in '/runner/project/cloudera-deploy/roles/cloudera_deploy/tasks/init.yml': line 232, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Check Supplied terraform_base_dir variable\n  ^ here\n"
}

Unable to create CDP environment

Hi,
I am getting the following error while executing /opt/cldr-runner/collections/ansible_collections/cloudera/cloud/plugins/modules/env.py
from /opt/cldr-runner/collections/ansible_collections/cloudera/exe/roles/platform/tasks/setup_aws_env.yml

below is the error message
*
Monday 25 October 2021 17:08:40 +0000 (0:00:02.728) 0:01:33.504 ********
ok: [localhost] => {
"msg": {
"changed": false,
"exception": "Traceback (most recent call last):\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 102, in \n _ansiballz_main()\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 94, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 40, in invoke_module\n runpy.run_module(mod_name='ansible_collections.cloudera.cloud.plugins.modules.env', init_globals=None, run_name='main', alter_sys=True)\n File "/usr/lib64/python3.8/runpy.py", line 207, in run_module\n return _run_module_code(code, init_globals, run_name, mod_spec)\n File "/usr/lib64/python3.8/runpy.py", line 97, in _run_module_code\n _run_code(code, mod_globals, init_globals,\n File "/usr/lib64/python3.8/runpy.py", line 87, in _run_code\n exec(code, run_globals)\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 1055, in \n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 1045, in main\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 662, in init\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/module_utils/cdp_common.py", line 42, in _impl\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 687, in process\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", 
line 926, in _reconcile_existing_state\nKeyError: 'logStorage'\n",
"failed": true,
"module_stderr": "Traceback (most recent call last):\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 102, in \n _ansiballz_main()\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 94, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 40, in invoke_module\n runpy.run_module(mod_name='ansible_collections.cloudera.cloud.plugins.modules.env', init_globals=None, run_name='main', alter_sys=True)\n File "/usr/lib64/python3.8/runpy.py", line 207, in run_module\n return _run_module_code(code, init_globals, run_name, mod_spec)\n File "/usr/lib64/python3.8/runpy.py", line 97, in _run_module_code\n _run_code(code, mod_globals, init_globals,\n File "/usr/lib64/python3.8/runpy.py", line 87, in _run_code\n exec(code, run_globals)\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 1055, in \n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 1045, in main\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 662, in init\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/module_utils/cdp_common.py", line 42, in _impl\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 687, in process\n File 
"/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 926, in _reconcile_existing_state\nKeyError: 'logStorage'\n",
"module_stdout": "{'environmentName': 'arcp-aw-env', 'crn': 'crn:cdp:environments:us-west-1:a0ec84c6-fee6-4e9c-acdc-68e1f49a5184:environment:714b0df3-e459-40ba-b722-837018456722', 'status': 'CREATE_FAILED', 'region': 'us-east-1', 'cloudPlatform': 'AWS', 'credentialName': 'arcp-aw-xaccount-cred', 'created': datetime.datetime(2021, 10, 14, 6, 53, 1, 412000, tzinfo=tzlocal())}\nsdf\nexisting\n{'environmentName': 'arcp-aw-env', 'crn': 'crn:cdp:environments:', 'status': 'CREATE_FAILED', 'region': 'us-east-1', 'cloudPlatform': 'AWS', 'credentialName': 'arcp-aw-xaccount-cred', 'created': datetime.datetime(2021, 10, 14, 6, 53, 1, 412000, tzinfo=tzlocal())}\narn:aws:iam:::instance-profile/arcp-logs-role\n",
"msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
"rc": 1
}
}
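The crash ends in `_reconcile_existing_state` with `KeyError: 'logStorage'`, and the `module_stdout` above shows the module is reconciling an environment stuck in `CREATE_FAILED` whose describe response has no `logStorage` section. The following is a hypothetical minimal reproduction, not the actual module code; the dictionary contents are copied from the `module_stdout` above, and the `.get()` variant merely illustrates one defensive-access pattern that would avoid the exception:

```python
# Hypothetical reproduction of the failure: an environment in CREATE_FAILED
# state may be described without a 'logStorage' section at all.
existing = {
    "environmentName": "arcp-aw-env",
    "status": "CREATE_FAILED",
    "region": "us-east-1",
    "cloudPlatform": "AWS",
    # note: no 'logStorage' key is present for a failed create
}

# Direct indexing reproduces the crash seen in the traceback:
try:
    log_storage = existing["logStorage"]
except KeyError as err:
    print(f"KeyError: {err}")  # -> KeyError: 'logStorage'

# Defensive access tolerates the missing section:
log_storage = existing.get("logStorage", {})
print(log_storage)  # -> {}
```

In practice the workaround is to delete the half-created environment so the module does not attempt to reconcile a `CREATE_FAILED` state that lacks this key.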

--skip-tags "database" doesn't function

@asdaraujo
We tried to run the deployer with --skip-tags "database", and it failed with the error below. Although we explicitly indicated that the Postgres DB does not need to be installed, the deployer still tries to use the Postgres Python library.

ansible-playbook -i /runner/project/inventory_static.ini /runner/project/cloudera-deploy/main.yml -e "definition_path=/runner/project/" -e "abs_profile=/runner/project/profile.yml" -t full_cluster  --skip-tags "database" -vvv

This is the error message. The full traceback is:
WARNING: The below traceback may not be related to the actual failure.
  File "/tmp/ansible_postgresql_user_payload_qTn8l4/ansible_postgresql_user_payload.zip/ansible_collections/community/postgresql/plugins/modules/postgresql_user.py", line 277, in
[WARNING]: Module remote_tmp /var/lib/pgsql/.ansible/tmp did not exist and was created with a mode of 0700, this may cause issues when running as another user. To avoid this, create
the remote_tmp dir with the correct permissions manually
fatal: [semicjs02-bi-1.int.semicjs02.nice.com -> semicjs02-bi-1.int.semicjs02.nice.com]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "ca_cert": null,
            "comment": null,
            "conn_limit": null,
            "db": "",
            "encrypted": true,
            "expires": null,
            "fail_on_user": true,
            "groups": null,
            "login_host": "",
            "login_password": "",
            "login_unix_socket": "",
            "login_user": "postgres",
            "name": "scm",
            "no_password_changes": false,
            "password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
            "port": 5432,
            "priv": null,
            "role_attr_flags": "",
            "session_role": null,
            "ssl_mode": "prefer",
            "state": "present",
            "trust_input": true,
            "user": "scm"
        }
    },
    "msg": "Failed to import the required Python library (psycopg2) on semicjs02-bi-1's Python /usr/bin/python. Please read the module documentation and install it in the appropriate location. If the required library is installed, but Ansible is using the wrong Python interpreter, please consult the documentation on ansible_python_interpreter"
