cloudera-labs / cloudera-deploy
A general purpose framework for automating Cloudera Products
License: Apache License 2.0
Is there a simpler strategy I'm missing for preparing the new cluster.yml file required by this playbook (configured via definition_path) from an exported CDP (7.1.x / Private Cloud) cluster?
It would be great to get input from the Cloudera team :)
Alternatively, or in addition, it would be very helpful to add a more advanced cluster.yml example to the repo (for example, one that deploys a 3-master-node HA cluster, as was done nicely in the former HDP repo: https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/group_vars/example-hdp-ha-3-masters-with-ranger-atlas).
If the community confirms that no such script exists yet, I'm happy to write and contribute one myself.
Minimal features:
Script input/output examples
Input file: a cluster template as exported from Cloudera Manager, for example:
{
  "cdhVersion" : "7.1.4",
  "displayName" : "Basic Cluster",
  "cmVersion" : "7.1.4",
  "repositories" : [ ... ],
  "products" : [ {
    "version" : "7.1.4-1.cdh7.1.4.p0.6300266",
    "product" : "CDH"
  } ],
  "services" : [ {
    "refName" : "zookeeper",
    "serviceType" : "ZOOKEEPER",
    "serviceConfigs" : [ {
      "name" : "zookeeper_datadir_autocreate",
      "value" : "true"
    } ],
    "roleConfigGroups" : [ {
      "refName" : "zookeeper-SERVER-BASE",
      "roleType" : "SERVER",
      "configs" : [ {
        "name" : "zk_server_log_dir",
        "value" : "/var/log/zookeeper"
      }, {
        "name" : "dataDir",
        "variable" : "zookeeper-SERVER-BASE-dataDir"
      }, {
        "name" : "dataLogDir",
        "variable" : "zookeeper-SERVER-BASE-dataLogDir"
      } ],
      "base" : true
    } ]
  },
  ...
  } ],
  "hostTemplates" : [ {
    "refName" : "HostTemplate-0-from-eval-cdp-public[1-3].internal.cloudapp.net",
    "cardinality" : 3,
    "roleConfigGroupsRefNames" : [ "hdfs-DATANODE-BASE", "spark_on_yarn-GATEWAY-BASE", "yarn-NODEMANAGER-BASE" ]
  }, {
    "refName" : "HostTemplate-1-from-eval-cdp-public0.internal.cloudapp.net",
    "cardinality" : 1,
    "roleConfigGroupsRefNames" : [ "hdfs-NAMENODE-BASE", "hdfs-SECONDARYNAMENODE-BASE", "spark_on_yarn-GATEWAY-BASE", "spark_on_yarn-SPARK_YARN_HISTORY_SERVER-BASE", "yarn-JOBHISTORY-BASE", "yarn-RESOURCEMANAGER-BASE", "zookeeper-SERVER-BASE" ]
  } ],
  ...
Output file, following the format of cluster.yml, for example: roles/cloudera_deploy/defaults/basic_cluster.yml
clusters:
  - name: Basic Cluster
    services: [HDFS, YARN, ZOOKEEPER]
    repositories:
      - https://archive.cloudera.com/cdh7/7.1.4.0/parcels/
    configs:
      ZOOKEEPER:
        SERVICEWIDE:
          zookeeper_datadir_autocreate: true
          zk_server_log_dir: /var/log/zookeeper
      HDFS:
        DATANODE:
          dfs_data_dir_list: /dfs/dn
        NAMENODE:
          dfs_name_dir_list: /dfs/nn
      ...
    host_templates:
      Master1:
        HDFS: [NAMENODE, SECONDARYNAMENODE]
        YARN: [RESOURCEMANAGER, JOBHISTORY]
        ZOOKEEPER: [SERVER]
      Workers:
        HDFS: [DATANODE]
        YARN: [NODEMANAGER]
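The core of such a converter could start as small as the sketch below, which pulls the display name and service types out of an exported template and prints a cluster.yml skeleton. This is only a shape illustration under assumed filenames: the template here is a trimmed stand-in, and a real implementation should use a proper JSON parser (jq or Python) and also cover configs, repositories, and host templates.

```shell
#!/bin/sh
# Hypothetical converter sketch; /tmp/template.json stands in for a real
# Cloudera Manager cluster template export.
cat > /tmp/template.json <<'EOF'
{
  "displayName" : "Basic Cluster",
  "services" : [
    { "serviceType" : "ZOOKEEPER" },
    { "serviceType" : "HDFS" },
    { "serviceType" : "YARN" }
  ]
}
EOF

# Scrape the fields with sed (a real converter should parse the JSON properly).
name=$(sed -n 's/.*"displayName"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' /tmp/template.json)
services=$(sed -n 's/.*"serviceType"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' /tmp/template.json \
  | tr '\n' ',' | sed 's/,$//; s/,/, /g')

# Emit the cluster.yml skeleton.
printf 'clusters:\n  - name: %s\n    services: [%s]\n' "$name" "$services"
```

Run against the stand-in template, this prints a skeleton matching the output format shown above.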
Feature Request
It would be nice to have different log levels, with the default logging more verbose than what is currently output when the Ansible scripts run.
This would make it easier to check whether a specific step is still executing or has stopped, which is especially important during creation of VMs that take a long time to finish.
It's a great tool, but where can I find the passwords of the Kerberos users when kerberos=true is set in definition.yml?
Thanks in advance.
The task "Install Cloudera Manager agents" located in cluster.yml is missing the tag "full_cluster". Not sure whether this was intentional, but since I am used to using the tag "full_cluster" to deploy on premises, my scripts are now failing.
I tried to run the playbook in DEBUG mode and found that a local ssh connection failed:
[email protected]: Permission denied (publickey,password)
It turned out that in cloudera-deploy, ~/.ssh is mounted under /home/runner/.ssh and HOME is set to /home/runner, but in debug mode some code paths still depend on /root/.ssh/.
Copying the contents of /home/runner/.ssh/ to /root/.ssh/ resolved these issues.
TASK [cloudera.cluster.repometa : Download parcel manifest information url={{ repository | regex_replace('/?$','') + '/manifest.json' }}, status_code=200, body_format=json, return_content=True, url_username={{ parcel_repo_username | default(omit) }}, url_password={{ parcel_repo_password | default(omit) }}] ***
task path: /opt/cldr-runner/collections/ansible_collections/cloudera/cluster/roles/deployment/repometa/tasks/parcels.yml:17
Monday 31 May 2021 07:32:09 +0000 (0:00:00.111) 0:00:24.509 ************
11741 1622446329.37654: sending task start callback
11741 1622446329.37659: entering _queue_task() for localhost/uri
...
<127.0.0.1> SSH: EXEC ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o ControlPath=/home/runner/.ansible/cp/21f0e6a9ae 127.0.0.1 '/bin/sh -c '"'"'echo ~root && sleep 0'"'"''
12412 1622446329.49261: stderr chunk (state=2):
>>>OpenSSH_8.0p1, OpenSSL 1.1.1g FIPS 21 Apr 2020
...
12412 1622446345.11344: stderr chunk (state=3):
>>>debug3: authmethod_lookup publickey
debug3: remaining preferred: ,gssapi-keyex,hostbased,publickey
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Trying private key: /root/.ssh/id_rsa
debug3: no such identity: /root/.ssh/id_rsa: No such file or directory
...
[email protected]: Permission denied (publickey,password).
As the prerequisite to run cloudera-deploy, I installed the default Docker version on CentOS 7.9 as follows:
yum install docker
This is the version that was installed:
# docker --version
Docker version 1.13.1, build 7d71120/1.13.1
Now, while trying to run quickstart.sh, I got the following error:
./quickstart.sh
Checking if Docker is running...
Docker OK
Trying to pull repository ghcr.io/cloudera-labs/cldr-runner ...
full-latest: Pulling from ghcr.io/cloudera-labs/cldr-runner
Digest: sha256:01504a335c7fe1c29ba695ca996b464be28925a545f7f2b5bb1c1624e145e208
Status: Image is up to date for ghcr.io/cloudera-labs/cldr-runner:full-latest
Ensuring default credential paths are available in calling using profile for mounting to execution environment
Ensure Default profile is present
Custom Cloudera Collection path not found
Mounting /home/ansible to container as Project Directory /runner/project
Creating Container cloudera-deploy from image ghcr.io/cloudera-labs/cldr-runner:full-latest
Checking OS
SSH authentication for container taken from /tmp/ssh-bAkDNAJcUgfV/agent.9389
Creating new execution container named 'cloudera-deploy'
unknown flag: --mount
See 'docker run --help'.
The above version does not recognise the "--mount" flag; it looks like "--mount" was introduced only in Docker 17.05:
https://docs.docker.com/engine/release-notes/17.05/#client
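A small preflight guard in quickstart.sh could fail fast with a clear message instead of dying on the unknown flag. A sketch, assuming the usual `docker --version` output format (the 17.05 minimum comes from the release notes above):

```shell
# Extract the major version number from a "docker --version" string.
docker_major() {
  printf '%s\n' "$1" | sed -n 's/^Docker version \([0-9]*\).*/\1/p'
}

version_line=$(docker --version 2>/dev/null)
major=$(docker_major "${version_line:-Docker version 0}")
if [ "${major:-0}" -lt 17 ]; then
  echo "ERROR: Docker >= 17.05 is required for --mount (found: ${version_line:-none})" >&2
fi
```

With the version reported above (1.13.1) this would print the error before any container work starts.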
When running the quickstart.sh script in an Ubuntu WSL session on Windows, the SSH_AUTH_SOCK variable isn't set because the ssh-agent isn't started by default.
$ ./quickstart.sh
Checking if Docker is running...
Docker OK
full-latest: Pulling from cloudera-labs/cldr-runner
Digest: sha256:15442500076f42918fd82f5f94cf0aaf4564aa235bd66b47edb2ec052e099e59
Status: Image is up to date for ghcr.io/cloudera-labs/cldr-runner:full-latest
ghcr.io/cloudera-labs/cldr-runner:full-latest
Ensuring default credential paths are available in calling using profile for mounting to execution environment
Ensure Default profile is present
Custom Cloudera Collection path not found
Mounting /mnt/c/Users/jeff/tmp to container as Project Directory /runner/project
Creating Container cloudera-deploy from image ghcr.io/cloudera-labs/cldr-runner:full-latest
Checking OS
SSH_AUTH_SOCK is empty or not set, unable to proceed. Exiting
One possible fix would be to add a line to the script to check for and start the ssh-agent. But running ssh-agent directly doesn't set the SSH_AUTH_SOCK variable, as it needs to be wrapped in an eval.
Would it be possible to add something like the following to the quickstart.sh script?
if pgrep -x "ssh-agent" >/dev/null
then
    echo "ssh-agent is running"
else
    echo "ssh-agent stopped"
    eval `ssh-agent -s`
fi
I tried adding it to the start of the quickstart.sh script and it worked fine.
How do I use tags like infra with v2 (i.e. the cloudera.exe playbooks)? The legacy -t run is no longer valid.
The Ansible script creates and connects to newly created EC2 instances, and while connecting to each VM it asks the host-key trust question.
We should have a flag to automate this. A standard -y to continue uninterrupted would allow creation and deployment without constant user interaction.
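Ansible already exposes a switch for this: rather than a new -y flag, the prompt can be disabled in the project's ansible.cfg (or via the equivalent ANSIBLE_HOST_KEY_CHECKING=False environment variable). This is standard Ansible configuration, not cloudera-deploy-specific:

```ini
# ansible.cfg -- skip the interactive host-key trust prompt for
# freshly created instances.
[defaults]
host_key_checking = False
```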
The document link on the Cloudera Deploy instructions page, at the location "If you do not have a CDP user, click here", produces a broken link. This is the link used:
I was able to deploy CDP Private Cloud without any credentials; only the CDP license was used.
I am trying to tear down our deployed cluster via the tags -t teardown,all; however, it fails with the missing-credentials error below.
TASK [cloudera.exe.runtime : Refresh Environment Info with Descendants] ****************************************************************************************************
task path: /opt/cldr-runner/collections/ansible_collections/cloudera/exe/roles/runtime/tasks/initialize_teardown.yml:17
Friday 11 November 2022 13:39:06 +0000 (0:00:00.069) 0:00:08.557 *******
fatal: [localhost]: FAILED! => {"changed": false, "error": "{'base_error': NoCredentialsError('Unable to locate CDP credentials: No credentials found anywhere in chain. The shared credentials file should be stored at /home/runner/.cdp/credentials.'), 'ext_traceback': [' File \"/root/.ansible/tmp/ansible-tmp-1668173946.776787-24441-170028905131803/AnsiballZ_env_info.py\", line 102, in <module>\\n _ansiballz_main()\\n', ' File \"/root/.ansible/tmp/ansible-tmp-1668173946.776787-24441-170028905131803/AnsiballZ_env_info.py\", line 94, in _ansiballz_main\\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\\n', ' File \"/root/.ansible/tmp/ansible-tmp-1668173946.776787-24441-170028905131803/AnsiballZ_env_info.py\", line 40, in invoke_module\\n runpy.run_module(mod_name=\\'ansible_collections.cloudera.cloud.plugins.modules.env_info\\', init_globals=None, run_name=\\'__main__\\', alter_sys=True)\\n', ' File \"/usr/lib64/python3.8/runpy.py\", line 207, in run_module\\n return _run_module_code(code, init_globals, run_name, mod_spec)\\n', ' File \"/usr/lib64/python3.8/runpy.py\", line 97, in _run_module_code\\n _run_code(code, mod_globals, init_globals,\\n', ' File \"/usr/lib64/python3.8/runpy.py\", line 87, in _run_code\\n exec(code, run_globals)\\n', ' File \"/tmp/ansible_cloudera.cloud.env_info_payload_51viniow/ansible_cloudera.cloud.env_info_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env_info.py\", line 471, in <module>\\n', ' File \"/tmp/ansible_cloudera.cloud.env_info_payload_51viniow/ansible_cloudera.cloud.env_info_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env_info.py\", line 461, in main\\n', ' File \"/tmp/ansible_cloudera.cloud.env_info_payload_51viniow/ansible_cloudera.cloud.env_info_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env_info.py\", line 424, in __init__\\n', ' File 
\"/tmp/ansible_cloudera.cloud.env_info_payload_51viniow/ansible_cloudera.cloud.env_info_payload.zip/ansible_collections/cloudera/cloud/plugins/module_utils/cdp_common.py\", line 42, in _impl\\n result = f(self, *args, **kwargs)\\n', ' File \"/tmp/ansible_cloudera.cloud.env_info_payload_51viniow/ansible_cloudera.cloud.env_info_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env_info.py\", line 429, in process\\n', ' File \"/usr/local/lib/python3.8/site-packages/cdpy/environments.py\", line 55, in describe_environment\\n resp = self.sdk.call(\\n', ' File \"/usr/local/lib/python3.8/site-packages/cdpy/common.py\", line 594, in call\\n parsed_err = CdpError(err)\\n'], 'error_code': None, 'violations': None, 'message': None, 'status_code': None, 'rc': None, 'service': None, 'operation': None, 'request_id': None}", "msg": "None", "violations": null}
quickstart.sh includes the following checks:
[...]
if [ -n "${CLDR_PYTHON_PATH}" ]; then
  echo "Path to custom Python sourcecode supplied as ${CLDR_PYTHON_PATH}, setting as System PYTHONPATH"
  PYTHONPATH="${CLDR_PYTHON_PATH}"
else
  echo "'CLDR_PYTHON_PATH' is not set, skipping setup of PYTHONPATH in execution container"
fi
echo "Checking if ssh-agent is running..."
if pgrep -x "ssh-agent" >/dev/null
then
  echo "ssh-agent OK"
else
  echo "ssh-agent is stopped, please start it by running: eval `ssh-agent -s` "
  #eval `ssh-agent -s`
fi
echo "Checking OS"
if [ ! -f "/run/host-services/ssh-auth.sock" ];
then
  if [ -n "${SSH_AUTH_SOCK}" ];
  then
    SSH_AUTH_SOCK=${SSH_AUTH_SOCK}
  else
    echo "ERROR: SSH_AUTH_SOCK is empty or not set, unable to proceed. Exiting"
    exit 1
  fi
else
  SSH_AUTH_SOCK=${SSH_AUTH_SOCK}
fi
[...]
The first check looks for an env var being set, and explicitly skips some setup if that env var is not found, i.e. that setting is explicitly optional. Fine.
The third check looks for an env var being set, and explicitly exits with an error if it's not present. Also fine.
The second check is to see if ssh-agent is running. If it is not running, the code looks like it intends to print a helpful error message for the user, to empower the user to manually do something before coming back to try this script again. There are a couple of issues here:
1. If ssh-agent is required, shouldn't the else block here explicitly exit 1, like the next check for $SSH_AUTH_SOCK does? (Otherwise, the intention would be clearer if it stated explicitly that it's carrying on regardless, like the first check does, but I don't think that's the idea here.)
2. The backticks inside the double-quoted echo string are command substitution, so the message actually executes the ssh-agent -s command. Pretty sure that's not what's intended, since in that case the actual message to the user doesn't make sense. Also, judging by the fact that the next line is a commented-out version of the same command, automatic startup seems to have been considered but rejected. Again, this is where I think an explicit exit 1 should go instead.
If my read is right, happy to submit a PR?
Referenced lines:
- cloudera-deploy/centos7-init.sh, line 26 in a2737e4
- cloudera-deploy/centos7-init.sh, line 57 in a2737e4
- cloudera-deploy/centos7-init.sh, line 59 in a2737e4
Error message:
The full traceback is:
Traceback (most recent call last):
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/executor/task_executor.py", line 585, in _execute
    self._task.post_validate(templar=templar)
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/playbook/task.py", line 307, in post_validate
    super(Task, self).post_validate(templar)
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/playbook/base.py", line 431, in post_validate
    value = templar.template(getattr(self, name))
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/template/__init__.py", line 840, in template
    disable_lookups=disable_lookups,
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/template/__init__.py", line 795, in template
    disable_lookups=disable_lookups,
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/template/__init__.py", line 1057, in do_template
    res = j2_concat(rf)
  File "<template>", line 14, in root
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/template/__init__.py", line 255, in wrapper
    ret = func(*args, **kwargs)
  File "/home/centos/.local/lib/python3.6/site-packages/ansible/plugins/filter/core.py", line 209, in from_yaml
    return yaml.safe_load(data)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/__init__.py", line 162, in safe_load
    return load(stream, SafeLoader)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 51, in get_single_data
    return self.construct_document(node)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 60, in construct_document
    for dummy in generator:
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 413, in construct_yaml_map
    value = self.construct_mapping(node)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 218, in construct_mapping
    return super().construct_mapping(node, deep=deep)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 143, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 100, in construct_object
    data = constructor(self, node)
  File "/home/centos/.local/lib/python3.6/site-packages/yaml/constructor.py", line 429, in construct_undefined
    node.start_mark)
yaml.constructor.ConstructorError: could not determine a constructor for the tag '!vault'
  in "<unicode string>", line 395, column 26:
    krb5_kdc_admin_password: !vault |
There appears to be a bug where, if you are using dynamic inventory and an absolute path to your definition, it fails to use the default inventory path.
e.g. ansible-playbook /opt/cloudera-deploy/main.yml -e "definition_path=/opt/cloudera-deploy/examples/c5secure" -t infra,full_cluster
Workaround is to pass in the correct inventory path with -i /runner/inventory
e.g.
ansible-playbook /opt/cloudera-deploy/main.yml -e "definition_path=/opt/cloudera-deploy/examples/c5secure" -i /runner/inventory -t infra,full_cluster
Print all Cloudera Manager nodes deployed to the log at the end of the run.
I began to have a huge, unreadable definition file, so I wanted to use Ansible variables.
I discovered that the definition file is loaded as a simple file and parsed as-is as YAML. That means variables won't be interpolated (i.e. host_templates: "{{ cdp_host_templates }}" will be interpreted as the literal string "{{ cdp_host_templates }}").
My definition file looks like this:
clusters:
  - name: "{{ cdp_cluster_name }}"
    type: base
    services: "{{ cdp_services }}"
    databases: "{{ cdp_databases }}"
    configs: "{{ fresh_install_configs }}"
    host_templates: "{{ cdp_host_templates }}"
    ...
Using variables in the definition file raises an error due to a check done on the host_templates here.
The error message is: Unable to host template {{ host_template }} in the cluster definition
This check tries to find a host template named 'xxx' in host_templates: "{{ host_templates }}" and fails because the value is interpreted as a string...
The task responsible for this is here.
Suggestion: use include_vars instead of lookup('file', ...).
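A hypothetical sketch of that suggestion (the task name and file layout here are illustrative, not the playbook's actual code): loading the definition with include_vars registers it as real Ansible variables, so Jinja2 references inside it resolve lazily when used instead of surviving as literal strings.

```yaml
# Illustrative replacement for the lookup('file', ...) | from_yaml pattern.
- name: Load the cluster definition as Ansible variables
  ansible.builtin.include_vars:
    file: "{{ definition_path }}/definition.yml"

# A later reference to the loaded data now resolves nested expressions
# such as "{{ cdp_host_templates }}" at the point of use.
```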
The call that checks which EC2 instance types are available relies on the AWS CLI, as there isn't an Ansible collection call for it. The task that parses the output assumes it will be JSON, and it fails if the user has set a non-JSON output format in their AWS profile.
The failing task is cloudera.exe.infrastructure: Check required AWS EC2 Instance Types.
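Until the task forces JSON itself (every AWS CLI command accepts an --output json override), a workaround is to pin the output format in the AWS profile the playbook uses. This is standard AWS CLI configuration:

```ini
# ~/.aws/config -- make the profile the playbook uses emit JSON
[default]
output = json
```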
The run-level doc link in the last line of the "Tags" section is broken.
After deployment of the CDP Private Cloud cluster, I changed definition.yml, for example the HDFS failed volumes tolerated setting:
configs:
  HDFS:
    DATANODE:
      dfs_datanode_failed_volumes_tolerated: 3
When running the playbook again, no parameters are changed.
While testing the deployment in the internal (cloudcat) and external environments, I ran into the following issue:
TASK [cloudera.cluster.ca_server : Generate root private key] ****************************************************************************************
Friday 28 January 2022 00:32:45 +0000 (0:00:00.854) 0:08:16.774 ********
fatal: [cla-tt-2a-mas1.clatest.telstraglobal.net]: FAILED! => {"changed": false, "msg": "Can't detect the required Python library cryptography (>= 1.2.3)"}
Checking inside the Docker container:
cldr full-v1.5.3 #> pip show cryptography
Name: cryptography
Version: 3.3.2
Summary: cryptography is a package which provides cryptographic recipes and primitives to Python developers.
Home-page: https://github.com/pyca/cryptography
Author: The cryptography developers
Author-email: [email protected]
License: BSD or Apache License, Version 2.0
Location: /usr/local/lib64/python3.8/site-packages
Requires: cffi, six
Required-by: adal, ansible-base, azure-cli-core, azure-identity, azure-keyvault, azure-storage, msal, openstacksdk, paramiko, pyOpenSSL, pypsrp, pyspnego, requests-credssp, requests-ntlm
CDP password requirements are not checked or enforced upfront.
Quickstart depends on the CDP CLI for creating a CDP environment, which requires a specific password standard. This is not enforced or checked upfront, so the run fails later if a non-conforming password is set.
The CDP password requirements should be checked upfront, before cluster creation.
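Such a preflight check could be a small shell function run before anything is created. The policy encoded below (at least 8 characters, plus upper case, lower case, and a digit) is an assumption for illustration only; the authoritative rules live in the CDP documentation and should be transcribed from there.

```shell
# Sketch of an upfront password check with an assumed policy.
check_cdp_password() {
  p=$1
  [ "${#p}" -ge 8 ] || return 1               # minimum length
  case $p in *[A-Z]*) ;; *) return 1 ;; esac  # at least one upper-case letter
  case $p in *[a-z]*) ;; *) return 1 ;; esac  # at least one lower-case letter
  case $p in *[0-9]*) ;; *) return 1 ;; esac  # at least one digit
  return 0
}

if check_cdp_password "Passw0rdExample"; then
  echo "password meets the assumed policy"
else
  echo "password rejected, fix it before the run starts" >&2
fi
```

Calling this at the top of the quickstart would turn a late, opaque failure into an immediate, actionable one.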
Tried to install CDP Private Cloud. I updated the inventory file to include all my hosts, but during the deployment it still tries to connect to AWS and complains that it does not have the appropriate credentials:
TASK [cloudera_deploy : Get AWS Account Info] ****************************************************************************************************************
Thursday 20 May 2021 22:45:31 +0000 (0:00:00.036) 0:00:38.058 **********
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: botocore.exceptions.NoCredentialsError: Unable to locate credentials
fatal: [localhost]: FAILED! => {"boto3_version": "1.17.66", "botocore_version": "1.20.66", "changed": false, "msg": "Failed to retrieve caller identity: Unable to locate credentials"}
How do I tell it this is a private cloud installation that has nothing to do with AWS?
EZMode Documentation
Requires better, more prescriptive steps that can be copied and pasted without having to read the surrounding text.
The Credentials section of the README is missing a key piece of information: you need to be in a project with a cldr-runner-enabled ansible-navigator.yml configuration file.
Line 69 in a14eefd
Ansible will pause execution to get interactive assent when connecting to hosts not in the known_hosts file during a typical dynamic inventory run; as such, it is normal to disable host key checking for a smooth default experience:
https://docs.ansible.com/ansible/latest/user_guide/connection_details.html
The link to the CDP CLI in the README page is broken. This is the text: "Visit the CDP CLI User Guide for further details regarding credential management." The link to the user guide is broken.
Make the SSH_AUTH_SOCK implementation OS-agnostic.
The current implementation of SSH_AUTH_SOCK is OS-specific, specifically macOS-specific. The line below hard-codes a path that works only on macOS:
--mount type=bind,src=$SSH_AUTH_SOCK,target=/run/host-services/ssh-auth.sock \
Change this to make it work for all operating systems, specifically Linux.
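One possible shape for an OS-conditional version. The macOS branch uses the fixed socket path that Docker Desktop exposes for the host agent, while the Linux branch passes the host's own SSH_AUTH_SOCK through; both paths are assumptions about the environments quickstart.sh targets, not verified against every setup.

```shell
# Pick the agent socket to bind-mount depending on the host OS.
if [ "$(uname -s)" = "Darwin" ]; then
  # Docker Desktop on macOS proxies the host agent at this fixed path.
  AGENT_SRC="/run/host-services/ssh-auth.sock"
else
  # On Linux the host's own agent socket can be mounted directly.
  AGENT_SRC="${SSH_AUTH_SOCK:-/tmp/ssh-agent.sock}"
fi

MOUNT_ARG="--mount type=bind,src=${AGENT_SRC},target=/run/host-services/ssh-auth.sock"
echo "$MOUNT_ARG"
```

The container-side target can stay fixed as long as SSH_AUTH_SOCK inside the container points at it.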
Tested on macOS:
ansible-navigator using a Python 3.8 environment is unable to parse the ansible-navigator.yml files in our public-cloud/aws/ examples.
When used with a Python 3.11 environment, it was able to use the ansible-navigator.yml settings file.
I did not test any other versions of Python.
Hello, I am deploying the CDP Private Cloud Basic Cluster via the definition.yml file.
I have managed to enable Kerberos by adding the following parameter to the basic cluster:
security:
  kerberos: true
I am looking for a way to set proxy parameters in definition.yml. I think it should look like this:
configs:
  parcel_proxy_port: 1234
  parcel_proxy_server: my_beautiful_proxy.com
But the configs section of the cluster expects a service name; I need to specify somehow that this is a Cloudera Manager setting, I suppose.
I have tried adding it under mgmt as well; it does not recognize those options.
Trying to deploy a CDP private cluster with Kerberos, Ranger and Auto-TLS.
Playbook execution command:
ansible-playbook /runner/project/cloudera-deploy/main.yml -e "definition_path=/runner/project/cloudera-deploy/examples/sandbox" -e "profile=/home/runner/.config/cloudera-deploy/profiles/default" -t default_cluster,kerberos,tls -i "/runner/project/cloudera-deploy/examples/sandbox/inventory_static.ini" --flush-cache
After execution, playbook fails on the task:
TASK [cloudera.cluster.tls_install_certs : Install signed certificate reply into keystore] ***
task path: /opt/cldr-runner/collections/ansible_collections/cloudera/cluster/roles/security/tls_install_certs/tasks/main.yml:126
with the error below (on each node):
fatal: [node1.domain.com]: FAILED! => {"changed": false, "cmd": "/usr/bin/keytool -importcert -alias \"node1.domain.com\" -file \"/opt/cloudera/security/pki/node1.domain.com.pem\" -keystore \"/opt/cloudera/security/pki/node1.domain.com.jks\" -storepass \"changeme\" -trustcacerts -noprompt\n", "delta": "0:00:00.247693", "end": "2023-01-09 13:27:30.366003", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2023-01-09 13:27:30.118310", "stderr": "", "stderr_lines": [], "stdout": "keytool error: java.lang.Exception: Public keys in reply and keystore don't match", "stdout_lines": ["keytool error: java.lang.Exception: Public keys in reply and keystore don't match"]}
Any idea why this is happening?
I have tried to import the certs manually via
/usr/bin/keytool -importcert -alias node1.domain.com -file /opt/cloudera/security/pki/node1.domain.com.pem -keystore /opt/cloudera/security/pki/node1.domain.com.jks -trustcacerts -noprompt
and the cert was added successfully...
It seems that commit 0526f52 introduced some unqualified variables that are causing the execution to fail with the following error:
TASK [cloudera_deploy : Check Supplied terraform_base_dir variable] ************
task path: /runner/project/cloudera-deploy/roles/cloudera_deploy/tasks/init.yml:232
fatal: [localhost]: FAILED! => {
"msg": "The conditional check 'infra_deployment_engine == 'terraform'' failed. The error was: error while evaluating conditional (infra_deployment_engine == 'terraform'): 'infra_deployment_engine' is undefined\n\nThe error appears to be in '/runner/project/cloudera-deploy/roles/cloudera_deploy/tasks/init.yml': line 232, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Check Supplied terraform_base_dir variable\n ^ here\n"
}
Hi,
I am getting the following error while executing /opt/cldr-runner/collections/ansible_collections/cloudera/cloud/plugins/modules/env.py
from /opt/cldr-runner/collections/ansible_collections/cloudera/exe/roles/platform/tasks/setup_aws_env.yml
Below is the error message:
Monday 25 October 2021 17:08:40 +0000 (0:00:02.728) 0:01:33.504 ********
ok: [localhost] => {
"msg": {
"changed": false,
"exception": "Traceback (most recent call last):\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 102, in \n _ansiballz_main()\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 94, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 40, in invoke_module\n runpy.run_module(mod_name='ansible_collections.cloudera.cloud.plugins.modules.env', init_globals=None, run_name='main', alter_sys=True)\n File "/usr/lib64/python3.8/runpy.py", line 207, in run_module\n return _run_module_code(code, init_globals, run_name, mod_spec)\n File "/usr/lib64/python3.8/runpy.py", line 97, in _run_module_code\n _run_code(code, mod_globals, init_globals,\n File "/usr/lib64/python3.8/runpy.py", line 87, in _run_code\n exec(code, run_globals)\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 1055, in \n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 1045, in main\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 662, in init\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/module_utils/cdp_common.py", line 42, in _impl\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 687, in process\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", 
line 926, in _reconcile_existing_state\nKeyError: 'logStorage'\n",
"failed": true,
"module_stderr": "Traceback (most recent call last):\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 102, in \n _ansiballz_main()\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 94, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File "/root/.ansible/tmp/ansible-tmp-1635181717.7778952-26659-167799228659899/AnsiballZ_env.py", line 40, in invoke_module\n runpy.run_module(mod_name='ansible_collections.cloudera.cloud.plugins.modules.env', init_globals=None, run_name='main', alter_sys=True)\n File "/usr/lib64/python3.8/runpy.py", line 207, in run_module\n return _run_module_code(code, init_globals, run_name, mod_spec)\n File "/usr/lib64/python3.8/runpy.py", line 97, in _run_module_code\n _run_code(code, mod_globals, init_globals,\n File "/usr/lib64/python3.8/runpy.py", line 87, in _run_code\n exec(code, run_globals)\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 1055, in \n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 1045, in main\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 662, in init\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/module_utils/cdp_common.py", line 42, in _impl\n File "/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 687, in process\n File 
"/tmp/ansible_cloudera.cloud.env_payload_69lmx119/ansible_cloudera.cloud.env_payload.zip/ansible_collections/cloudera/cloud/plugins/modules/env.py", line 926, in _reconcile_existing_state\nKeyError: 'logStorage'\n",
"module_stdout": "{'environmentName': 'arcp-aw-env', 'crn': 'crn:cdp:environments:us-west-1:a0ec84c6-fee6-4e9c-acdc-68e1f49a5184:environment:714b0df3-e459-40ba-b722-837018456722', 'status': 'CREATE_FAILED', 'region': 'us-east-1', 'cloudPlatform': 'AWS', 'credentialName': 'arcp-aw-xaccount-cred', 'created': datetime.datetime(2021, 10, 14, 6, 53, 1, 412000, tzinfo=tzlocal())}\nsdf\nexisting\n{'environmentName': 'arcp-aw-env', 'crn': 'crn:cdp:environments:', 'status': 'CREATE_FAILED', 'region': 'us-east-1', 'cloudPlatform': 'AWS', 'credentialName': 'arcp-aw-xaccount-cred', 'created': datetime.datetime(2021, 10, 14, 6, 53, 1, 412000, tzinfo=tzlocal())}\narn:aws:iam:::instance-profile/arcp-logs-role\n",
"msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
"rc": 1
}
}
@asdaraujo
We tried to execute the deployer with --skip-tags "database" and it failed with the error below. Although we explicitly indicated that the Postgres DB doesn't need to be installed, the deployer tries to use a Postgres library.
ansible-playbook -i /runner/project/inventory_static.ini /runner/project/cloudera-deploy/main.yml -e "definition_path=/runner/project/" -e "abs_profile=/runner/project/profile.yml" -t full_cluster --skip-tags "database" -vvv
This is the error message; the full traceback is:
WARNING: The below traceback may not be related to the actual failure.
File "/tmp/ansible_postgresql_user_payload_qTn8l4/ansible_postgresql_user_payload.zip/ansible_collections/community/postgresql/plugins/modules/postgresql_user.py", line 277, in
[WARNING]: Module remote_tmp /var/lib/pgsql/.ansible/tmp did not exist and was created with a mode of 0700, this may cause issues when running as another user. To avoid this, create
the remote_tmp dir with the correct permissions manually
fatal: [semicjs02-bi-1.int.semicjs02.nice.com -> semicjs02-bi-1.int.semicjs02.nice.com]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "ca_cert": null,
            "comment": null,
            "conn_limit": null,
            "db": "",
            "encrypted": true,
            "expires": null,
            "fail_on_user": true,
            "groups": null,
            "login_host": "",
            "login_password": "",
            "login_unix_socket": "",
            "login_user": "postgres",
            "name": "scm",
            "no_password_changes": false,
            "password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
            "port": 5432,
            "priv": null,
            "role_attr_flags": "",
            "session_role": null,
            "ssl_mode": "prefer",
            "state": "present",
            "trust_input": true,
            "user": "scm"
        }
    },
    "msg": "Failed to import the required Python library (psycopg2) on semicjs02-bi-1's Python /usr/bin/python. Please read the module documentation and install it in the appropriate location. If the required library is installed, but Ansible is using the wrong Python interpreter, please consult the documentation on ansible_python_interpreter"
}