cloudera-labs / cloudera.cluster

An Ansible collection for lifecycle and management of Cloudera CDP Private Cloud resources on bare metal, IaaS, and PaaS.

License: Apache License 2.0


cloudera.cluster's Introduction

cloudera.cluster - Cloudera Data Platform (CDP) for Private Cloud and Cloudera Manager (CM)

API documentation

cloudera.cluster is an Ansible collection that lets you manage your Cloudera Data Platform (CDP) Private Cloud resources and interact with Cloudera Manager for both Private Cloud installations and Public Cloud Data Hub deployments. With this collection, you can:

  • Create and manage Private Cloud deployments and Public Cloud Data Hubs, including:
    • Managing services like Impala, NiFi, and Ozone
    • Configuring Cloudera Manager and cm_agent-enabled hosts

If you have any questions, want to chat about the collection's capabilities and usage, need help using the collection, or just want to stay updated, join us at our Discussions.

Quickstart

  1. Install the collection
  2. Install the requirements
  3. Use the collection

API

See the API documentation for details on each plugin and role within the collection.

Roadmap

If you want to see what we are working on or have pending, check out:

Are we missing something? Let us know by creating a new issue or posting a new idea!

Contribute

For more information on how to get involved with the cloudera.cluster Ansible collection, head over to CONTRIBUTING.md.

Installation

To install the cloudera.cluster collection, you have several options. Please note that we have not yet published this collection to the public Ansible Galaxy server, so you cannot install it by namespace directly; instead, you must specify the Git project and (optionally) a branch.

Option #1: Install from GitHub

Create or edit your requirements.yml file in your project with the following:

collections:
  - name: https://github.com/cloudera-labs/cloudera.cluster.git
    type: git
    version: main

And then run in your project:

ansible-galaxy collection install -r requirements.yml

You can also install the collection directly:

ansible-galaxy collection install git+https://github.com/cloudera-labs/cloudera.cluster.git@main

Option #2: Install the tarball

Periodically, the collection is packaged into a distribution which you can install directly:

ansible-galaxy collection install <collection-tarball>

See Building the Collection for details on creating a local tarball.

Requirements

cloudera.cluster expects ansible-core>=2.10,<2.13.

Warning

The current import_template functionality does not yet work with Ansible version 2.13 and later.

The collection has the following required dependencies:

Name Type Version
ansible.posix collection 1.3.0
community.crypto collection 2.2.1
community.general collection 4.5.0

There are a number of optional dependencies for the collection:

Name Type Version
community.mysql collection 3.1.0
community.postgresql collection 1.6.1
freeipa.ansible_freeipa collection 1.11.1
geerlingguy.postgresql role 2.2.0
geerlingguy.mysql (patched) role master

The collection also requires a number of Python libraries to operate its modules. The collection's own Python dependencies, not the required Python libraries of its collection dependencies, are listed in requirements.txt.

All collection dependencies, required and optional, can be found in requirements.yml; only the required dependencies are in galaxy.yml. ansible-galaxy will install only the required collection dependencies; you will need to add the optional collection dependencies as needed (see above).
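
For reference, a combined requirements.yml covering both the required and the optional collection and role dependencies above might look like the following sketch (versions are taken from the tables above; trim it to what you actually need):

collections:
  # Required
  - name: ansible.posix
    version: 1.3.0
  - name: community.crypto
    version: 2.2.1
  - name: community.general
    version: 4.5.0
  # Optional
  - name: community.mysql
    version: 3.1.0
  - name: community.postgresql
    version: 1.6.1
  - name: freeipa.ansible_freeipa
    version: 1.11.1

roles:
  # Optional
  - name: geerlingguy.postgresql
    version: 2.2.0
  # geerlingguy.mysql is expected as a patched fork (master branch); its source URL is not listed here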

ansible-builder can discover and install all Python dependencies - current collection and dependencies - if you wish to use that application to construct your environment. Otherwise, you will need to read each collection and role dependency and follow its installation instructions.

See the Collection Metadata section for further details on how to install (and manage) collection dependencies.

You may wish to use a virtual environment to manage the Python dependencies.

See the base Execution Environment configuration in cloudera-labs/cldr-runner as an example of how you can install the optional dependencies to suit your specific needs.

Using the Collection

This collection is designed to work hand-in-hand with the cloudera-deploy application, which uses reference playbooks from the cloudera.exe collection and example definitions. Coming releases will decouple these collections further while maintaining backwards compatibility.

Once installed, reference the collection in your playbooks and roles.

For example, here we use the cloudera.cluster.cm_resource module to patch the Hue service with updated Knox proxy hosts:

- hosts: localhost
  connection: local
  gather_facts: no
  vars:
    cm_api:  "{{ lookup('ansible.builtin.env', 'CM_API') }}"
    user:    "{{ lookup('ansible.builtin.env', 'CM_USERNAME') }}"
    pwd:     "{{ lookup('ansible.builtin.env', 'CM_PASSWORD') }}"
    cluster: "my-cluster"
  tasks:
    - name: Update Hue SSO (Knox Proxies)
      cloudera.cluster.cm_resource:
        url: "{{ cm_api }}"
        username: "{{ user }}"
        password: "{{ pwd }}"
        path: "v51/clusters/{{ cluster }}/services/hue/config"
        method: PUT
        parameters:
          message: "Patch Knox proxy hosts for Hue (Ansible)"
        body:
          items:
            - name: knox_proxyhosts
              value: "{{ ['master1', 'master2', 'master3'] | join(',') }}"

Building the Collection

To create a local collection tarball, run:

ansible-galaxy collection build 

Building the API Documentation

To create a local copy of the API documentation, first make sure the collection is in your ANSIBLE_COLLECTIONS_PATHS. Then run the following:

# change into the /docsbuild directory
cd docsbuild

# install the build requirements (antsibull-docs); you may want to set up a
# dedicated virtual environment
pip install ansible-core https://github.com/cloudera-labs/antsibull-docs/archive/cldr-docsite.tar.gz

# Install the collection's build dependencies
pip install -r requirements.txt

# Then run the build script
./build.sh

Your local documentation will be found at docsbuild/build/html.

Tested Platforms

Active development is focused on CDP Private Cloud deployments and their respective platform compatibility matrices.

Note

While the collection's plugins and roles can be used to deploy CDH 5.x and CDH 6.x environments, it is only possible to install a subset of their supported platform components (i.e., JDK and database versions) using this tooling.

Cloudera Distributions

  • Cloudera Manager / CDP Private Cloud Base 7.1.x
  • Cloudera Manager / CDP Private Cloud Base 7.0.3 (limited support)
  • Cloudera Manager / CDH 6.x
  • Cloudera Manager / CDH 5.x (limited support)

Operating Systems

  • Red Hat / CentOS 7.x
  • Red Hat / CentOS 8.x
  • Ubuntu 18.04 LTS (Bionic Beaver)
  • Ubuntu 20.04 LTS (Focal Fossa)

Operational Features

Warning

These operational features are deprecated as of version 4.x. If you want to use or build similar features and functions, head over to the Discussions to learn more about using the collection to achieve your platform operations needs.

This collection includes support for:

  • Upgrading Cloudera Manager Server and Cloudera Manager Agents
  • Upgrading CDH 5 and/or CDH 6 to CDP Private Cloud Base
  • Refreshing the config for running clusters, including adding new services or updating the config of existing services.

These features are potentially very dangerous and can cause damage to running clusters if used incorrectly. If you plan to use these features, please ensure that you test thoroughly on a disposable environment.

Cloudera recommends that Cloudera Professional Services be engaged before using these features, particularly as none of these operational features are covered under Cloudera Support agreements.

In order to use these capabilities you will need some permutation of the following variables (an example extra-vars file follows the list):

  • cloudera_runtime_pre_upgrade (specify the version of the legacy cluster - e.g. 5.16.2)
  • update_services (true if you want to update the config of existing services)
  • upgrade_kts_cluster (true to upgrade a kts cluster)
  • activate_runtime_upgrade (true to do a patch release activation)
  • cdh_cdp_upgrade (true to do a CDH to CDP upgrade)
  • upgrade_runtime (true to upgrade between versions of CDH or CDP)
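
For illustration, a hedged sketch of an extra-vars file for a CDH-to-CDP upgrade, passed for example with -e @upgrade_vars.yml; the file name and all values are examples only, not a recommended configuration:

# upgrade_vars.yml (example values only)
cloudera_runtime_pre_upgrade: "5.16.2"
cdh_cdp_upgrade: true
upgrade_runtime: true
update_services: true
upgrade_kts_cluster: false
activate_runtime_upgrade: false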

License and Copyright

Copyright 2023, Cloudera, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

cloudera.cluster's People

Contributors

anisf, asdaraujo, chaffelson, clevesque, jimright, rsuplina, tmgstevens, willdyson, wmudge


cloudera.cluster's Issues

nav2atlas_dir is unreadable

/var/lib/atlas/nav2atlas-data/ does not have the correct permissions. It is not even readable by the atlas user:
d-w-rwxr-T
instead of
drwx------

HBase config property does not exist

When upgrading from CDH 6 to CDP, after the CM upgrade to 7.6.5, we encounter this error regarding the HBase config: Unknown configuration attribute 'hadoop_secure_web_ui' for service (type: 'HBASE', name: 'hbase')

This comes from here: hadoop_secure_web_ui: true

IMO, this template should not be included unless the actual runtime is CDP 7; however, the condition used to decide whether to include the Kerberos configs for HBase on CDP 7 is based on the Cloudera Manager version. Code here: cloudera_manager_version is version('7.1.0','>=')
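
For illustration only, a minimal sketch of the suggested guard; _cluster_runtime_version is a hypothetical variable holding the cluster's CDH/CDP runtime version, since the collection currently keys off the Cloudera Manager version instead:

# Hypothetical condition: include the Kerberos-related HBase settings only for CDP runtimes
when: _cluster_runtime_version is version('7.1.0', '>=')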

Update AWS_S3 service validation

Discussed in #156

Originally posted by hadoopch November 3, 2023
Hi all,

AWS_S3 is different from the other services: you don't assign it to any host.
Furthermore, you don't have an instance-related configuration scope (SERVICEWIDE, SERVER, ...). Therefore it is not clear how to define it in definition.yml.

Verify will not accept a definition without a host template.
I tried it via --skip-tags verify. The service was added, but the configuration was not applied.

    configs:
      AWS_S3:
          cloud_account: obsbackup
          s3_endpoint: "obs.mydomain.com"
          AWS_S3_service_env_safety_valve: "path_style_access=true\nsigning_algorithm=S3SignerType\nconnection_ssl_enabled=false\nmultiobjectdelete_enable=true\nfast_upload=true\nendpoint_region=eu-ch2"
      INFRA_SOLR:
        SOLR_SERVER:
          solr_hdfs_blockcache_slab_count: 93
          process_auto_restart: true
          solr_java_heapsize: "{{12*1024**3}}"
          solr_java_direct_memory_size: "{{16*1024**3}}"

Who can support?

Best regards

Uli

Update cm_utils.py to not contain controller-only code

The current version of module_utils/cm_utils.py has both controller-only code (for the base Lookup class) and general code (for the base Module class). Any module that extends the latter will fail to run in a non-controller environment, since imports like ansible.utils.display are not available. We need to split the controller-only code out into its own file.

KMS wrongly marked as optional

  1. KMS tasks are run after cluster tasks, meaning you cannot upgrade from CDH (KT KMS + KTS) to CDP (Ranger KMS with KTS) without specifying the Ranger KMS configuration: the upgrade fails because of missing parameters (KTS Org, KTS Auth Code, KTS Active Server). So it is mandatory to specify those in the cluster services and service configurations.
    Yet the documentation says: **It is not necessary to include KMS services in the cluster service list or host templates.** These will be added for you automatically.

  2. If you want to create the Ranger KMS DB, it has to be in the cluster services list.

Cluster template import error

Hi @asdaraujo

We are trying to update the attached Hive configuration through the definition.yml file and it fails with an error "Failed to Create Hive Metastore Database Tables"


Setting Cloudera Manager option referer_check: false in definition.yml is ignored

Hi,

I set the following config parameters for Cloudera Manager via definition.yml, but referer_check is ignored for some reason.

cloudera_manager_options:
  krb_auth_enable: true
  auth_backend_order: "DB_THEN_LDAP"
  ldap_type: PAM
  proxyuser_knox_groups: "*"
  proxyuser_knox_hosts: "*"
  proxyuser_knox_users: "*"
  referer_check: false

Does anybody have an idea why?

Best regards

Uli

Navigator to Atlas migration to be optional

Currently, the Navigator-to-Atlas migration is part of the execution when upgrading from CDH to CDP.
While that is fine for labs, it can take a very long time in production. I have a customer (IHAC) with 40 million entities in Navigator in production. If all of them are transitioned, and if we believe this documentation, it might take multiple days.

We need to find a way to postpone this migration

Cryptography backend can only use "auto" for cipher option.

When using the CA role, the play fails with the following error:
TASK [cloudera.cluster.ca_server : Generate root private key] **************************************************************************************************************************
fatal: [xxxx]: FAILED! => changed=false
msg: Cryptography backend can only use "auto" for cipher option.

Looking at the openssl_privatekey module's documentation, we can see that only the "auto" value is accepted for the cipher parameter.
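
For reference, a minimal sketch of a key-generation task that satisfies this constraint; the path, passphrase variable, and key size are placeholders and not taken from the actual role:

- name: Generate root private key (sketch only)
  community.crypto.openssl_privatekey:
    path: /path/to/ca-key.pem                # placeholder path
    passphrase: "{{ ca_key_passphrase }}"    # placeholder variable
    cipher: auto                             # the cryptography backend accepts only 'auto'
    size: 4096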

Spark 3 service dependencies

Got this error when installing Spark 3 on CDP 7.1.7 / CM 7.6.5:
Unknown configuration attribute 'hive_service' for service (type: 'SPARK3_ON_YARN', name: 'spark3_on_yarn').

Unwanted SDX deployment

When running the cloudera-deploy playbook (tags=default_cluster,kerberos), in the last tasks I can see the following (I have also added the condition under which it is triggered), from:

/opt/cldr-runner/collections/ansible_collections/cloudera/cluster/roles/deployment/cluster/tasks/main.yml

TASK [cloudera.cluster.cluster : Create base cluster data contexts (SDX)]

 when:
    - cluster.type | default(default_cluster_type) == 'base'
    - cloudera_manager_version is version('6.2.0','>=')

When running the playbook twice (for example, with the proxy unset the cluster template import fails when downloading parcels; after manually setting up the proxy in the CM UI, the playbook deploys the cluster successfully on the second run, but with NO SDX!)

The cluster is always deployed successfully, in the green state; however, when I click on the Basic Cluster in CM, instead of CMS I see an SDX which is not configured...

Task "Generate cluster template file" fail

The task "Generate cluster template file" fail

#####Command line########
ansible-playbook /opt/cloudera-deploy/main.yml -i /opt/cloudera-deploy/cdppredev/inventory_static.ini -e "profile=/opt/cloudera-deploy/cdppredev/profile.yml" -e "definition_path=/opt/cloudera-deploy/cdppredev" -e "definition_file=definition.yml" -e "cluster_file=cluster.yml" -t full_cluster

#########Console error#########
TASK [cloudera.cluster.cluster : Generate cluster template file] ****************************************************************************************
task path: /opt/cldr-runner/collections/ansible_collections/cloudera/cluster/roles/deployment/cluster/tasks/create_base.yml:27
Thursday 27 May 2021 13:26:31 +0000 (0:00:00.060) 0:00:50.089 **********
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: root
<127.0.0.1> EXEC /bin/sh -c 'echo ~root && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "echo /root/.ansible/tmp"&& mkdir "echo /root/.ansible/tmp/ansible-tmp-1622121991.4248302-17965-148244685479104" && echo ansible-tmp-1622121991.4248302-17965-148244685479104="echo /root/.ansible/tmp/ansible-tmp-1622121991.4248302-17965-148244685479104" ) && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1622121991.4248302-17965-148244685479104/ > /dev/null 2>&1 && sleep 0'
fatal: [localhost]: FAILED! => {
"changed": false,
"msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'bdengd704.cdadev.mydomain.com'"
}

#####inventory_static.ini####
[cloudera_manager]
bdengd002.cdadev.mydomain.com

[cluster_worker_nodes]
bdengd704.cdadev.mydomain.com

[cluster_worker_nodes:vars]
host_template=Workers

[cluster_master_nodes]
bdengd002.cdadev.mydomain.com host_template=Master1

[cluster:children]
cluster_master_nodes
cluster_worker_nodes

[db_server]
bdengd002.cdadev.mydomain.com

[deployment:children]
cluster
db_server

[deployment:vars]
ansible_user=root
ansible_ssh_private_key_file=~/.ssh/cldr_ssh_rsa

#########cluster.yml########

cloudera_manager_version: 7.1.5.0

clusters:
  - name: CDPPREDEV
    services: [HDFS, YARN, ZOOKEEPER]
    repositories:
      - https://bdadmp01.zit.mydomain.com/cdp/cloudera-repos/cdh7/7.1.5.0/parcels/
    configs:
      HDFS:
        DATANODE:
          dfs_data_dir_list: /dfs/dn
        NAMENODE:
          dfs_name_dir_list: /dfs/nn
        SECONDARYNAMENODE:
          fs_checkpoint_dir_list: /dfs/snn
      YARN:
        RESOURCEMANAGER:
          yarn_scheduler_maximum_allocation_mb: 4096
          yarn_scheduler_maximum_allocation_vcores: 4
        NODEMANAGER:
          yarn_nodemanager_resource_memory_mb: 4096
          yarn_nodemanager_resource_cpu_vcores: 4
          yarn_nodemanager_local_dirs: /tmp/nm
          yarn_nodemanager_log_dirs: /var/log/nm
        GATEWAY:
          mapred_submit_replication: 3
          mapred_reduce_tasks: 6
      ZOOKEEPER:
        SERVICEWIDE:
          zookeeper_datadir_autocreate: true
    host_templates:
      Master1:
        HDFS: [NAMENODE, SECONDARYNAMENODE, HTTPFS]
        YARN: [RESOURCEMANAGER, JOBHISTORY]
        ZOOKEEPER: [SERVER]
      Workers:
        HDFS: [DATANODE]
        YARN: [NODEMANAGER]

mgmt:
  name: Cloudera Management Service
  services: [ALERTPUBLISHER, EVENTSERVER, HOSTMONITOR, REPORTSMANAGER, SERVICEMONITOR]

hosts:
  configs:
    host_default_proc_memswap_thresholds:
      warning: never
      critical: never
    host_memswap_thresholds:
      warning: never
      critical: never
    host_config_suppression_agent_system_user_group_validator: true

####profile.yml######
admin_password: "admin"
infra_type: "onpremise"

####definition.yml#####
datahub:
  definitions:
    - include: "datahub_streams_messaging_light.j2"

use_default_cluster_definition: no
use_download_mirror: no
preload_cm_parcel_repo: yes

###main.yml#####
I'm using the cloudera-deploy repo.

####cloudera runner docker v1.0.2####
I'm using the Docker container generated by quickstart.sh from the cloudera-deploy repo.

####My personal observations#######

If in inventory_static.ini I use the same host for Worker and Master, this task passes and the template starts to be deployed to Cloudera Manager.

Improve validation around CM agent install/liveness to prevent opaque failures in later steps

Currently, a failure in installing/starting the CM agent does not prevent the playbook from continuing, which causes issues later as the host name is not present in the list of names returned from the All Hosts page, resulting in the common error

"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute '<hostname>'" }

This error does not make it at all clear what failed or why.

We should probably add some extra validation around this and produce more meaningful errors and failures (see the sketch after this list), such as:

  • Validate that the CM agent is installed
  • Validate that it is heartbeating
  • Validate the All Hosts list against the cluster:children inventory

Thoughts?
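
For illustration, a minimal sketch of such a heartbeat check using the collection's cm_api module; the /hosts endpoint, the shape of the registered result (cm_hosts.json), and the cluster group name are assumptions to adapt to the actual API response and inventory:

- name: Fetch the hosts known to Cloudera Manager
  cloudera.cluster.cm_api:
    endpoint: /hosts
    method: GET
  register: cm_hosts
  run_once: true

- name: Assert that every inventory host has registered with CM
  ansible.builtin.assert:
    that:
      # assumption: the parsed API response is available under cm_hosts.json
      - item in (cm_hosts.json['items'] | map(attribute='hostname') | list)
    fail_msg: "{{ item }} is not reporting to CM; check the agent installation and heartbeat"
  loop: "{{ groups['cluster'] | default([]) }}"
  run_once: true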

spark3 configuration doesn't work

Hi,
I run the Ansible playbook installation with the tag full_cluster.
The definition.yml is used with this configuration:
clusters:
  - name: Spark 3
    services: [HDFS, QUEUEMANAGER, SPARK3_ON_YARN, YARN, ZOOKEEPER]

    but I get this error:
    fatal: []: FAILED! => {
    "assertion": false,
    "changed": false,
    "evaluated_to": false,
    "msg": "Unknown service(s) ['SPARK3_ON_YARN'] defined in cluster 'Spark 3'"
    }

thanks,
hod

Reference Playbooks and Example Definitions

Hello,

You say in your README that:

This Ansible Collection is designed to work hand-in-hand with Cloudera Deploy, which contains reference Playbooks and 
example Definitions.

But the cloudera-deploy repository is not public. Can you make it public if possible, or add some reference playbooks or examples to this repository?

Regards,

Moncer

Ranger web interface check failing when deploying KMS / KTS

The file cloudera.cluster/roles/operations/refresh_ranger_kms_repo/tasks/cluster_find_ranger.yml contained a less-than-ideal check for determining whether the Ranger interface uses HTTPS or HTTP.

Original code block (lines 70-95):

- name: Check if HTTPS is used
  uri:
    url: "https://{{ _ranger_host }}:{{ _ranger_https_port }}"
    status_code:
      - 401
      - -1
  register: _ranger_https_resp
  changed_when: false

- set_fact:
    _ranger_https_used: "{{ _ranger_https_resp.status == 401 }}"

- name: Check if HTTP is used
  uri:
    url: "http://{{ _ranger_host }}:{{ _ranger_http_port }}"
    status_code:
      - 401
      - -1
  register: _ranger_http_resp
  changed_when: false
  when: not _ranger_https_used

- set_fact:
    _ranger_http_used: "{{ not _ranger_https_used and _ranger_http_used.status | default(omit) == 401 }}"

Changed to New code block with Client:

- name: Check if HTTPS is used
  uri:
    url: "https://{{ _ranger_host }}:{{ _ranger_https_port }}"
    status_code:
      - 401
      - -1
  register: _ranger_https_resp
  changed_when: false

- name: Check if HTTP is used
  uri:
    url: "http://{{ _ranger_host }}:{{ _ranger_http_port }}"
    status_code:
      - 401
      - -1
  register: _ranger_http_resp
  changed_when: false

- set_fact:
    _ranger_https_used: "{{ ( _ranger_https_resp.status == 401 ) | bool }}"
    _ranger_http_used: "{{ ( _ranger_http_resp.status == 401 ) | bool }}"

Also attaching the full file we patched to make it work. The file may need to be rewritten, but this was the on-the-fly patch.
cluster_find_ranger.yml.zip

CDH to CDP Upgrade : YARN Queues are not migrated

When upgrading from CDH 6.3.4 to CDP 7.1.7 SP2, the source Fair Scheduler configuration is not migrated.

Symptom: the CDP Capacity Scheduler configuration is left as the default one.
Cause: in the fs2cs script, the fair-scheduler.xml path is specified as a relative path while it must be a full path.
Solution: edit the fs2cs.j2 template that generates that script so that it specifies the full path.

Cannot deploy with AutoTLS

Unable to restart service cloudera-scm-server when deploying cluster with autotls

Hello. When I deploy the cluster without security: tls in definition.yml (in both the mgmt and base cluster sections), without tls=True in the inventory file, and without the autotls playbook tag, as mentioned in this documentation, the cluster is deployed successfully; after that, manual Auto-TLS enablement works with both root and non-root users.

I have tried everything mentioned above, including setting the Auto-TLS user in this file.

But I am always getting this error.

TASK [cloudera.cluster.autotls : Restart Cloudera Manager Server] **************
Wednesday 25 January 2023  08:16:52 +0000 (0:00:03.192)       0:12:32.683 *****
fatal: [myhost1.domain.com]: FAILED! => {"changed": false, "msg": "Unable to restart service cloudera-scm-server: Failed to restart cloudera-scm-server.service: Connection timed out\nSee system logs and 'systemctl status cloudera-scm-server.service' for details.\n"}
$ systemctl status cloudera-scm-server.service

● cloudera-scm-server.service - Cloudera CM Server Service
   Loaded: loaded (/usr/lib/systemd/system/cloudera-scm-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2023-01-25 09:13:39 CET; 15min ago
 Main PID: 60455 (java)
    Tasks: 109
   Memory: 2.5G
   CGroup: /system.slice/cloudera-scm-server.service
           └─60455 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64/bin/java -cp .:/usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-...

Jan 25 09:13:39 myhost1 systemd[1]: Starting Cloudera CM Server Service...
Jan 25 09:13:39 myhost1 systemd[1]: Started Cloudera CM Server Service.
Jan 25 09:13:39 myhost1 cm-server[60455]: JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64
Jan 25 09:13:39 myhost1 cm-server[60455]: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
Jan 25 09:13:41 myhost1 cm-server[60455]: ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the con...n logging.
Jan 25 09:13:45 myhost1 cm-server[60455]: 09:13:45.471 [main] ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - ERROR: relation "cm_version" does not exist
Jan 25 09:13:45 myhost1 cm-server[60455]: Position: 21
Hint: Some lines were ellipsized, use -l to show in full.

Also checked logs from /var/log/cloudera-scm-server/cloudera-scm-server.log

2023-01-25 09:16:52,245 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Persisting new CMCA to database
2023-01-25 09:16:52,252 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Configuring CM to turn on Auto-TLS
2023-01-25 09:16:52,254 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: AGENT_TLS
2023-01-25 09:16:52,259 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: WEB_TLS
2023-01-25 09:16:52,261 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: NEED_AGENT_VALIDATION
2023-01-25 09:16:52,263 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: KEYSTORE_PATH
2023-01-25 09:16:52,265 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: KEYSTORE_PASSWORD
2023-01-25 09:16:52,267 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: TRUSTSTORE_PATH
2023-01-25 09:16:52,269 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: TRUSTSTORE_PASSWORD
2023-01-25 09:16:52,271 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: HOST_CERT_GENERATOR
2023-01-25 09:16:52,274 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: SSL_CERTIFICATE_HOSTNAME
2023-01-25 09:16:52,276 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: AUTO_TLS_KEYSTORE_PASSWORD
2023-01-25 09:16:52,278 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: AUTO_TLS_TRUSTSTORE_PASSWORD
2023-01-25 09:16:52,280 INFO scm-web-107:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: AUTO_TLS_TYPE
2023-01-25 09:16:52,282 INFO scm-web-107:com.cloudera.cmf.command.flow.CmdStep: Executing command 1546333793 work: Configure the services on this cluster for Auto-TLS.
2023-01-25 09:16:52,282 INFO scm-web-107:com.cloudera.cmf.command.ConfigureAutoTlsServicesCmdWork: Configuring existing services to use Auto-TLS
2023-01-25 09:16:52,284 INFO scm-web-107:com.cloudera.cmf.model.DbCommand: Command 1546333793(GenerateCMCACommand) has completed. finalstate:FINISHED, success:true, msg:Suc
cessfully generated CMCA and enabled Auto-TLS
2023-01-25 09:16:52,286 INFO scm-web-107:com.cloudera.cmf.service.ServiceHandlerRegistry: Global Command GenerateCMCACommand launched with id=1546333793
2023-01-25 09:16:52,347 INFO scm-web-107:com.cloudera.cmf.service.ServiceHandlerRegistry: Executing Global command ProcessStalenessCheckCommand BasicCmdArgs{args=[First reason why: com.cloudera.cmf.model.DbConfigContainer.configsForDb (#2) has changed]}.
2023-01-25 09:16:52,347 INFO scm-web-107:com.cloudera.cmf.command.flow.CmdStep: Executing command 1546333810 work: Execute 1 steps in sequence
2023-01-25 09:16:52,347 INFO scm-web-107:com.cloudera.cmf.command.flow.CmdStep: Executing command 1546333810 work: Configuration Staleness Check
2023-01-25 09:16:52,347 INFO scm-web-107:com.cloudera.cmf.service.ServiceHandlerRegistry: Global Command ProcessStalenessCheckCommand launched with id=1546333810
2023-01-25 09:16:52,355 INFO CommandPusher-1:com.cloudera.server.cmf.CommandPusherThread: Acquired lease lock on DbCommand:1546333810
2023-01-25 09:16:52,361 INFO ProcessStalenessDetector-0:com.cloudera.cmf.service.config.components.ProcessStalenessDetector: Queuing staleness check with FULL_CHECK for 0/0 roles.
2023-01-25 09:16:52,361 INFO ProcessStalenessDetector-0:com.cloudera.cmf.service.config.components.ProcessStalenessDetector: Staleness check done. Duration: PT0.001S
2023-01-25 09:16:52,361 INFO ProcessStalenessDetector-0:com.cloudera.cmf.service.config.components.ProcessStalenessDetector: Staleness check execution stats: average=0ms, min=0ms, max=0ms.
2023-01-25 09:16:52,365 INFO CommandPusher-1:com.cloudera.server.cmf.CommandPusherThread: Acquired lease lock on DbCommand:1546333810
2023-01-25 09:16:52,365 INFO scm-web-107:com.cloudera.enterprise.JavaMelodyFacade: Exiting HTTP Operation: Method:POST, Path:/v45/cm/commands/generateCmca, Status:200
2023-01-25 09:16:52,369 INFO CommandPusher-1:com.cloudera.cmf.model.DbCommand: Command 1546333810(ProcessStalenessCheckCommand) has completed. finalstate:FINISHED, success:true, msg:Successfully finished checking for configuration staleness.
2023-01-25 09:16:52,369 INFO CommandPusher-1:com.cloudera.cmf.command.components.CommandStorage: Invoked delete temp files for command:DbCommand{id=1546333810, name=ProcessStalenessCheckCommand} at dir:/var/lib/cloudera-scm-server/temp/commands/1546333810
2023-01-25 09:17:39,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:17:40,781 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 09:18:41,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:18:42,781 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 09:19:43,914 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:19:44,781 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 09:20:45,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:20:46,780 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 09:21:46,781 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (29 skipped) Synced up
2023-01-25 09:21:47,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:22:48,781 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 09:22:49,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:23:50,781 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 09:23:51,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:24:43,790 INFO StaleEntityEviction:com.cloudera.server.cmf.StaleEntityEvictionThread: Reaped total of 0 deleted commands
2023-01-25 09:24:43,804 INFO StaleEntityEviction:com.cloudera.server.cmf.StaleEntityEvictionThread: Found no commands older than 2021-01-25T08:24:43.790Z to reap.
2023-01-25 09:24:43,804 INFO StaleEntityEviction:com.cloudera.server.cmf.StaleEntityEvictionThread: Wizard is active, not reaping scanners or configurators
2023-01-25 09:24:52,780 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 09:24:53,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:25:54,781 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 09:25:55,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:26:54,781 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (29 skipped) Synced up
2023-01-25 09:26:57,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:27:56,780 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 09:27:59,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:28:58,780 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 09:29:01,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 09:30:00,780 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 09:30:03,779 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up

When logging in to the CM web UI, I can see an "Add Private Cloud Base Cluster" style wizard, with:

AutoTLS has already been enabled.
A KDC is currently not configured. This means you cannot create Kerberized clusters.

In /cmf/home there is no cluster added.

When running the same configuration, but using both the autotls and tls tags, the playbook fails with a different error:

TASK [cloudera.cluster.autotls : Enable Auto-TLS] ******************************
Wednesday 25 January 2023  09:21:00 +0000 (0:00:00.217)       0:16:03.749 *****
fatal: [myhost1.domain.com]: FAILED! => {"cache_control": "no-cache, no-store, max-age=0, must-revalidate", "changed": false, "connection": "close", "content": "{\n  \"id\" : 1546333829,\n  \"name\" : \"GenerateCMCACommand\",\n  \"startTime\" : \"2023-01-25T09:21:01.475Z\",\n  \"endTime\" : \"2023-01-25T09:21:16.212Z\",\n  \"active\" : false,\n  \"success\" : false,\n  \"resultMessage\" : \"Failed to enable Auto-TLS\",\n  \"children\" : {\n    \"items\" : [ ]\n  }\n}", "content_type": "application/json;charset=utf-8", "cookies": {"SESSION": "5c861199-d6a7-4084-8ca3-e7fa716d8c08"}, "cookies_string": "SESSION=5c861199-d6a7-4084-8ca3-e7fa716d8c08", "date": "Wed, 25 Jan 2023 09:21:16 GMT", "elapsed": 14, "expires": "Thu, 01 Jan 1970 00:00:00 GMT", "json": {"active": false, "children": {"items": []}, "endTime": "2023-01-25T09:21:16.212Z", "id": 1546333829, "name": "GenerateCMCACommand", "resultMessage": "Failed to enable Auto-TLS", "startTime": "2023-01-25T09:21:01.475Z", "success": false}, "msg": "OK (unknown bytes)", "pragma": "no-cache", "redirected": false, "set_cookie": "SESSION=5c861199-d6a7-4084-8ca3-e7fa716d8c08; Path=/; Secure; HttpOnly", "status": 200, "strict_transport_security": "max-age=31536000 ; includeSubDomains", "url": "https://myhost1.domain.com:7183/api/v45/cm/commands/generateCmca", "x_content_type_options": "nosniff", "x_frame_options": "DENY", "x_xss_protection": "1; mode=block"}

cloudera-scm-server status

● cloudera-scm-server.service - Cloudera CM Server Service
   Loaded: loaded (/usr/lib/systemd/system/cloudera-scm-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2023-01-25 10:19:25 CET; 8min ago
  Process: 126100 ExecStartPre=/opt/cloudera/cm/bin/cm-server-pre (code=exited, status=0/SUCCESS)
 Main PID: 126105 (java)
    Tasks: 137
   Memory: 2.5G
   CGroup: /system.slice/cloudera-scm-server.service
           └─126105 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64/bin/java -cp .:/usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector...

Jan 25 10:19:25 myhost1 systemd[1]: Starting Cloudera CM Server Service...
Jan 25 10:19:25 myhost1 systemd[1]: Started Cloudera CM Server Service.
Jan 25 10:19:25 myhost1 cm-server[126105]: JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64
Jan 25 10:19:25 myhost1 cm-server[126105]: OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
Jan 25 10:19:26 myhost1 cm-server[126105]: ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the co...n logging.
Hint: Some lines were ellipsized, use -l to show in full.

Some interesting logs from /var/log/cloudera-scm-server/cloudera-scm-server.log:

2023-01-25 10:20:53,359 INFO scm-web-112:com.cloudera.enterprise.JavaMelodyFacade: Entering HTTP Operation: Method:PUT, Path:/v45/users/admin
2023-01-25 10:20:53,411 INFO scm-web-112:com.cloudera.enterprise.JavaMelodyFacade: Exiting HTTP Operation: Method:PUT, Path:/v45/users/admin, Status:200
2023-01-25 10:20:56,186 INFO scm-web-115:com.cloudera.server.web.cmf.AuthenticationFailureEventListener: Authentication failure for user: 'admin' from 53.250.49.126
2023-01-25 10:20:58,873 INFO scm-web-128:com.cloudera.server.web.cmf.AuthenticationFailureEventListener: Authentication failure for user: 'admin' from 53.250.49.126
2023-01-25 10:20:59,750 INFO scm-web-104:com.cloudera.server.web.cmf.AuthenticationSuccessEventListener: Authentication success for user: 'admin' from 53.250.49.126
2023-01-25 10:21:00,081 INFO scm-web-105:com.cloudera.server.web.cmf.AuthenticationSuccessEventListener: Authentication success for user: 'admin' from 53.250.49.126
2023-01-25 10:21:01,042 INFO scm-web-110:com.cloudera.server.web.cmf.AuthenticationSuccessEventListener: Authentication success for user: 'admin' from 53.250.49.126
2023-01-25 10:21:01,412 INFO scm-web-119:com.cloudera.server.web.cmf.AuthenticationSuccessEventListener: Authentication success for user: 'admin' from 53.250.49.126
2023-01-25 10:21:01,416 INFO scm-web-119:com.cloudera.enterprise.JavaMelodyFacade: Entering HTTP Operation: Method:POST, Path:/v45/cm/commands/generateCmca
2023-01-25 10:21:01,465 INFO scm-web-119:com.cloudera.cmf.service.ServiceHandlerRegistry: Executing Global command GenerateCMCACommand GenerateCmcaCmdArgs{sshPort=22, userN
ame=root, password=REDACTED, passphrase=REDACTED, privateKey=REDACTED, customCA=false, interpretAsFilenames=true, additionalArguments=null, location=}.
2023-01-25 10:21:01,478 INFO scm-web-119:com.cloudera.cmf.command.flow.CmdStep: Executing command 1546333829 work: Execute 7 steps in sequence
2023-01-25 10:21:01,479 INFO scm-web-119:com.cloudera.cmf.command.flow.CmdStep: Executing command 1546333829 work: Generate a CMCA and enable Auto-TLS.
2023-01-25 10:21:01,487 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Skip disabling init file as host certificate generator was not generate_host_cert
2023-01-25 10:21:01,487 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Storing CMCA in database for HA
2023-01-25 10:21:01,487 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Creating temporary directory for CA generation.
2023-01-25 10:21:01,488 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Generating CMCA
2023-01-25 10:21:01,490 INFO scm-web-119:com.cloudera.cmf.command.CertmanagerRunner: Running CMCA command with args: [setup, --rotate, --configure-services, --skip-cm-init,
 --override, keystore_type=jks]
2023-01-25 10:21:03,076 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Persisting new CMCA to database
2023-01-25 10:21:03,081 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Configuring CM to turn on Auto-TLS
2023-01-25 10:21:03,083 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: AGENT_TLS
2023-01-25 10:21:03,083 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: WEB_TLS
2023-01-25 10:21:03,083 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: NEED_AGENT_VALIDATION
2023-01-25 10:21:03,083 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: KEYSTORE_PATH
2023-01-25 10:21:03,084 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: KEYSTORE_PASSWORD
2023-01-25 10:21:03,084 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: TRUSTSTORE_PATH
2023-01-25 10:21:03,084 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: TRUSTSTORE_PASSWORD
2023-01-25 10:21:03,084 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: HOST_CERT_GENERATOR
2023-01-25 10:21:03,092 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: SSL_CERTIFICATE_HOSTNAME
2023-01-25 10:21:03,096 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: AUTO_TLS_KEYSTORE_PASSWORD
2023-01-25 10:21:03,098 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: AUTO_TLS_TRUSTSTORE_PASSWORD
2023-01-25 10:21:03,101 INFO scm-web-119:com.cloudera.cmf.command.GenerateCmcaCmdWork: Setting TLS configuration: AUTO_TLS_TYPE
2023-01-25 10:21:03,105 INFO scm-web-119:com.cloudera.cmf.command.flow.CmdStep: Executing command 1546333829 work: Generates TLS keys and certificates for a host and instal
l them using SSH
2023-01-25 10:21:03,105 INFO scm-web-119:com.cloudera.cmf.command.GenerateHostCertsCmdWork: Generating host certs for host: myhost1.domain.com
2023-01-25 10:21:03,117 INFO scm-web-119:com.cloudera.cmf.command.GenerateHostCertsCmdWork: Using host certificate generator command: {{TEMP_DIR}}
2023-01-25 10:21:03,117 INFO scm-web-119:com.cloudera.server.cmf.node.HostCertConfigurator: Creating temporary directory for certificate generation.
2023-01-25 10:21:03,126 INFO scm-web-119:com.cloudera.server.cmf.node.HostCertConfigurator: Using host certificate generator command: /opt/cloudera/cm-agent/bin/certmanager
 --location /tmp/generateHostCerts583464968626382515 gen_node_cert --output=-
2023-01-25 10:21:04,451 INFO scm-web-119:net.schmizz.sshj.common.SecurityUtils: BouncyCastle already registered as a JCE provider
2023-01-25 10:21:04,527 INFO scm-web-119:net.schmizz.sshj.transport.TransportImpl: Client identity string: SSH-2.0-SSHJ_0_14_0
2023-01-25 10:21:04,538 INFO scm-web-119:net.schmizz.sshj.transport.TransportImpl: Server identity string: SSH-2.0-OpenSSH_7.4
2023-01-25 10:21:06,975 WARN scm-web-119:com.cloudera.server.cmf.node.SSHConfigurator: Could not authenticate to myhost1.domain.com
net.schmizz.sshj.userauth.UserAuthException: Exhausted available authentication methods

2023-01-25 10:21:06,977 INFO scm-web-119:net.schmizz.sshj.transport.TransportImpl: Disconnected - BY_APPLICATION
2023-01-25 10:21:06,978 WARN scm-web-119:com.cloudera.cmf.command.GenerateHostCertsCmdWork: Error generating certificates. Retrying in 2000 ms.
2023-01-25 10:21:08,979 INFO scm-web-119:net.schmizz.sshj.transport.TransportImpl: Client identity string: SSH-2.0-SSHJ_0_14_0
2023-01-25 10:21:08,996 INFO scm-web-119:net.schmizz.sshj.transport.TransportImpl: Server identity string: SSH-2.0-OpenSSH_7.4
2023-01-25 10:21:10,917 WARN scm-web-119:com.cloudera.server.cmf.node.SSHConfigurator: Could not authenticate to myhost1.domain.com
net.schmizz.sshj.userauth.UserAuthException: Exhausted available authentication methods


2023-01-25 10:21:10,919 INFO scm-web-119:net.schmizz.sshj.transport.TransportImpl: Disconnected - BY_APPLICATION
2023-01-25 10:21:10,920 WARN scm-web-119:com.cloudera.cmf.command.GenerateHostCertsCmdWork: Error generating certificates. Retrying in 3000 ms.
2023-01-25 10:21:13,921 INFO scm-web-119:net.schmizz.sshj.transport.TransportImpl: Client identity string: SSH-2.0-SSHJ_0_14_0
2023-01-25 10:21:13,936 INFO scm-web-119:net.schmizz.sshj.transport.TransportImpl: Server identity string: SSH-2.0-OpenSSH_7.4
2023-01-25 10:21:15,271 INFO LDAP Login Monitor thread:com.cloudera.cmf.service.auth.AbstractExternalServerLoginMonitor: LDAP monitoring is disabled.
2023-01-25 10:21:15,272 INFO KDC Login Monitor thread:com.cloudera.cmf.service.auth.AbstractExternalServerLoginMonitor: KDC monitoring is disabled.
2023-01-25 10:21:16,204 WARN scm-web-119:com.cloudera.server.cmf.node.SSHConfigurator: Could not authenticate to myhost1.domain.com
net.schmizz.sshj.userauth.UserAuthException: Exhausted available authentication methods


2023-01-25 10:21:16,206 INFO scm-web-119:net.schmizz.sshj.transport.TransportImpl: Disconnected - BY_APPLICATION
2023-01-25 10:21:16,212 ERROR scm-web-119:com.cloudera.cmf.command.GenerateHostCertsCmdWork: Error generating certificates: java.lang.IllegalStateException: Not authenticat
ed

2023-01-25 10:21:16,212 ERROR scm-web-119:com.cloudera.cmf.command.flow.WorkOutputs: CMD id: 1546333829 Failed to generate and install host certificates
2023-01-25 10:21:16,212 ERROR scm-web-119:com.cloudera.cmf.model.DbCommand: Command 1546333829(GenerateCMCACommand) has completed. finalstate:FINISHED, success:false, msg:F
ailed to enable Auto-TLS
2023-01-25 10:21:16,218 INFO scm-web-119:com.cloudera.cmf.service.ServiceHandlerRegistry: Global Command GenerateCMCACommand launched with id=1546333829
2023-01-25 10:21:16,241 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Cleaned up
2023-01-25 10:21:16,263 INFO scm-web-119:com.cloudera.cmf.service.ServiceHandlerRegistry: Executing Global command ProcessStalenessCheckCommand BasicCmdArgs{args=[First rea
son why: com.cloudera.cmf.model.DbConfig.valueForDb (#1546333786) has changed]}.
2023-01-25 10:21:16,264 INFO scm-web-119:com.cloudera.cmf.command.flow.CmdStep: Executing command 1546333839 work: Execute 1 steps in sequence
2023-01-25 10:21:16,264 INFO scm-web-119:com.cloudera.cmf.command.flow.CmdStep: Executing command 1546333839 work: Configuration Staleness Check
2023-01-25 10:21:16,264 INFO scm-web-119:com.cloudera.cmf.service.ServiceHandlerRegistry: Global Command ProcessStalenessCheckCommand launched with id=1546333839
2023-01-25 10:21:16,275 INFO CommandPusher-1:com.cloudera.server.cmf.CommandPusherThread: Acquired lease lock on DbCommand:1546333839
2023-01-25 10:21:16,281 INFO ProcessStalenessDetector-0:com.cloudera.cmf.service.config.components.ProcessStalenessDetector: Queuing staleness check with FULL_CHECK for 0/0 roles.
2023-01-25 10:21:16,282 INFO ProcessStalenessDetector-0:com.cloudera.cmf.service.config.components.ProcessStalenessDetector: Staleness check done. Duration: PT0.001S
2023-01-25 10:21:16,282 INFO ProcessStalenessDetector-0:com.cloudera.cmf.service.config.components.ProcessStalenessDetector: Staleness check execution stats: average=0ms, min=0ms, max=0ms.
2023-01-25 10:21:16,287 INFO CommandPusher-1:com.cloudera.server.cmf.CommandPusherThread: Acquired lease lock on DbCommand:1546333839
2023-01-25 10:21:16,289 INFO scm-web-119:com.cloudera.enterprise.JavaMelodyFacade: Exiting HTTP Operation: Method:POST, Path:/v45/cm/commands/generateCmca, Status:200
2023-01-25 10:21:16,294 INFO CommandPusher-1:com.cloudera.cmf.model.DbCommand: Command 1546333839(ProcessStalenessCheckCommand) has completed. finalstate:FINISHED, success:true, msg:Successfully finished checking for configuration staleness.
2023-01-25 10:21:16,295 INFO CommandPusher-1:com.cloudera.cmf.command.components.CommandStorage: Invoked delete temp files for command:DbCommand{id=1546333839, name=ProcessStalenessCheckCommand} at dir:/var/lib/cloudera-scm-server/temp/commands/1546333839
2023-01-25 10:21:17,244 INFO pool-6-thread-1:com.cloudera.server.cmf.components.CmServerStateSynchronizer: (30 skipped) Synced up
2023-01-25 10:21:50,642 INFO avro-servlet-hb-processor-3:com.cloudera.server.common.AgentAvroServlet: (25 skipped) AgentAvroServlet: heartbeat processing stats: average=21ms, min=4ms, max=155ms.
2023-01-25 10:57:24,606 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
2023-01-25 10:59:22,544 ERROR main:com.cloudera.server.cmf.bootstrap.EntityManagerFactoryBean: Could not read license file /etc/cloudera-scm-server/license.txt
2023-01-25 11:00:10,064 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
2023-01-25 11:00:12,183 WARN MainThread:org.eclipse.jetty.security.SecurityHandler: [email protected]@fa85d63{/,null,STARTING} has uncovered http methods for path: /*
2023-01-25 11:00:12,399 ERROR MainThread:com.cloudera.enterprise.TLSUtil: Could not determine if current JDK can perform secure SSL/TLS renegotiation. Defaulting to no-renegotiations.
2023-01-25 11:00:12,514 WARN WebServerImpl:org.eclipse.jetty.security.SecurityHandler: [email protected]@2c9d79fb{/,file:///opt/cloudera/cm/webapp/,STARTING}{/opt/cloudera/cm/webapp} has uncovered http methods for path: /*

I have also tried putting the whole private key file content into the variable host_ssh_private_key created in

/opt/cldr-runner/collections/ansible_collections/cloudera/cluster/roles/cloudera_manager/autotls/defaults/main.yml

https://github.com/cloudera-labs/cloudera.cluster/blob/main/roles/cloudera_manager/autotls/defaults/main.yml

and used this variable in this file

/opt/cldr-runner/collections/ansible_collections/cloudera/cluster/roles/cloudera_manager/autotls/templates/request.j2

https://github.com/cloudera-labs/cloudera.cluster/blob/main/roles/cloudera_manager/autotls/templates/request.j2

The private key content had to be a single line, with \n instead of newlines.

When running with the tags default_cluster,kerberos,autotls,tls, with tls=true in inventory_static.ini and tls: true in the security section of the cluster/mgmt cluster definitions, I got the following error:

TASK [cloudera.cluster.autotls : Enable Auto-TLS] ******************************
Friday 27 January 2023  14:40:10 +0000 (0:00:00.095)       0:15:03.414 ********
fatal: [myhost1.domain.com]: FAILED! => {"cache_control": "no-cache, no-store, max-age=0, must-revalidate", "changed": false, "connection": "close", "content": "{\n  \"id\" : 1546333829,\n  \"name\" : \"GenerateCMCACommand\",\n  \"startTime\" : \"2023-01-27T14:40:10.982Z\",\n  \"endTime\" : \"2023-01-27T14:40:19.660Z\",\n  \"active\" : false,\n  \"success\" : false,\n  \"resultMessage\" : \"Failed to enable Auto-TLS\",\n  \"children\" : {\n    \"items\" : [ ]\n  }\n}", "content_type": "application/json;charset=utf-8", "cookies": {"SESSION": "698ea13f-400c-4f63-aa9c-b69f6efd2cf4"}, "cookies_string": "SESSION=698ea13f-400c-4f63-aa9c-b69f6efd2cf4", "date": "Fri, 27 Jan 2023 14:40:19 GMT", "elapsed": 8, "expires": "Thu, 01 Jan 1970 00:00:00 GMT", "json": {"active": false, "children": {"items": []}, "endTime": "2023-01-27T14:40:19.660Z", "id": 1546333829, "name": "GenerateCMCACommand", "resultMessage": "Failed to enable Auto-TLS", "startTime": "2023-01-27T14:40:10.982Z", "success": false}, "msg": "OK (unknown bytes)", "pragma": "no-cache", "redirected": false, "set_cookie": "SESSION=698ea13f-400c-4f63-aa9c-b69f6efd2cf4; Path=/; Secure; HttpOnly", "status": 200, "strict_transport_security": "max-age=31536000 ; includeSubDomains", "url": "https://myhost1.domain.com:7183/api/v45/cm/commands/generateCmca", "x_content_type_options": "nosniff", "x_frame_options": "DENY", "x_xss_protection": "1; mode=block"}

logs:

2023-01-27 15:40:19,652 WARN scm-web-114:com.cloudera.server.cmf.node.SSHConfigurator: Could not authenticate to myhost1.domain.com
net.schmizz.sshj.userauth.UserAuthException: Exhausted available authentication methods


Caused by: net.schmizz.sshj.userauth.UserAuthException: Problem getting public key from PKCS8KeyFile{resource=[PrivateKeyStringResource]}


Caused by: java.io.IOException: unrecognised object: OPENSSH PRIVATE KEY



2023-01-27 15:40:19,654 INFO scm-web-114:net.schmizz.sshj.transport.TransportImpl: Disconnected - BY_APPLICATION
2023-01-27 15:40:19,660 ERROR scm-web-114:com.cloudera.cmf.command.GenerateHostCertsCmdWork: Error generating certificates: java.lang.IllegalStateException: Not authenticat
ed

2023-01-27 15:40:19,660 ERROR scm-web-114:com.cloudera.cmf.command.flow.WorkOutputs: CMD id: 1546333829 Failed to generate and install host certificates
2023-01-27 15:40:19,660 ERROR scm-web-114:com.cloudera.cmf.model.DbCommand: Command 1546333829(GenerateCMCACommand) has completed. finalstate:FINISHED, success:false, msg:Failed to enable Auto-TLS

Caused by: java.io.IOException: unrecognised object: OPENSSH PRIVATE KEY indicates that CM somehow still cannot read the private key.
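
Not part of the original report, but for illustration: since the error shows that CM's SSH library does not recognise the newer OpenSSH key format, one hedged workaround is to re-encode the key in the older PEM format before handing it to CM, e.g. with a task such as:

- name: Re-encode the SSH private key as PEM (sketch; path is a placeholder)
  ansible.builtin.command: ssh-keygen -p -m PEM -N "" -P "" -f /path/to/ssh_private_key
  # assumes the key currently has no passphrase; -m PEM rewrites the key file in PEM format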

Hitting limit of number of ACLs with XFS file system on CentOS 7

Hello,

this issue happens with the v2 branch and CentOS 7 (7.9 to be specific, fully up to date).

While running task Add ACLs to keystore in file roles/security/tls_generate_csr/tasks/acls.yml, started from playbook prepare_tls.yml, which itself is started from site.yml, we are getting an error when adding an ACL to file /opt/cloudera/security/pki/HOSTNAME_REMOVED.jks.

The error is easily reproducible without Ansible:

# /bin/setfacl -m group:zookeeper:r /opt/cloudera/security/pki/HOSTNAME_REMOVED.jks
setfacl: /opt/cloudera/security/pki/HOSTNAME_REMOVED.jks: Argument list too long

The reason seems to be that the underlying file system, which is XFS, does not support more than 21 ACL entries. This is easily tested as follows.

># uname -r
3.10.0-1160.25.1.el7.x86_64
># pwd
/root
># mount|grep root
/dev/mapper/centos_root on / type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
># touch file
># for g in $(cut -d ':' -f 1 /etc/group) ; do echo "Adding group: ${g}" ; setfacl -m group:${g}:r file ; done
Adding group: root
Adding group: bin
Adding group: daemon
Adding group: sys
Adding group: adm
Adding group: tty
Adding group: disk
Adding group: lp
Adding group: mem
Adding group: kmem
Adding group: wheel
Adding group: cdrom
Adding group: mail
Adding group: man
Adding group: dialout
Adding group: floppy
Adding group: games
Adding group: tape
Adding group: video
Adding group: ftp
Adding group: lock
Adding group: audio
setfacl: file: Argument list too long
Adding group: nobody
setfacl: file: Argument list too long
Adding group: users
setfacl: file: Argument list too long
Adding group: utmp
setfacl: file: Argument list too long
Adding group: utempter
setfacl: file: Argument list too long
Adding group: ssh_keys
setfacl: file: Argument list too long
Adding group: avahi-autoipd
setfacl: file: Argument list too long
Adding group: input
setfacl: file: Argument list too long
Adding group: systemd-journal
setfacl: file: Argument list too long
Adding group: systemd-bus-proxy
setfacl: file: Argument list too long
Adding group: systemd-network
setfacl: file: Argument list too long
Adding group: dbus
setfacl: file: Argument list too long
Adding group: polkitd
setfacl: file: Argument list too long
Adding group: dip
setfacl: file: Argument list too long
Adding group: tss
setfacl: file: Argument list too long
Adding group: postdrop
setfacl: file: Argument list too long
Adding group: postfix
setfacl: file: Argument list too long
Adding group: chrony
setfacl: file: Argument list too long
Adding group: sshd
setfacl: file: Argument list too long
># getfacl file
# file: file
# owner: root
# group: root
user::rw-
group::r--
group:root:r--
group:bin:r--
group:daemon:r--
group:sys:r--
group:adm:r--
group:tty:r--
group:disk:r--
group:lp:r--
group:mem:r--
group:kmem:r--
group:wheel:r--
group:cdrom:r--
group:mail:r--
group:man:r--
group:dialout:r--
group:floppy:r--
group:games:r--
group:tape:r--
group:video:r--
group:ftp:r--
group:lock:r--
mask::r--
other::r--

In summary, adding too many ACLs to the keystore fails. As XFS is the default file system on CentOS 7, as far as I know, I wonder why this problem has not been seen as often as expected. Maybe there are more conditions to be met in order to hit this problem.

A possible solution would be to create a group, add all the relevant users (like zookeeper) to that group, and give that group read permission on /opt/cloudera/security/pki/HOSTNAME_REMOVED.jks, as sketched below.
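
For illustration only (not part of the original report), a hedged Ansible sketch of that group-based alternative; the group name and the user list are placeholders:

- name: Create a shared keystore-readers group (name is a placeholder)
  ansible.builtin.group:
    name: tls-keystore-readers
    state: present

- name: Add the service accounts to the shared group (list is illustrative)
  ansible.builtin.user:
    name: "{{ item }}"
    groups: tls-keystore-readers
    append: true
  loop:
    - zookeeper
    - hdfs
    - solr

- name: Grant read access via group ownership instead of per-user ACLs
  ansible.builtin.file:
    path: /opt/cloudera/security/pki/HOSTNAME_REMOVED.jks
    group: tls-keystore-readers
    mode: "0640"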

Add CM 7.11.3 / CDP 7.1.9 support

Using this collection with CM 7.11.3 / CDP 7.1.9 does not work:
some properties added implicitly by the collection do not exist in this version.

For example:

  • roles/config/cluster/base/templates/configs/logdirs-ranger-spooldirs.j2
    • ozone.log.dir
    • ranger_atlas_plugin_hdfs_audit_spool_directory
    • ranger_atlas_plugin_solr_audit_spool_directory
    • ranger_kafka_plugin_hdfs_audit_spool_directory
    • ranger_kafka_plugin_solr_audit_spool_directory
    • gateway_ranger_knox_plugin_hdfs_audit_spool_directory
    • gateway_ranger_knox_plugin_solr_audit_spool_directory

I'm working on creating a logdirs-ranger-spooldirs-7.1.9.j2 to fix this.

Variable not getting set to default

Although the variable pvc_type is defined and given a default in roles/config/cluster/common/defaults/main.yml, when it is used in roles/infrastructure/krb5_server/tasks/freeipa.yml it fails with a variable-not-defined error.

By contrast, roles/infrastructure/krb5_client/tasks/freeipa.yml first checks whether pvc_type is defined before using it.
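A minimal sketch of the pattern the krb5_client role already uses, applied to any task that dereferences pvc_type; the task body below is purely illustrative and not taken from the role:

- name: Example task that dereferences pvc_type safely
  ansible.builtin.debug:
    msg: "Provisioning Kerberos for a {{ pvc_type }} deployment"
  when: pvc_type is defined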

Cluster build fails when using Ubuntu distribution

We are using these playbooks to deploy CDP to a group of Ubuntu 18.04 servers. We have been having a good bit of trouble getting past the cluster template / cluster import steps in the deployment/cluster role.

These are the two tasks specifically.

- name: Generate cluster template file
  template:
    src: cluster_template/main.j2
    dest: /tmp/cluster_template_{{ cluster.name | replace(' ','_') }}.json
    mode: 0600
  #when: cluster_template_dry_run

- name: Import cluster template
  cloudera.cluster.cm_api:
    endpoint: /cm/importClusterTemplate?addRepositories=true
    method: POST
    body: "{{ lookup('template', 'cluster_template/main.j2', convert_data=False) }}"
  register: cluster_template_result
  ignore_errors: yes
  when: not cluster_template_dry_run

What we found was that the products entry in the generated cluster template points to parcels that are not compatible with the Ubuntu distribution. Here is the relevant snippet from the Ansible-generated cluster template:

"repositories" : ["https://archive.cloudera.com/cdh7/7.1.7.0/parcels/"],
    "products" : [{"product": "CDH", "version": "7.1.7-1.cdh7.1.7.p0.15945976"}, {"product": "KEYTRUSTEE_SERVER", "version": "7.1.7.0-1.keytrustee7.1.7.0.p0.15945976"}],

If you browse the parcel repository above, you'll see that there are no KEYTRUSTEE_SERVER parcels for either Ubuntu focal or bionic. This causes the cluster install to fail with an error on that parcel.

After digging, it looks like the extract_products_from_manifests filter in filters.py does not account for the OS distribution and incorrectly includes incompatible parcels in the products attribute.

We were able to get around this issue with the following modification to filters.py:

# Requires "import re" and "from typing import Optional" at the top of filters.py.
def extract_products_from_manifests(
    manifests, os_distribution: Optional[str] = None
):
    products = dict()
    for manifest in manifests:
        for parcel in manifest["parcels"]:
            # fetch the full parcel name from the manifest
            full_parcel_name = str(parcel["parcelName"])
            # the parcel OS distribution sits between the last "-" and the ".parcel" extension
            parcel_os_distribution = full_parcel_name[
                full_parcel_name.rindex("-") + 1 : full_parcel_name.rindex(".parcel")
            ]
            # strip off the OS name and file extension
            parcel_name = re.sub(r"-[a-z0-9]+\.parcel$", "", full_parcel_name)
            # the product name is before the first dash
            product = parcel_name[: parcel_name.index("-")]
            if product not in products and (
                os_distribution == parcel_os_distribution or os_distribution is None
            ):
                # the version string is everything after the first dash
                version = parcel_name[parcel_name.index("-") + 1 :]
                products[product] = version
    return products

By default, the additions change nothing. However, you can now add something like this to the parcels.yml task file in the deployment/repometa role:

- name: Extract product details from parcel manifests
  set_fact:
    products: >
      {{ manifests.results
      | map(attribute='json')
      | list
      | cloudera.cluster.extract_products_from_manifests(os_distribution=ansible_distribution_release)
      | dict2items(key_name='product', value_name='version')
      }}
  run_once: true

This should now exclude the parcel from the product list when there is no compatible version for the ansible_distribution_release.

Please let me know if there is an alternate way of handling this. If not, I can submit a pull request for the above changes.

Freeipa autodns mode - with user search filter

When using the FreeIPA autodns mode, the Cloudera Manager external auth field "LDAP User Search Filter" is set to an Active Directory-style expression: it is set to "(sAMAccountName={0})" but should be "(uid={0})".

In order to execute a seamless Base + PvC Control Plane + any Data Services install, this will need to be corrected, because the Control Plane gets this information from CM and the Data Services need working LDAP for MagicSSO.

The confusing bit is that it looks like the CM settings are coming from:
https://github.com/cloudera-labs/cloudera.cluster/blob/main/roles/cloudera_manager/external_auth/templates/external_auth_configs.j2

and not at all from:
https://github.com/cloudera-labs/cloudera.cluster/blob/devel-pvc-update/roles/infrastructure/krb5_common/defaults/main.yml

Manual workarounds are possible, but this is an important area for proper automation in the long term.
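A hedged sketch of one such manual workaround pushed through the collection's cm_api action; the config key name LDAP_USER_SEARCH_FILTER and the JSON body shape are assumptions to verify against your CM version (for example by comparing with a GET of /cm/config?view=full):

- name: Override the LDAP user search filter for FreeIPA
  cloudera.cluster.cm_api:
    endpoint: /cm/config
    method: PUT
    body: "{{ {'items': [{'name': 'LDAP_USER_SEARCH_FILTER', 'value': '(uid={0})'}]} | to_json }}"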

Database port typo

The variables database_port and cloudera_manager_database_port cannot be used.

The templates wrongly reference database_type instead, and fall back to cloudera.cluster.default_database_port.

Examples :

Documentation needs to be fixed too
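A hedged sketch of what the corrected default could look like; the exact variable and filter wiring (database_port, database_type, cloudera.cluster.default_database_port) is inferred from this issue, not verified against the role:

cloudera_manager_database_port: "{{ database_port | default(database_type | cloudera.cluster.default_database_port) }}"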

Cloudera Manager API

Hi, I want to add external authentication (LDAPS) to the definition YAML.
Please advise how to use the Cloudera Manager API to see how it is configured in an existing cluster, and how to add it to the YAML definition for a new installation.
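As a starting point, a minimal (unofficial) sketch for reading the current external auth settings from an existing cluster with the collection's cm_api action; the LDAP-related items appear in the full view of the CM configuration:

- name: Read the current Cloudera Manager configuration
  cloudera.cluster.cm_api:
    endpoint: /cm/config?view=full
    method: GET
  register: cm_config

- name: Show the retrieved configuration
  ansible.builtin.debug:
    var: cm_config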

No CMS config update

There is no way to update the Cloudera Management Services; only a fresh install of the Management cluster works.

If you call the mgmt tasks against an existing deployment, they actually break CMS by overriding the databases and DB users with default values, without updating the mgmt configs.

TASK [cloudera.cluster.os : Disable unnecessary services] - fails due to service missing

Hi,
While deploying CDP Private Cloud Base (without the -t flag in the ansible-playbook command), an error occurred:

failed: [host1.example.com] (item=ip6tables) => {
    "ansible_loop_var": "item",
    "changed": false,
    "invocation": {
        "module_args": {
            "daemon_reexec": false,
            "daemon_reload": false,
            "enabled": false,
            "force": null,
            "masked": null,
            "name": "ip6tables",
            "no_block": false,
            "scope": "system",
            "state": "stopped"
        }
    },
    "item": "ip6tables",
    "msg": "Could not find the requested service ip6tables: host"
}

Task:
TASK [cloudera.cluster.os : Disable unnecessary services]

task path:
/root/.ansible/collections/ansible_collections/cloudera/cluster/roles/prereqs/os/tasks/main.yml

Temporary solution: add the line ignore_errors: yes to the task.

Permanent solution to implement: errors for services that are not present/not installed should be ignored, as sketched below.
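A minimal sketch of that permanent fix, assuming a loop variable name (unnecessary_services) that may differ from the role's actual variable; the task only fails when the error is something other than a missing unit:

- name: Disable unnecessary services
  ansible.builtin.systemd:
    name: "{{ item }}"
    state: stopped
    enabled: false
  loop: "{{ unnecessary_services }}"
  register: disable_result
  failed_when:
    - disable_result.failed | default(false)
    - "'Could not find the requested service' not in (disable_result.msg | default(''))"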

Best regards,
Jacek Cieslak

Teardown db - does not teardown pgsql 12 on rhel8

/roles/teardown/tasks/teardown_database.yml

The task on line 50, "Delete database (postgres)", does not tear down postgresql-12.service; the service is still running.
This causes a terminal error in the task on line 71, "Delete user (postgres)".

Observed the failure on RHEL 8 (not tested on RHEL 7, not tested with an older PostgreSQL on RHEL 8, not tested with MySQL).
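A hedged sketch of a possible fix, stopping the versioned PostgreSQL unit before the database and user teardown; the unit name postgresql-12 matches this report and would need to be parameterised for other versions and platforms:

- name: Stop and disable the PostgreSQL server before teardown
  ansible.builtin.service:
    name: postgresql-12
    state: stopped
    enabled: false
  ignore_errors: true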

Add support for custom config groups in cluster update

We can configure custom config groups for new clusters in CDP Base, but we cannot create new services with custom config groups in existing CDP Base clusters.

The service templates used when creating a cluster and when updating a cluster are different:

Behaviour when updating a cluster: for new services, the default config group is applied to all hosts.

Expected behaviour: for new services, the custom configuration groups should be applied.

Example use case: I have a customer on CDH 6 upgrading to CDP 7 with this Ansible collection. Their clusters have multiple configuration groups on Hive for the HMS, HS2, and Gateway roles. When upgrading, Hive on Tez is created as a new service with all instances in the default configuration group.

Change in the ApiClusterVersion enum breaks ECS cluster creation

In CM API v49 (CM 7.6.5) and CM API v45 (CM 7.5.5), the ApiClusterVersion data type changed
(Enum: CDH3, CDH3u4X, CDH4, CDH5, CDH6, CDH7, DATA_SERVICES1, UNKNOWN).

Previously, this enum had an element called EXPERIENCE1; the new value is DATA_SERVICES1.

A template in cloudera.cluster references the old value, which breaks on CM 7.6.5 and 7.5.5:
https://github.com/cloudera-labs/cloudera.cluster/blob/devel/roles/deployment/cluster/templates/cluster_template/ecs/clusters.j2
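A hedged sketch of making the template input version-aware; the variable names and the exact CM version boundary are assumptions (the enum changed in specific maintenance releases, so the comparison below is illustrative only):

- name: Select the ApiClusterVersion value for ECS clusters
  ansible.builtin.set_fact:
    ecs_cluster_version: "{{ 'DATA_SERVICES1' if cloudera_manager_version is version('7.6.5', '>=') else 'EXPERIENCE1' }}"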

Databases creation Issue

Database creation behaviour changed between 3.3.0 & 3.4.0:

  • 3.3.0: all databases matching cluster services are created and their passwords updated; everything else is left untouched
  • 3.4.0: all default databases are created and their passwords updated

I think this is due to this commit :

  • 3.3.0: loop: "{{ databases | intersect(services) }}"
  • 3.4.0: loop: "{{ databases }}"

Impact: when upgrading from CDH 6.3.2 to CDP 7.1.7, I put Ranger (not Sentry) in the cluster services and provide Ranger DB settings, not Sentry DB settings.
The Ranger DB is created with the correct username and password, but for Sentry (which I want to leave untouched) the password is changed to the default one.
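A minimal sketch of the pre-3.4.0 behaviour restored; the task name and included file are placeholders, and only the loop expression is taken from the 3.3.0 snippet above:

- name: Create service databases
  ansible.builtin.include_tasks: create_database.yml
  loop: "{{ databases | intersect(services) }}"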

TASK [cloudera.cluster.kerberos : Import KDC admin credentials] -> failing

Problem:

TASK [cloudera.cluster.kerberos : Import KDC admin credentials] is failing

Investigation:

After some investigation, I noticed the ad_kdc_domain parameter was not set in CM when the task failed.
AD_KDC_DOMAIN is set to null in TASK [cloudera.cluster.config : Filter out null configs if necessary].


Logs:

TASK [cloudera.cluster.config : Get existing configs]
{\n "name" : "AD_KDC_DOMAIN",\n "value" : "OU=CDP,DC=cdadev,DC=company,DC=com",\n "sensitive" : false\n }

TASK [cloudera.cluster.config : Filter out null configs if necessary]
ok: [xx.cdadev.company.com] => {"ansible_facts": {"filtered_configs": {"AD_DELETE_ON_REGENERATE": true, "AD_KDC_DOMAIN": null, "KDC_HOST": "sv242216.cdadev.company.com", "KDC_TYPE": "Active Directory", "KRB_ENC_TYPES": "aes256-cts aes128-cts rc4-hmac", "SECURITY_REALM": "CDADEV.company.COM"}}, "changed": false}


cluster.yml:

cloudera_manager_options:
  custom_banner_html: "LAB-TEST"
  custom_header_color: "BLUE"
  krb_manage_krb5_conf: true
  ad_delete_on_regenerate: true
  ad_kdc_domain: "OU=CDP,DC=cdadev,DC=company,DC=com"


CM version:

7.3.1


Note:

If I manually access CM, fix the ad_kdc_domain value, and then use the "/cm/commands/importAdminCredentials" endpoint, it works properly.

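A hedged sketch of automating that manual workaround with the collection's cm_api action; the JSON body shape, the query-parameter form of the import endpoint, and the credential variable names (kdc_admin_user, kdc_admin_password) are assumptions to check against your CM API version:

- name: Restore ad_kdc_domain in Cloudera Manager
  cloudera.cluster.cm_api:
    endpoint: /cm/config
    method: PUT
    body: "{{ {'items': [{'name': 'AD_KDC_DOMAIN', 'value': 'OU=CDP,DC=cdadev,DC=company,DC=com'}]} | to_json }}"

- name: Re-import the KDC admin credentials
  cloudera.cluster.cm_api:
    endpoint: "/cm/commands/importAdminCredentials?username={{ kdc_admin_user }}&password={{ kdc_admin_password }}"
    method: POST
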
ECS 1.4.1 Blocked with IPA

Due to https://jira.cloudera.com/browse/DWX-13856 (internal cloudera JIRA).
Private Cloud ECS 1.4.1 clusters can only work with AD or MIT as the Kerberos provider. Although IPA is not formally supported, certain workarounds have enabled it on ECS 1.4.0 and earlier; this new issue in 1.4.1 breaks those IPA workarounds. It is targeted to be fixed in the next version of ECS, after which the workarounds should work again on IPA.

Many role references are relative, not FQCN

Hi @Chaffelson
I stumbled upon two cloudera_manager/config tasks not yet using the collection (FQCN) notation, in:
cloudera.cluster/roles/deployment/cluster/tasks/main.yml
cloudera.cluster/roles/cloudera_manager/external_auth/tasks/main.yml

(I have not looked for others yet, but if I find more I will update here.)
In case there's a reason they can't be migrated, I'd suggest adding a comment 👍

Originally posted by @lhoss in #1 (comment)

Cloudera API failing to import cluster template

msg": "Cluster template import failed. Result message: Cannot deserialize instance of java.lang.String out of START_ARRAY token\n at [Source: (org.apache.cxf.transport.http.AbstractHTTPDestination$1); line: 1, column: 681] (through reference chain: com.cloudera.api.model.ApiClusterTemplate["instantiator"]->com.cloudera.api.model.ApiClusterTemplateInstantiator["variables"]->java.util.ArrayList[0]->com.cloudera.api.model.ApiClusterTemplateVariable["value"])\n"

Security example

Hey,

in your V2 release there are some deployment examples (basic-7.1.x & sample).
Could you also provide an example of a secured deployment with Kerberos & TLS?

Thanks,
Yannik

Hive service dependency : spark_on_yarn on CDP

Using :
cloudera.cluster: 4.0.0-rc1

In the task cloudera.cluster.cluster : Import cluster template, I'm getting: Unknown configuration attribute ''spark_on_yarn_service'' for service (type: ''HIVE'', name: ''hive'').

Removing the line spark_on_yarn_service: spark_on_yarn from HIVE.SERVICEWIDE in cloudera/cluster/roles/config/cluster/base/templates/configs/inter-service-dependencies.j2 solved the issue.

cloudera_manager_host variable should use FQDN

My inventory looks like this.

[cloudera_manager]
cloudera-1 ansible_host=HOSTNAME_REMOVED ansible_ssh_user=root

...

Now, when running the playbooks, I get the following error.

TASK [cloudera_manager/license : Get current Cloudera license status] ****************************************************************************************************************
fatal: [cloudera-1]: FAILED! => {"changed": false, "content": "", "elapsed": 0, "msg": "Status code was -1 and not [200]: Request failed: <urlopen error [Errno -2] Name or service not known>", "redirected": false, "status": -1, "url": "http://cloudera-1:7180/api/version"}

As you can see, the playbook tries to reach cloudera-1, and not the fully qualified domain name (FQDN) of the host.

The reason seems to be that roles/cloudera_manager/common/defaults/main.yml sets cloudera_manager_host to the first entry in the [cloudera_manager] group, which is then used by roles/cloudera_manager/api_client/action_plugins/cm_api.py, the cm_api action used in the task above.

Applying the following change to roles/cloudera_manager/common/defaults/main.yml helps.

$ git diff roles/cloudera_manager/common/defaults/main.yml
diff --git a/roles/cloudera_manager/common/defaults/main.yml b/roles/cloudera_manager/common/defaults/main.yml
index fe98ef5..d13de70 100644
--- a/roles/cloudera_manager/common/defaults/main.yml
+++ b/roles/cloudera_manager/common/defaults/main.yml
@@ -15,7 +15,7 @@
 ---
 cloudera_manager_agent_config_file: /etc/cloudera-scm-agent/config.ini
 cloudera_manager_protocol: http
-cloudera_manager_host: "{{ groups.cloudera_manager | first | default('localhost') }}"
+cloudera_manager_host: "{{ hostvars[groups.cloudera_manager | first]['ansible_fqdn'] | default('localhost') }}"
 cloudera_manager_port: 7180
 cloudera_manager_database_embedded: False
 cloudera_manager_database_host: "{{ database_host }}"

(Depending on the intention, moving the default('localhost') inside the hostvars[] lookup may be more suitable; a sketch follows below.)
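A hedged sketch of that variant (untested); if the [cloudera_manager] group is empty, the trailing default still resolves to localhost:

cloudera_manager_host: "{{ hostvars[groups.cloudera_manager | first | default('localhost')]['ansible_fqdn'] | default('localhost') }}"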

Unfortunately, there is a new error when running the playbook now.

TASK [cloudera_manager/license : Get current Cloudera license status] ****************************************************************************************************************
fatal: [cloudera-1]: FAILED! => {"changed": false, "content": "", "elapsed": 0, "msg": "Status code was -1 and not [200]: Request failed: <urlopen error [Errno 101] Network is unreachable>", "redirected": false, "status": -1, "url": "http://HOSTNAME_REMOVED:7180/api/version"}

I have to do a little more digging in order to understand why the manager is not running.
