linux-system-roles / ha_cluster

Provide automation for Cluster - High Availability management
Home Page: https://linux-system-roles.github.io/ha_cluster/
License: MIT License
I am running the role against hosts with hostnames primary_8_6 and secondary_8_6. The task "Set hacluster password" fails on both with the "invalid characters in salt" error. https://github.com/linux-system-roles/ha_cluster/blob/master/tasks/install-and-configure-packages.yml#L52 must do replace('_', 'x') as well.
We should also examine which other symbols that might appear in hostnames are not allowed in a salt.
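For illustration, here is a minimal sketch of how the salt could be sanitized for any disallowed character, not just '_' (the filter chain is an assumption about how the task derives its salt from the hostname, not the role's exact code):

# Hypothetical sketch: keep only characters that crypt(3) accepts in a salt.
- name: Set hacluster password
  ansible.builtin.user:
    name: hacluster
    password: "{{ ha_cluster_hacluster_password | string | password_hash('sha512', salt) }}"
  vars:
    # assumed: the salt is derived from the hostname, as in the role's task
    salt: "{{ ansible_facts['hostname'] | regex_replace('[^a-zA-Z0-9./]', 'x') }}"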
As part of the conscious language project, the master branch is to be renamed to the main branch.
Here are the instructions.
If you use the gh CLI (highly recommended), you can use this to check which repos need to be updated:
gh repo list linux-system-roles -L 100 --json name,defaultBranchRef --source | \
jq --raw-output '.[] | select(.defaultBranchRef.name == "master") | .name'
Thanks.
The ha_cluster_manage_firewall: true attribute does not alter the firewalld configuration for the qdevice node.
In tasks/main.yml, in the task "Install and configure HA cluster", the firewall.yml inclusion only applies when ha_cluster_cluster_present is true, which is always false for the qdevice node.
It looks like the firewall.yml inclusion should also be added in tasks/shell_pcs/pcs-qnetd.yml; see the sketch after the output below.
Available firewalld services on the qdevice host after running the role:
[root@qdevice ~]#
[root@qdevice ~]# firewall-cmd --list-services
cockpit dhcpv6-client ssh
[root@qdevice ~]#
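A minimal sketch of the suggested change in tasks/shell_pcs/pcs-qnetd.yml (the task name, the condition, and the include path resolution are assumptions, not the role's actual code):

# Hypothetical sketch: open the firewall on the qnetd host as well when
# ha_cluster_manage_firewall is enabled, mirroring the cluster-node code path.
- name: Manage firewall on the qnetd host
  ansible.builtin.include_tasks: firewall.yml  # assumed to resolve against the role's tasks/ directory
  when: ha_cluster_manage_firewall | bool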
doc: semantic specificity
Looking at the README with fresh eyes, I notice the word "Support" is used frequently. Given this is a community repository, I would anticipate there is no (paid-for) support of the code, and the more accurate word (particularly when translated) would be "Compatibility" / "Compatible".
Example:
See sbd(8) man page, section 'Configuration via environment' for their description.
Supported options are:
This is of course cloud specific, so I am not sure if you want to implement it here, as I don't see any cloud platform specific code.
The RedHat_9 var file should be enhanced with __ha_cluster_fullstack_node_packages from the main RedHat var file to cover the newly split package on RHEL 9.
Package to add: resource-agents-cloud
It contains:
/usr/lib/ocf/resource.d/heartbeat/aliyun-vpc-move-ip
/usr/lib/ocf/resource.d/heartbeat/aws-vpc-move-ip
/usr/lib/ocf/resource.d/heartbeat/aws-vpc-route53
/usr/lib/ocf/resource.d/heartbeat/awseip
/usr/lib/ocf/resource.d/heartbeat/awsvip
/usr/lib/ocf/resource.d/heartbeat/azure-events
/usr/lib/ocf/resource.d/heartbeat/azure-events-az
/usr/lib/ocf/resource.d/heartbeat/azure-lb
/usr/lib/ocf/resource.d/heartbeat/gcp-ilb
/usr/lib/ocf/resource.d/heartbeat/gcp-pd-move
/usr/lib/ocf/resource.d/heartbeat/gcp-vpc-move-route
/usr/lib/ocf/resource.d/heartbeat/gcp-vpc-move-vip
Explanation:
RHEL 9 changed the resource-agents package and it no longer contains the cloud resource agents. They are now in resource-agents-cloud.
Example for RHEL 9.2 on AWS:
resource-agents-4.10.0-34.el9_2.2.x86_64
[root@rhel9ha0 ~]# rpm -ql resource-agents | grep aws-vpc-move-ip
/usr/share/man/man7/ocf_heartbeat_aws-vpc-move-ip.7.gz
[root@rhel9ha0 ~]#
resource-agents-cloud-4.10.0-34.el9_2.2.x86_64
[root@rhel9ha0 ~]# rpm -ql resource-agents-cloud | grep aws-vpc-move-ip
/usr/lib/ocf/resource.d/heartbeat/aws-vpc-move-ip
/usr/share/man/man7/ocf_heartbeat_aws-vpc-move-ip.7.gz
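A minimal sketch of what the vars/RedHat_9.yml addition could look like (the base package list mirrors what I assume the main RedHat vars file defines and may differ from the role's actual contents):

# vars/RedHat_9.yml (hypothetical sketch)
__ha_cluster_fullstack_node_packages:
  - corosync
  - libknet1-plugins-all
  - pacemaker
  - resource-agents
  - resource-agents-cloud  # cloud agents were split out of resource-agents on RHEL 9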
Hello Team,
I am working on sap-linuxlab (community.sap_install) and our plan is to make sure that the role sap_ha_pacemaker_cluster, which consumes fedora.linux_system_roles, works correctly on SUSE systems.
I noticed that the groundwork for adopting non-pcs steps was already done thanks to Sean in #122, so adoption would consist of:
There are a few things I wanted to ask for clarification on before proceeding with any changes in the fork:
Hello, we have a project to eliminate non-inclusive language from the linux-system-roles.
Running the utility woke (now supported in tox; please install the latest tox-lsr and run tox -e woke), two non-inclusive words are reported: dummy and slave.
Can we replace them with more appropriate words? For the word dummy, placeholder and sample are recommended. For slave, the candidates are secondary, replica, responder, device, worker, proxy, and performer. It looks to me that replacing dummy is straightforward, but I am not certain about slave. I wonder if it is doable? Or if we replace it, does it break the HA cluster? Thanks!
Issue:
pcs corosync commands currently create the corosync file with predefined default values that are not exposed through variables.
Example: logging
https://github.com/linux-system-roles/ha_cluster/blob/main/tasks/shell_pcs/pcs-cluster-setup-pcs-0.10.yml does not specify logging, but it is created by default.
Reason:
ha_cluster_totem is exposed as a variable, but there is no equivalent for other corosync sections such as logging.
Resolution:
It would be helpful if variables (e.g. ha_cluster_logging) were added to the corosync setup tasks and exposed in defaults and the README.
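A minimal sketch of what such a variable could look like (the name ha_cluster_logging and its structure are hypothetical, mirroring how ha_cluster_totem is defined):

# Hypothetical variable, mirroring the structure of ha_cluster_totem;
# it would default to an empty dict in defaults/main.yml.
ha_cluster_logging:
  options:
    - name: debug
      value: "off"
    - name: to_logfile
      value: "yes"
    - name: logfile
      value: /var/log/cluster/corosync.log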
This new task is failing when run in check-mode against a cluster that was configured using a previous LSR version:
ha_cluster/tasks/pcs-qnetd.yml, line 3 in bea2773
Collection Version
------------------------- -------
fedora.linux_system_roles 1.30.5
Setup
Cluster built using a previous version of the LSR (no qnetd support).
Dry-run using --check against the existing cluster with the newer LSR (no change of input parameters).
The dry-run fails because the task is designed to force an actual configuration change, even for a check, and this fails due to the missing corosync-qnetd package.
Issue
Is it really desired to force-remove the qnetd config even during a --check run?
As a user I am surprised by this behavior, as I would not expect any changes on the systems when explicitly running the playbook in check mode.
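For illustration, a minimal sketch of the pattern being questioned (the task body is an assumption; only the file and commit reference come from the report): a task that pins check_mode: false still executes its command during a --check run.

# Hypothetical illustration: check_mode: false forces real execution even under --check.
- name: Remove qnetd configuration
  ansible.builtin.command:
    cmd: pcs qdevice destroy net
  check_mode: false
  changed_when: true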
By design, ha_cluster allows for additional platforms/versions. The first Ansible Task executed within the Ansible Role is set_vars.yml, which imports different variables for OS major.minor versions.
An extension of this principle, to also include/execute a different set of Ansible Tasks for different high-level command interfaces to Linux Pacemaker, would be valuable for future compatibility.
Add a variable _ha_cluster_pacemaker_shell to all existing /vars/<<os_version>>.yml files. For example, /vars/RHEL9.yml:
_ha_cluster_pacemaker_shell: pcs
Move the existing pcs-specific task files into a /tasks/pcs directory, and create stub code for the crmsh shell in future. For example, directory re-structure:
/tasks/pcs/cluster-start-and-reload.yml etc.
/tasks/crmsh/cluster-start-and-reload.yml etc.
Move the task files that use common binaries such as cibadmin and crm_mon into a separate tasks subdirectory. For example, directory re-structure:
/tasks/common/create-and-push-cib.yml
/tasks/common/cluster-start-and-reload.yml
/tasks/main.yml
---
- name: Set platform/version specific variables
  include_tasks: set_vars.yml
...
...
- name: Start the cluster and reload corosync.conf
  include_tasks: "{{ _ha_cluster_pacemaker_shell }}/cluster-start-and-reload.yml"
...
...
This keeps the existing /tasks directory, but reduces some of its contents so it is easier to see the key/controlling Ansible Tasks files and what they execute.
pcs and crmsh
Looking in the repository, there are a limited number of pcs shell commands in use. See the equivalent below for crmsh:
# PCS Shell               # CRMSH
pcs cluster               crm node
pcs constraint <type>     crm configure <type>
pcs property              crm configure property
pcs qdevice               crm cluster init qdevice
pcs quorum                corosync-quorumtool / corosync-qnetd-tool
pcs resource              crm ra
pcs status                crm status
pcs stonith               crm ra
Refs:
In tasks/cluster-enable-disable.yml there is a comment saying that SBD is always disabled because it is not supported yet:
# The role does not support configuring SBD yet, therefore we always disable it
Are there plans to support it in the future? I am using this role to set up RHEL HA, and the official documentation mentions that SBD is supported by Red Hat.
The Ansible task 'Create a corosync.conf file content using pcs-0.10' fails on re-run using RHEL 8.4, with pcs cluster setup reporting "option --overwrite not recognized".
This option does not exist; is this supposed to be --force?
Version:
[root@host-p ~]# pcs --version
0.10.4
TASK [fedora.linux_system_roles.ha_cluster : Create a corosync.conf file content using pcs-0.10] *********
fatal: [host-p]: FAILED! =>
{
"changed": true,
"cmd": [
"pcs",
"cluster",
"setup",
"--corosync_conf",
"/tmp/ansible.5vjb9txg_ha_cluster_corosync_conf",
"--overwrite",
"--",
"clusterhdb",
"host-p",
"host-s"
],
"delta": "0:00:00.299788",
"end": "2023-04-27 23:38:30.909534",
"msg": "non-zero return code",
"rc": 1,
"start": "2023-04-27 23:38:30.609746",
"stderr": "",
"stderr_lines": [],
"stdout_lines": [
"option --overwrite not recognized"
]
}
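A minimal sketch of the kind of guard that would avoid this on older pcs (all variable names and the version cutoff are placeholders; the role's real capability detection may differ):

# Hypothetical: only pass --overwrite when the installed pcs understands it.
- name: Create a corosync.conf file content using pcs-0.10
  ansible.builtin.command:
    cmd: >-
      pcs cluster setup --corosync_conf {{ corosync_conf_tempfile }}
      {{ '--overwrite' if pcs_version is version(pcs_overwrite_min_version, '>=') else '' }}
      -- {{ ha_cluster_cluster_name }} {{ node_names | join(' ') }}
  changed_when: true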
I have been trying to set up a RHEL HA cluster using the ha_cluster system role, but I have not found a way to define cluster members' attributes, which is required to define different cluster constraints.
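A hedged sketch of what such an interface might look like (the variable name ha_cluster_node_options and its structure are hypothetical; the role may expose node attributes differently or not at all):

# Hypothetical variable: per-node attributes that location constraints could reference.
ha_cluster_node_options:
  - node_name: node1.example.com
    attributes:
      - attrs:
          - name: datacenter
            value: dc1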