ansible-role-k3s's Introduction

Ansible Role: k3s (v3.x)

Ansible role for installing K3S ("Lightweight Kubernetes") as either a standalone server or cluster.

Help Wanted!

Hi! 👋 @xanmanning is looking for a new maintainer to work on this Ansible role. This is because I don't have as much free time any more and I no longer write Ansible regularly as part of my day job. If you're interested, get in touch.

Release notes

Please see Releases and CHANGELOG.md.

Requirements

The host you're running Ansible from requires the following Python dependencies:

  • python >= 3.6.0 - See Notes below.
  • ansible >= 2.9.16 or ansible-base >= 2.10.4

You can install dependencies using the requirements.txt file in this repository: pip3 install -r requirements.txt.

This role has been tested against the following Linux Distributions:

  • Alpine Linux
  • Amazon Linux 2
  • Archlinux
  • CentOS 8
  • Debian 11
  • Fedora 31
  • Fedora 32
  • Fedora 33
  • openSUSE Leap 15
  • RockyLinux 8
  • Ubuntu 20.04 LTS

⚠️ The v3 releases of this role only support k3s >= v1.19. For k3s < v1.19, please consider updating or use the v1.x releases of this role.

Before upgrading, see CHANGELOG for notifications of breaking changes.

Role Variables

Since K3s v1.19.1+k3s1 you can now configure K3s using a configuration file rather than environment variables or command line arguments. The v2 release of this role has moved to the configuration file method rather than populating a systemd unit file with command-line arguments. There may be exceptions that are defined in Global/Cluster Variables, however you will mostly be configuring k3s by configuration files using the k3s_server and k3s_agent variables.

See "Server (Control Plane) Configuration" and "Agent (Worker) Configuraion" below.

Global/Cluster Variables

Below are variables that are set against all of the play hosts for environment consistency. These are generally cluster-level configuration.

Variable Description Default Value
k3s_state State of k3s: installed, started, stopped, downloaded, uninstalled, validated. installed
k3s_release_version Use a specific version of k3s, e.g. v0.2.0. Specify false for stable. false
k3s_airgap Boolean to enable air-gapped installations false
k3s_config_file Location of the k3s configuration file. /etc/rancher/k3s/config.yaml
k3s_build_cluster When multiple play hosts are available, attempt to cluster. Read notes below. true
k3s_registration_address Fixed registration address for nodes. IP or FQDN. NULL
k3s_github_url Set the GitHub URL to install k3s from. https://github.com/k3s-io/k3s
k3s_api_url URL for K3S updates API. https://update.k3s.io
k3s_install_dir Installation directory for k3s. /usr/local/bin
k3s_install_hard_links Install using hard links rather than symbolic links. false
k3s_server_config_yaml_d_files A flat list of templates to supplement the k3s_server configuration. []
k3s_agent_config_yaml_d_files A flat list of templates to supplement the k3s_agent configuration. []
k3s_server_manifests_urls A list of URLs to deploy on the primary control plane. Read notes below. []
k3s_server_manifests_templates A flat list of templates to deploy on the primary control plane. []
k3s_server_pod_manifests_urls A list of URLs for installing static pod manifests on the control plane. Read notes below. []
k3s_server_pod_manifests_templates A flat list of templates for installing static pod manifests on the control plane. []
k3s_use_experimental Allow the use of experimental features in k3s. false
k3s_use_unsupported_config Allow the use of unsupported configurations in k3s. false
k3s_etcd_datastore Enable etcd embedded datastore (read notes below). false
k3s_debug Enable debug logging on the k3s service. false
k3s_registries Registries configuration file content. { mirrors: {}, configs:{} }
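
For reference, these cluster-level variables can be combined in group_vars; below is a minimal sketch (values are illustrative, and the registration address is a hypothetical load balancer name):

k3s_release_version: v1.19.3+k3s1
k3s_registration_address: lb.k3s.example.com   # hypothetical fixed registration address
k3s_etcd_datastore: true
k3s_install_hard_links: true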

K3S Service Configuration

The below variables change how and when the systemd service unit file for K3S is run. Use this with caution, please refer to the systemd documentation for more information.

Variable Description Default Value
k3s_start_on_boot Start k3s on boot. true
k3s_service_requires List of systemd units required by the k3s service unit. []
k3s_service_wants List of systemd units "wanted" by the k3s service (weaker than "requires"). []*
k3s_service_before Start k3s before a defined list of systemd units. []
k3s_service_after Start k3s after a defined list of systemd units. []*
k3s_service_env_vars Dictionary of environment variables to use within systemd unit file. {}
k3s_service_env_file Location on the host of an environment file to include. false**

* The systemd unit template always specifies network-online.target for wants and after.

** The file must already exist on the target host; this role will not create or manage it. You can manage this file outside of the role with pre-tasks in your Ansible playbook.
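
As an illustration, these service variables could be combined as follows to order k3s after remote filesystems and pass proxy settings into the service unit (all values below are assumptions, not defaults):

k3s_service_wants:
  - remote-fs.target
k3s_service_after:
  - remote-fs.target
k3s_service_env_vars:
  HTTP_PROXY: "http://proxy.example.com:3128"
  NO_PROXY: "127.0.0.1,localhost,10.0.0.0/8"
k3s_service_env_file: /etc/k3s.env   # must already exist on the target host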

Group/Host Variables

Below are variables that are set against individual or groups of play hosts. Typically you'd set these at group level for the control plane or worker nodes.

Variable Description Default Value
k3s_control_node Specify if a host (or host group) is part of the control plane. false (role will automatically delegate a node)
k3s_server Server (control plane) configuration, see notes below. {}
k3s_agent Agent (worker) configuration, see notes below. {}
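
For example, in an inventory you might set these per group (group names and values below are hypothetical):

# group_vars/control_plane.yml
k3s_control_node: true
k3s_server:
  disable:
    - traefik

# group_vars/workers.yml
k3s_control_node: false
k3s_agent:
  node-label:
    - "node-type=worker"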

Server (Control Plane) Configuration

The control plane is configured with the k3s_server dict variable. Please refer to the below documentation for configuration options:

https://rancher.com/docs/k3s/latest/en/installation/install-options/server-config/

The k3s_server dictionary variable will contain flags from the above (removing the -- prefix). Below is an example:

k3s_server:
  datastore-endpoint: postgres://postgres:verybadpass@database:5432/postgres?sslmode=disable
  cluster-cidr: 172.20.0.0/16
  flannel-backend: 'none'  # This needs to be in quotes
  disable:
    - traefik
    - coredns

Alternatively, you can create a .yaml file and read it into the k3s_server variable, as per the example below:

k3s_server: "{{ lookup('file', 'path/to/k3s_server.yml') | from_yaml }}"

Check out the Documentation for example configuration.

Agent (Worker) Configuration

Workers are configured with the k3s_agent dict variable. Please refer to the below documentation for configuration options:

https://rancher.com/docs/k3s/latest/en/installation/install-options/agent-config

The k3s_agent dictionary variable will contain flags from the above (removing the -- prefix). Below is an example:

k3s_agent:
  with-node-id: true
  node-label:
    - "foo=bar"
    - "hello=world"

Alternatively, you can create a .yaml file and read it into the k3s_agent variable, as per the example below:

k3s_agent: "{{ lookup('file', 'path/to/k3s_agent.yml') | from_yaml }}"

Check out the Documentation for example configuration.

Ansible Controller Configuration Variables

The below variables are used to change the way the role executes in Ansible, particularly with regard to privilege escalation.

Variable Description Default Value
k3s_skip_validation Skip all tasks that validate configuration. false
k3s_skip_env_checks Skip all tasks that check environment configuration. false
k3s_skip_post_checks Skip all tasks that check post execution state. false
k3s_become Escalate user privileges for tasks that need root permissions. false

Important note about Python

From v3 of this role, Python 3 is required on the target system as well as on the Ansible controller. This is to ensure consistent behaviour for Ansible tasks as Python 2 is now EOL.

If target systems have both Python 2 and Python 3 installed, it is most likely that Python 2 will be selected by default. To ensure Python 3 is used on a target with both versions of Python, ensure ansible_python_interpreter is set in your inventory. Below is an example inventory:

---

k3s_cluster:
  hosts:
    kube-0:
      ansible_user: ansible
      ansible_host: 10.10.9.2
      ansible_python_interpreter: /usr/bin/python3
    kube-1:
      ansible_user: ansible
      ansible_host: 10.10.9.3
      ansible_python_interpreter: /usr/bin/python3
    kube-2:
      ansible_user: ansible
      ansible_host: 10.10.9.4
      ansible_python_interpreter: /usr/bin/python3

Important note about k3s_release_version

If you do not set a k3s_release_version, the latest version from the stable channel of k3s will be installed. If you are developing against a specific version of k3s, you must ensure this is set in your Ansible configuration, e.g.:

k3s_release_version: v1.19.3+k3s1

It is also possible to install specific K3s "channels"; below are some examples for k3s_release_version:

k3s_release_version: false             # defaults to 'stable' channel
k3s_release_version: stable            # latest 'stable' release
k3s_release_version: testing           # latest 'testing' release
k3s_release_version: v1.19             # latest 'v1.19' release
k3s_release_version: v1.19.3+k3s3      # specific release

# Specific commit
# CAUTION - only used for testing - must be 40 characters
k3s_release_version: 48ed47c4a3e420fa71c18b2ec97f13dc0659778b

Important note about k3s_install_hard_links

If you are using the system-upgrade-controller you will need to use hard links rather than symbolic links, as the controller will not be able to follow symbolic links. This option has been added but is not enabled by default, to avoid breaking existing installations.

To enable the use of hard links, ensure k3s_install_hard_links is set to true.

k3s_install_hard_links: true

The result of this can be seen by running the following in k3s_install_dir:

ls -larthi | grep -E 'k3s|ctr|ctl' | grep -vE ".sh$" | sort

Symbolic Links:

[root@node1 bin]# ls -larthi | grep -E 'k3s|ctr|ctl' | grep -vE ".sh$" | sort
3277823 -rwxr-xr-x 1 root root  52M Jul 25 12:50 k3s-v1.18.4+k3s1
3279565 lrwxrwxrwx 1 root root   31 Jul 25 12:52 k3s -> /usr/local/bin/k3s-v1.18.6+k3s1
3279644 -rwxr-xr-x 1 root root  51M Jul 25 12:52 k3s-v1.18.6+k3s1
3280079 lrwxrwxrwx 1 root root   31 Jul 25 12:52 ctr -> /usr/local/bin/k3s-v1.18.6+k3s1
3280080 lrwxrwxrwx 1 root root   31 Jul 25 12:52 crictl -> /usr/local/bin/k3s-v1.18.6+k3s1
3280081 lrwxrwxrwx 1 root root   31 Jul 25 12:52 kubectl -> /usr/local/bin/k3s-v1.18.6+k3s1

Hard Links:

[root@node1 bin]# ls -larthi | grep -E 'k3s|ctr|ctl' | grep -vE ".sh$" | sort
3277823 -rwxr-xr-x 1 root root  52M Jul 25 12:50 k3s-v1.18.4+k3s1
3279644 -rwxr-xr-x 5 root root  51M Jul 25 12:52 crictl
3279644 -rwxr-xr-x 5 root root  51M Jul 25 12:52 ctr
3279644 -rwxr-xr-x 5 root root  51M Jul 25 12:52 k3s
3279644 -rwxr-xr-x 5 root root  51M Jul 25 12:52 k3s-v1.18.6+k3s1
3279644 -rwxr-xr-x 5 root root  51M Jul 25 12:52 kubectl

Important note about k3s_build_cluster

If you set k3s_build_cluster to false, this role will install each play host as a standalone node. An example of when you might use this would be when building a large number of standalone IoT devices running K3s. Below is a hypothetical situation where we are to deploy 25 Raspberry Pi devices, each a standalone system and not a cluster of 25 nodes. To do this we'd use a playbook similar to the below:

- hosts: k3s_nodes  # eg. 25 RPi's defined in our inventory.
  vars:
    k3s_build_cluster: false
  roles:
     - xanmanning.k3s

Important note about k3s_control_node and High Availability (HA)

By default, only one host will be defined as a control node by Ansible. If you do not set a host as a control node, this role will automatically delegate the first play host as a control node. This is not suitable for production workloads.

If multiple hosts have k3s_control_node set to true, you must also set datastore-endpoint in k3s_server to the connection string for a MySQL or PostgreSQL database, or for an external etcd cluster, otherwise the play will fail.

If using TLS, the CA, certificate and key need to already be available on the play hosts.

See: High Availability with an External DB
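
A sketch of a TLS-enabled external datastore configuration, assuming the CA and client certificate/key already exist on the play hosts (paths and credentials below are hypothetical):

k3s_server:
  datastore-endpoint: "postgres://k3s:verybadpass@database:5432/k3s?sslmode=verify-full"
  datastore-cafile: /etc/ssl/certs/datastore-ca.pem
  datastore-certfile: /etc/ssl/certs/datastore-client.crt
  datastore-keyfile: /etc/ssl/private/datastore-client.key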

It is also possible, though not supported, to run a single K3s control node with a datastore-endpoint defined. As this is not a typically supported configuration you will need to set k3s_use_unsupported_config to true.

Since K3s v1.19.1 it is possible to use an embedded Etcd as the backend database, and this is done by setting k3s_etcd_datastore to true. The best practice for Etcd is to define at least 3 members to ensure quorum is established. In addition to this, an odd number of members is recommended to ensure a majority in the event of a network partition. If you want to use 2 members or an even number of members, please set k3s_use_unsupported_config to true.
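
A minimal sketch of group variables for a three-node control plane using the embedded etcd datastore (the registration address is a hypothetical VIP or DNS name in front of the control plane):

k3s_etcd_datastore: true
k3s_control_node: true
k3s_registration_address: 10.10.9.10   # hypothetical virtual IP or DNS name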

Important note about k3s_server_manifests_urls and k3s_server_pod_manifests_urls

To deploy server manifests and server pod manifests from a URL, you need to specify a url and optionally a filename (if none is provided, the basename of the URL is used). Below is an example of how to deploy the Tigera operator for Calico and kube-vip.

---

k3s_server_manifests_urls:
  - url: https://docs.projectcalico.org/archive/v3.19/manifests/tigera-operator.yaml
    filename: tigera-operator.yaml

k3s_server_pod_manifests_urls:
  - url: https://raw.githubusercontent.com/kube-vip/kube-vip/main/example/deploy/0.1.4.yaml
    filename: kube-vip.yaml

Important note about k3s_airgap

When deploying k3s in an air-gapped environment you should provide the k3s binary in ./files/. The binary will not be downloaded from GitHub and consequently will not be verified using the provided sha256 sum, nor will the role be able to verify the version that you are running. All associated risks and burdens are assumed by the user in this scenario.
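
A sketch of an air-gapped configuration, assuming the k3s binary has already been placed in ./files/ alongside your playbook:

k3s_airgap: true
k3s_release_version: v1.19.3+k3s1   # informational only; the binary cannot be verified in air-gapped mode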

Dependencies

No dependencies on other roles.

Example Playbooks

Example playbook, single control node running testing channel k3s:

- hosts: k3s_nodes
  vars:
    k3s_release_version: testing
  roles:
     - role: xanmanning.k3s

Example playbook, Highly Available with PostgreSQL database running the latest stable release:

- hosts: k3s_nodes
  vars:
    k3s_registration_address: loadbalancer  # Typically a load balancer.
    k3s_server:
      datastore-endpoint: "postgres://postgres:verybadpass@database:5432/postgres?sslmode=disable"
  pre_tasks:
    - name: Set each node to be a control node
      ansible.builtin.set_fact:
        k3s_control_node: true
      when: inventory_hostname in ['node2', 'node3']
  roles:
    - role: xanmanning.k3s

License

BSD 3-clause

Contributors

Contributions from the community are very welcome, but please read the contribution guidelines before doing so; this will help make things as streamlined as possible.

Also, please check out the awesome list of contributors.

Author Information

Xan Manning


ansible-role-k3s's Issues

Templating error on fresh install in tasks/teardown/drain-and-remove-nodes.yml:3

Hi,

I am trying to setup a fresh cluster with the latest version from galaxy, but I get the following error:

fatal: [kube_node1]: FAILED! => { "msg": "Unexpected templating type error occurred on ({{ k3s_become_for_kubectl | ternary(true, false, k3s_become_for_all) }}): ternary() takes exactly 3 arguments (4 given)" }
fatal: [kube_node2]: FAILED! => { "msg": "Unexpected templating type error occurred on ({{ k3s_become_for_kubectl | ternary(true, false, k3s_become_for_all) }}): ternary() takes exactly 3 arguments (4 given)" }
fatal: [kube_master]: FAILED! => { "msg": "Unexpected templating type error occurred on ({{ k3s_become_for_kubectl | ternary(true, false, k3s_become_for_all) }}): ternary() takes exactly 3 arguments (4 given)" }

My site.yml:

- hosts: all
  roles:
    - { role: xanmanning.k3s }

I call ansible-playbook:
ansible-playbook site.yml -i inventories/stackit-test/hosts -vvv -become

How to upgrade a cluster deployed with ansible-role-k3s playbook?

Hi,

I've deployed a cluster with this Ansible role and I would like to upgrade it to a patch version of Kubernetes.
It's not clear to me how I can do that.

Can you share the list of steps that I need to follow?
I can try them out and add them to the documentation.

I've also noticed the documentation does not cover these steps: https://github.com/PyratLabs/ansible-role-k3s/tree/main/documentation

p.s. I'm using version 1.17.4 of this playbook.
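
One possible approach, assuming the cluster was deployed with a pinned k3s_release_version, is to raise that version and re-run the role; the version below is only an example:

- hosts: k3s_nodes
  vars:
    k3s_release_version: v1.17.6+k3s1   # hypothetical target patch release
  roles:
    - role: xanmanning.k3s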

include full config file including defaults

Summary

The recent change to config-file based configuration removed the default values from this role.
It would be great if we could include all variables, including their defaults, in the role.

K3s installation causes external-ip to show up as internal-ip

[Screenshot: INTERNAL-IP and EXTERNAL-IP columns in kubectl get nodes -o wide]

Facts are set but I have no clue why it's not following them for setting the external-ip:

k3s_node_ip_address: "{{ vars['ansible_'~iface].ipv4.address }}"
k3s_tls_san: "{{ vars['ansible_'~iface].ipv4.address }}"
k3s_bind_address: "{{ vars['ansible_'~iface].ipv4.address }}"
k3s_external_address: "{{ vars['ansible_'~iface].ipv4.address }}"

State per node cannot be changed with group/host variables

Currently, the playbook uses a static import to identify which state should be applied for all nodes, and this cannot be changed on a per-node basis. This means it's not currently possible to remove a single node from the cluster without affecting all other nodes. I'd also prefer to keep all vars in host/group vars, but currently I have to use playbook vars if I want to remove the cluster from nodes.

Ansible documentation

Unable to deploy 1 master when datastore_endpoint is defined

I'm unable to deploy just 1 master when the k3s_datastore_endpoint is defined.

TASK [xanmanning.k3s : Check the conditions when a single controller is defined] 
fatal: [master1.ansible.iptables.sh]: FAILED! => {
    "assertion": "(k3s_controller_count | length == 1) and (k3s_datastore_endpoint is not defined or not k3s_datastore_endpoint) and (k3s_dqlite_datastore is not defined or not k3s_dqlite_datastore)",
    "changed": false,
    "evaluated_to": false,
    "msg": "Control plane configuration is invalid. Please see notes about k3s_control_node and HA in README.md."
}

Is there any reason it fails? I know it's not ideal to have 1 master in a HA setup, but it's just for testing purposes. Or is K3s itself refusing to run when there's only 1 master but a datastore_endpoint is defined?

EDIT: Looking at Rancher's documentation about K3s HA, it does state "K3s requires two or more server nodes for this HA configuration", so it might actually be a K3s limit, but perhaps you know more.

Step "Ensure NODE_TOKEN is captured from control node" fails when k3s_node_data_dir is set

I have some external (GlusterFS) storage on my RPi cluster, and want to use it for my node data to keep IO off the SD cards. I set:

k3s_node_data_dir: /mnt/vol1/data/k3s-node-data/{{ inventory_hostname_short }}

However, the play fails with:

TASK [xanmanning.k3s : Ensure NODE_TOKEN is captured from control node] ***********************************************************************
fatal: [pi3.example.com -> pi1.example.com]: FAILED! => {"changed": false, "msg": "file not found: /var/lib/rancher/k3s/server/node-token"}
fatal: [pi1.example.com -> pi1.example.com]: FAILED! => {"changed": false, "msg": "file not found: /var/lib/rancher/k3s/server/node-token"}
fatal: [pi2.example.com -> pi1.example.com]: FAILED! => {"changed": false, "msg": "file not found: /var/lib/rancher/k3s/server/node-token"}

Presumably the path used here should look at k3s_node_data_dir rather than hard-coding /var/lib/rancher/k3s.
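
A minimal sketch of that suggestion, assuming the role reads the token with the slurp module (the task layout is illustrative, not the role's actual implementation):

- name: Ensure NODE_TOKEN is captured from control node
  ansible.builtin.slurp:
    src: "{{ k3s_node_data_dir | default('/var/lib/rancher/k3s') }}/server/node-token"
  register: k3s_slurped_control_token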

I've hacked around it for now by creating a symlink (/var/lib/rancher/k3s/server/node-token -> /mnt/vol1/data/k3s-node-data/{{ inventory_hostname_short }}/server/node-token) prior to running the playbook (alas it seems my GlusterFS volume does not perform well enough for etcd, so the cluster will not start cleanly, but that's a separate issue).

Apart from this, some items are still being written to /var/lib/rancher/k3s:

/var/lib/rancher/k3s/agent:
total 4
drwxr-xr-x 2 root root 4096 Oct 19 11:21 etc

/var/lib/rancher/k3s/data:
total 8
lrwxrwxrwx 1 root root   90 Oct 19 11:35 current -> /var/lib/rancher/k3s/data/fbfb39c320a19fb3272085217e4af756f65827d57ae252b6b1e4277fa6d47b45
drwxr-xr-x 4 root root 4096 Oct 19 11:35 fbfb39c320a19fb3272085217e4af756f65827d57ae252b6b1e4277fa6d47b45

I also note that the k3s-uninstall.sh script doesn't remove the k3s_node_data_dir if it differs from the default; only /var/lib/rancher/k3s is removed.

This is using v0.14.2 of the role.

Configuration of etcd snapshots required

With k3s v1.19, embedded etcd is now available experimentally, and the option to create snapshots has been added to the k3s binary. This should also be supported in this Ansible role; a sketch follows the flag list below.

   --etcd-disable-snapshots                   (db) Disable automatic etcd snapshots
   --etcd-snapshot-schedule-cron value        (db) Snapshot interval time in cron spec. eg. every 5 hours '* */5 * * *' (default: "0 */12 * * *")
   --etcd-snapshot-retention value            (db) Number of snapshots to retain (default: 5)
   --etcd-snapshot-dir value                  (db) Directory to save db snapshots. (Default location: ${data-dir}/db/snapshots)
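
If support is added, these flags could presumably be passed straight through the existing k3s_server dict (flag names are taken from the k3s help output above, values are illustrative):

k3s_server:
  etcd-snapshot-schedule-cron: "0 */6 * * *"
  etcd-snapshot-retention: 10
  etcd-snapshot-dir: /var/lib/rancher/k3s/server/db/snapshots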

Ensure all instances of 'master' are renamed to more inclusive synonyms

In line with the wider tech community, and as a matter of avoiding any unnecessary discomfort or distress, any references to the legacy term "master" should be removed from this project.

K8s has marked this term as legacy, and GitHub has renamed the default branch to main on new repositories, with mechanisms for migrating existing repositories arriving by the end of this year.

Tasks

The following should be evaluated and (hopefully) completed prior to the v2 release of this role.

  • Remove legacy terms from Documentation
  • Remove legacy terms from comments in YAML
  • Remove legacy terms from task names
  • Remove legacy terms from Ansible variables/facts
    • Ensure any variables that are renamed have rename tasks to avoid breakage to existing deployments.
  • Migrate default branch to main (Low priority, this should happen automatically)

Sources

all nodes joined as masters

While experimenting with an RPi cluster, all nodes joined as master.
How can I join the nodes as non-master?

I am using the master branch of the Ansible role.

nodes

NAME    STATUS   ROLES    AGE   VERSION        INTERNAL-IP      EXTERNAL-IP     OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
rpi01   Ready    master   25h   v1.19.3+k3s1   192.168.10.101   192.168.2.101   Ubuntu 20.04.1 LTS   5.4.0-1022-raspi   containerd://1.4.0-k3s1
rpi07   Ready    master   25h   v1.19.3+k3s1   192.168.10.107   192.168.2.107   Ubuntu 20.04.1 LTS   5.4.0-1022-raspi   containerd://1.4.0-k3s1
rpi02   Ready    master   25h   v1.19.3+k3s1   192.168.10.102   192.168.2.102   Ubuntu 20.04.1 LTS   5.4.0-1022-raspi   containerd://1.4.0-k3s1
rpi06   Ready    master   25h   v1.19.3+k3s1   192.168.10.106   192.168.2.106   Ubuntu 20.04.1 LTS   5.4.0-1022-raspi   containerd://1.4.0-k3s1
rpi04   Ready    master   25h   v1.19.3+k3s1   192.168.10.104   192.168.2.104   Ubuntu 20.04.1 LTS   5.4.0-1022-raspi   containerd://1.4.0-k3s1
rpi05   Ready    master   25h   v1.19.3+k3s1   192.168.10.105   192.168.2.105   Ubuntu 20.04.1 LTS   5.4.0-1022-raspi   containerd://1.4.0-k3s1
rpi03   Ready    master   25h   v1.19.3+k3s1   192.168.10.103   192.168.2.103   Ubuntu 20.04.1 LTS   5.4.0-1022-raspi   containerd://1.4.0-k3s1
zbox    Ready    master   25h   v1.19.3+k3s1   192.168.10.100   192.168.2.100   Arch Linux           5.9.1-arch1-1      containerd://1.4.0-k3s1

inventory

all:
  children:
    k3s_nodes:
      hosts:
        rpi01:
          ansible_host: 192.168.2.101
          k3s_advertise_address: 192.168.10.101
          k3s_control_node: 'true'
          k3s_flannel_interface: vlan2
          k3s_node_external_address: 192.168.2.101
          k3s_node_ip_address: 192.168.10.101
        rpi02:
          ansible_host: 192.168.2.102
          k3s_advertise_address: 192.168.10.102
          k3s_control_node: 'true'
          k3s_flannel_interface: vlan2
          k3s_node_external_address: 192.168.2.102
          k3s_node_ip_address: 192.168.10.102
        rpi03:
          ansible_host: 192.168.2.103
          k3s_advertise_address: 192.168.10.103
          k3s_control_node: 'false'
          k3s_flannel_interface: vlan2
          k3s_node_external_address: 192.168.2.103
          k3s_node_ip_address: 192.168.10.103
        rpi04:
          ansible_host: 192.168.2.104
          k3s_advertise_address: 192.168.10.104
          k3s_control_node: 'false'
          k3s_flannel_interface: vlan2
          k3s_node_external_address: 192.168.2.104
          k3s_node_ip_address: 192.168.10.104
        rpi05:
          ansible_host: 192.168.2.105
          k3s_advertise_address: 192.168.10.105
          k3s_control_node: 'false'
          k3s_flannel_interface: vlan2
          k3s_node_external_address: 192.168.2.105
          k3s_node_ip_address: 192.168.10.105
        rpi06:
          ansible_host: 192.168.2.106
          k3s_advertise_address: 192.168.10.106
          k3s_control_node: 'false'
          k3s_flannel_interface: vlan2
          k3s_node_external_address: 192.168.2.106
          k3s_node_ip_address: 192.168.10.106
        rpi07:
          ansible_host: 192.168.2.107
          k3s_advertise_address: 192.168.10.107
          k3s_control_node: 'false'
          k3s_flannel_interface: vlan2
          k3s_node_external_address: 192.168.2.107
          k3s_node_ip_address: 192.168.10.107
        zbox:
          ansible_host: 192.168.2.100
          k3s_advertise_address: 192.168.10.100
          k3s_control_node: 'true'
          k3s_flannel_interface: vlan2
          k3s_node_external_address: 192.168.2.100
          k3s_node_ip_address: 192.168.10.100
    ungrouped: {}

vars

---
k3s_cluster_state: installed
k3s_release_version: v1.19.3+k3s1
k3s_build_cluster: true
k3s_use_experimental: true
k3s_datastore_endpoint: "mysql://XXXXXXX!@tcp(XXXXXX:3307)/k3s"
k3s_flannel_backend: "host-gw"
k3s_control_node_address: 192.168.10.100

check if kubectl exists error

updated to ansible 2.5.1 and modules version 1.8.1

Attempting to deploy to an existing cluster and getting the following errors:

TASK [xanmanning.k3s : Check if kubectl exists] **************************************************************************************************************************************
Thursday 16 July 2020  12:45:36 -0400 (0:00:00.216)       0:00:54.469 *********
fatal: [h2-0]: FAILED! => {"msg": "Unexpected templating type error occurred on ({{ k3s_become_for_kubectl | ternary(true, false, k3s_become_for_all) }}): ternary() takes exactly 3 arguments (4 given)"}
fatal: [r610-2]: FAILED! => {"msg": "Unexpected templating type error occurred on ({{ k3s_become_for_kubectl | ternary(true, false, k3s_become_for_all) }}): ternary() takes exactly 3 arguments (4 given)"}
fatal: [r610-0]: FAILED! => {"msg": "Unexpected templating type error occurred on ({{ k3s_become_for_kubectl | ternary(true, false, k3s_become_for_all) }}): ternary() takes exactly 3 arguments (4 given)"}
fatal: [r610-1]: FAILED! => {"msg": "Unexpected templating type error occurred on ({{ k3s_become_for_kubectl | ternary(true, false, k3s_become_for_all) }}): ternary() takes exactly 3 arguments (4 given)"}
fatal: [n2-0]: FAILED! => {"msg": "Unexpected templating type error occurred on ({{ k3s_become_for_kubectl | ternary(true, false, k3s_become_for_all) }}): ternary() takes exactly 3 arguments (4 given)"}
fatal: [r720-0]: FAILED! => {"msg": "Unexpected templating type error occurred on ({{ k3s_become_for_kubectl | ternary(true, false, k3s_become_for_all) }}): ternary() takes exactly 3 arguments (4 given)"}

Become should be used for call to k3s-killall.sh

If the role does not run as root on the playbook level, it fails during uninstallation with

TASK [xanmanning.k3s : Run k3s-killall.sh] **********************************************************************************************
fatal: [rpi-node01]: FAILED! => {"changed": false, "cmd": "/usr/local/bin/k3s-killall.sh", "msg": "[Errno 13] Permission denied", "rc": 13}

Automatic upgrade is not working

I have tried to automate k3s upgrade from 1.17.5+k3s1 to 1.17.6+k3s1 by using system-upgrade-controller but it's not upgrading.

I have referred to the link below to upgrade k3s and followed it step by step:
https://github.com/rancher/k3s-upgrade

I have raised the same issue in rancher/system-upgrade-controller. Issue - https://github.com/rancher/system-upgrade-controller/issues/90

After debugging, I found that the plan is looking for the name k3s in /usr/local/bin; however, when installed from the playbook it is installed as k3s-<version> (k3s-v1.17.5+k3s1). Because of this, I am getting the error "/host/usr/local/bin/k3s': No such file or directory".

Can you please suggest what would be the best way to automate the upgrade plan?
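
The k3s_install_hard_links variable described in the README is intended for exactly this system-upgrade-controller scenario, for example:

k3s_install_hard_links: true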

Ability to install k3s & join as agent node?

Hi,

First time Ansible user here. I'm trying to make 1 server install k3s and join an existing cluster using specified k3s_control_node_address, k3s_datastore_endpoint & k3s_control_token variables.

However, when doing so, I get the following message:

TASK [xanmanning.k3s : Check the conditions when a single controller is defined] 

fatal: [192.168.0.51]: FAILED! => {
    "assertion": "(k3s_controller_count | length == 1) and (k3s_datastore_endpoint is not defined or not k3s_datastore_endpoint) and (k3s_dqlite_datastore is not defined or not k3s_dqlite_datastore)",
    "changed": false,
    "evaluated_to": false,
    "msg": "Control plane configuration is invalid. Please see notes about k3s_control_node and HA in README.md."
}

Is it even possible to do what I want? If you need to take a look at my playbooks, they're over at this repository: https://github.com/clrxbl/ansible-ha-k3s

Multiple single node clusters

I am trying to install multiple single-node clusters on all my machines. Unfortunately I can't figure out how to achieve this with this playbook. Each time I run the playbook it will select multiple master nodes.
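
The k3s_build_cluster note in the README covers this scenario; a sketch of a playbook that installs each play host as a standalone node:

- hosts: k3s_nodes
  vars:
    k3s_build_cluster: false
  roles:
    - role: xanmanning.k3s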

k3s_no_metrics_server doesn't work

After many problems with getting metrics-server to work out of the box with k3s, I decided to disable it. I ran sudo /usr/local/bin/k3s-uninstall.sh on all my nodes, set k3s_no_metrics_server: true, and re-ran the playbook.

However, all these files got created anyway, and the metrics-server is running on my clean install.

pi@pi1:~ $ sudo ls -l /var/lib/rancher/k3s/server/manifests/metrics-server/
total 28
-rw------- 1 root root 393 Oct 18 12:42 aggregated-metrics-reader.yaml
-rw------- 1 root root 308 Oct 18 12:42 auth-delegator.yaml
-rw------- 1 root root 329 Oct 18 12:42 auth-reader.yaml
-rw------- 1 root root 298 Oct 18 12:42 metrics-apiservice.yaml
-rw------- 1 root root 955 Oct 18 12:42 metrics-server-deployment.yaml
-rw------- 1 root root 291 Oct 18 12:42 metrics-server-service.yaml
-rw------- 1 root root 517 Oct 18 12:42 resource-reader.yaml

This is v1.14.1 of the role.

Role fails when k3s_bind_address is set

Ansible tries to check a port on the localhost and fails as the port is not available.

TASK [xanmanning.k3s : Wait for control plane to be ready to accept connections] ********************************************************
fatal: [rpi]: FAILED! => {"changed": false, "elapsed": 300, "msg": "Timeout when waiting for 127.0.0.1:6443"}

A solution could be to add host: "{{ k3s_bind_address }}" so it would check the correct IP.
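
A sketch of that suggested change, assuming the check uses ansible.builtin.wait_for (parameters are illustrative):

- name: Wait for control plane to be ready to accept connections
  ansible.builtin.wait_for:
    host: "{{ k3s_bind_address | default('127.0.0.1') }}"
    port: 6443
    state: started
    timeout: 300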

Ensure Rancher references can be removed

Summary

As k3s has now been donated to the CNCF by Rancher, there is a possibility that filesystem paths that contain "rancher" might be removed over time. This role should be able to handle that, ensuring all templated files and tasks refer to a variable that can be changed as/when required.

Issue Type

  • Feature Request

Additional Information

Drop Docker support(?)

Summary

The Kubernetes team is going to deprecate support for Docker after v1.20, and this is likely to filter down into the k3s distribution. Installation of Docker has been handled by this role; however, I have always debated whether or not this should be handled by another, external role (e.g. write my own, use geerlingguy.docker, or whatever).

Removal of Docker support should come with documentation providing information about using alternative container runtimes.

Issue Type

  • Feature Request

User Story

As a developer
I want guidance on container runtimes
So that I can run my application on K3s using a supported method.

Additional Information

Allow links to documentation to be added to validation steps

Summary

During validation steps, it would be good to have the ability to refer to documentation for topics that might require more explanation.

Issue Type

  • Feature Request

User Story

As a New user of this role
I want Validation checks to provide me with as much information as possible
So that I can find all the information I need easily and avoid raising support requests which take time to process.

Support for Helm Chart Config

Summary

k3s v1.19 added support for applying Helm chart config (mainly for Traefik I think). This isn't explicitly supported in this role yet and will likely work in a similar way to kube manifests.

Issue Type

  • Feature Request

User Story

As a Kubernetes Operator
I want to be able to pre-configure Helm charts prior to installation
So that I don't have to take steps to reconfigure Helm deployments post installation.

Additional Information
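
A sketch of how this might look with the existing manifest mechanism, assuming a HelmChartConfig resource (helm.cattle.io/v1) templated via k3s_server_manifests_templates (values are illustrative):

apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    dashboard:
      enabled: true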

Add support for --kube-apiserver-args / audit logging

^

--kube-apiserver-args seems to be necessary for setting an audit log path.
Adding the ability to pass args through for kube-apiserver like kubelet-args would be great, or an option to easily set an audit path.
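
With the config-file approach in v2+ of the role, this can presumably already be expressed through the k3s_server dict (paths and values below are illustrative):

k3s_server:
  kube-apiserver-arg:
    - "audit-log-path=/var/lib/rancher/k3s/server/logs/audit.log"
    - "audit-log-maxage=30"
    - "audit-policy-file=/var/lib/rancher/k3s/server/audit-policy.yaml"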

Issue when running k3s HA masters w/ embedded etcd

There is currently an issue where k3s tries to bootstrap the cluster again when a secondary node is rebooted.

The work around is to remove the --cluster and --token flags in the k3s.service file on the secondary masters.

I do not want to open a PR to resolve this; we should wait until k3s has provided the right fix. See this issue for context: k3s-io/k3s#2249 (comment)

Only opening this issue in case a fix is needed from upstream, and for visibility.

Ensure NODE_TOKEN is captured from control node: fatal

Summary

Running example playbook w/ 3 nodes.

Issue Type

TASK [xanmanning.k3s : Ensure NODE_TOKEN is captured from control node] **********************************************************************
fatal: [node003 -> xx.xx.xx.xx]: FAILED! => {"changed": false, "msg": "file not found: /var/lib/rancher/k3s/server/node-token"}
fatal: [node002 -> xx.xx.xx.xx]: FAILED! => {"changed": false, "msg": "file not found: /var/lib/rancher/k3s/server/node-token"}

  • Bug Report

Controller Environment and Configuration


Steps to Reproduce

Expected Result


Actual Result


--no-deploy option is deprecated

I am trying to disable the default Traefik deployment with the provided option k3s_no_traefik: true, which sets the argument --no-deploy traefik in the systemd unit file. This seems to be deprecated; instead it should be done via the argument --disable traefik.

Doc: https://rancher.com/docs/k3s/latest/en/installation/install-options/server-config/#kubernetes-components

Is it possible to get this in the playbook? Maybe it would be better to provide an option for the user to extend the k3s argument list.

Regards,
Martin

k3s-uninstall.sh always fails

For some reason, k3s-uninstall.sh always finishes with exit code 1, which causes the Ansible role to fail.

illia@rpi:~ $ sudo bash -xe /usr/local/bin/k3s-uninstall.sh
+ set -x
++ id -u
+ '[' 0 -eq 0 ']'
+ /usr/local/bin/k3s-killall.sh
+ [ -s /etc/systemd/system/k3s.service ]
+ basename /etc/systemd/system/k3s.service
+ systemctl stop k3s.service
+ [ -x /etc/init.d/k3s* ]
+ killtree 15950 16055 16096 16102
+ kill -9 15950 15974 16319 16055 16084 16096 16138 16318 16102 16145 16331
+ do_unmount /run/k3s
+ umount /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/f57d162f809cc2dbf65aa6a5cc4e5e3515edece680e332fdf5e2163fd9ecee46/rootfs /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/de600d07c60968558b780b09b3ab07ceadb54a1308c1ddda8de324061b40c045/rootfs /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/d39d7a66b9fb724a8d69666b6836c6f00cf5160c7a42841eed529b9c264496eb/rootfs /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/c327b869a89f0bf2060a4389b2a2c1b6bf6dd73854043cfe8d2cb604ee488abb/rootfs /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/a0693211b9df59b8f8e7f4e32fdc9cfa20ade0f3288f19fe0c7421753949ac55/rootfs /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/632c809445a1a290f4f94c96baa311f57900f9ef88727cbe930eb36a4c1c6037/rootfs /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/6151dddd7b7147f102ebcba231b2b38bdc18f587263204e05742a161e08e546c/rootfs /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/f57d162f809cc2dbf65aa6a5cc4e5e3515edece680e332fdf5e2163fd9ecee46/shm /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/de600d07c60968558b780b09b3ab07ceadb54a1308c1ddda8de324061b40c045/shm /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/632c809445a1a290f4f94c96baa311f57900f9ef88727cbe930eb36a4c1c6037/shm /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/6151dddd7b7147f102ebcba231b2b38bdc18f587263204e05742a161e08e546c/shm
+ do_unmount /var/lib/rancher/k3s
+ do_unmount /var/lib/kubelet/pods
+ umount /var/lib/kubelet/pods/e042fd4a-b557-467d-953a-f41eab1b1a1e/volumes/kubernetes.io~secret/helm-traefik-token-7ll7m /var/lib/kubelet/pods/d925e9d0-8068-4fc9-ab00-c1b28a0cfb13/volumes/kubernetes.io~secret/metrics-server-token-2x979 /var/lib/kubelet/pods/7ac4cbe3-e44a-4c6b-b246-9d1fe03d9b4b/volumes/kubernetes.io~secret/local-path-provisioner-service-account-token-nz9wz /var/lib/kubelet/pods/6818f60c-c9fe-46e8-9a86-835818065a68/volumes/kubernetes.io~secret/coredns-token-7qbqx
+ do_unmount /run/netns/cni-
+ umount /run/netns/cni-b587d96c-c1c8-7f08-19f3-7567bc0d0fba /run/netns/cni-66f81c8a-4745-3a0f-dffa-c2db87311049 /run/netns/cni-60b11ef0-f82f-1e16-519b-73ed773ee63a /run/netns/cni-34333f5e-a2f6-30f7-b6f6-795639fa9782
+ grep master cni0
+ read ignore iface ignore
+ ip link show
+ iface=vethbad18a00
+ [ -z vethbad18a00 ]
+ ip link delete vethbad18a00
+ read ignore iface ignore
+ iface=vethfdc7d97b
+ [ -z vethfdc7d97b ]
+ ip link delete vethfdc7d97b
+ read ignore iface ignore
+ iface=vethfe02aa78
+ [ -z vethfe02aa78 ]
+ ip link delete vethfe02aa78
+ read ignore iface ignore
+ ip link delete cni0
+ ip link delete flannel.1
+ [ -d /var/lib/cni ]
+ rm -rf /var/lib/cni/
+ iptables-save
+ grep -v CNI-
+ grep -v KUBE-
+ iptables-restore
+ which systemctl
/bin/systemctl
+ systemctl disable k3s
Removed /etc/systemd/system/multi-user.target.wants/k3s.service.
+ systemctl reset-failed k3s
+ systemctl daemon-reload
+ which rc-update
+ for unit in /etc/systemd/system/k3s*.service
+ '[' -f /etc/systemd/system/k3s.service ']'
+ rm -f /etc/systemd/system/k3s.service
+ trap remove_uninstall EXIT
+ for cmd in kubectl crictl ctr
+ '[' -L /usr/local/bin/kubectl ']'
+ rm -f /usr/local/bin/kubectl
+ for cmd in kubectl crictl ctr
+ '[' -L /usr/local/bin/crictl ']'
+ rm -f /usr/local/bin/crictl
+ for cmd in kubectl crictl ctr
+ '[' -L /usr/local/bin/ctr ']'
+ rm -f /usr/local/bin/ctr
+ '[' -d /etc/rancher/k3s ']'
+ rm -rf /etc/rancher/k3s
+ '[' -d /var/lib/rancher/k3s ']'
+ rm -rf /var/lib/rancher/k3s
+ '[' -d /var/lib/kubelet ']'
+ rm -rf /var/lib/kubelet
+ for bin in /usr/local/bin/k3s*
+ '[' -f /usr/local/bin/k3s ']'
+ rm -f /usr/local/bin/k3s
+ for bin in /usr/local/bin/k3s*
+ '[' -f /usr/local/bin/k3s-killall.sh ']'
+ rm -f /usr/local/bin/k3s-killall.sh
+ for bin in /usr/local/bin/k3s*
+ '[' -f /usr/local/bin/k3s-uninstall.sh ']'
+ rm -f /usr/local/bin/k3s-uninstall.sh
+ for bin in /usr/local/bin/k3s*
+ '[' -f /usr/local/bin/k3s-v1.17.5+k3s1 ']'
+ rm -f /usr/local/bin/k3s-v1.17.5+k3s1
+ '[' -f /usr/local/bin/k3s-killall.sh ']'
+ remove_uninstall
+ '[' -f /usr/local/bin/k3s-uninstall.sh ']'
illia@rpi:~ $ echo $?
1

Add an option to skip checks

Obviously checking people's configuration is a good idea, particularly for new users; however, experienced users may wish to have Ansible playbook execution complete faster.
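
The role exposes variables for this (see the Ansible Controller Configuration Variables table above), for example:

k3s_skip_validation: true
k3s_skip_env_checks: true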

Destination /usr/local/bin is not writable

Hello,
Thank you for this role.

I am following the most basic setup provided, and I am running into the following error:

Destination /usr/local/bin is not writable

The target is an ubuntu server:

Linux lab01 5.4.0-37-generic #41-Ubuntu SMP Wed Jun 3 18:57:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

this is the specific playbook I'm using:

- name: Rancher nodes
  hosts: rancher
  become: yes
  roles:
    - { role: xanmanning.k3s, k3s_release_version: v1.16.15+k3s1 }

Not sure what I can do about it.

Installation hangs on "Check that all nodes to be ready"

Summary

Installation hangs on "Check that all nodes to be ready" with 3 server nodes and etcd.

Issue Type

  • Bug Report

Controller Environment and Configuration

k3s_become_for_all: yes
k3s_install_hard_links: yes
k3s_etcd_datastore: yes

k3s_server:
  flannel-backend: 'none'
  disable:
    - servicelb
    - traefik
    - local-storage
    - metrics-server

k3s_control_node: yes

sudo kubectl get nodes gives:

NAME                  STATUS     ROLES                       AGE   VERSION
k8s-node-1-25bc998c   NotReady   control-plane,etcd,master   25m   v1.20.2+k3s1
k8s-node-2-778e5b7e   NotReady   control-plane,etcd,master   23m   v1.20.2+k3s1
k8s-node-3-48330a77   NotReady   control-plane,etcd,master   23m   v1.20.2+k3s1

Here is the output of sudo kubectl get pods -A

NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE
kube-system   coredns-854c77959c-7k2cx   0/1     Pending   0          19m

After that I installed a CNI (Cilium) and:

NAME                  STATUS   ROLES                       AGE   VERSION
k8s-node-1-25bc998c   Ready    control-plane,etcd,master   35m   v1.20.2+k3s1
k8s-node-2-778e5b7e   Ready    control-plane,etcd,master   33m   v1.20.2+k3s1
k8s-node-3-48330a77   Ready    control-plane,etcd,master   32m   v1.20.2+k3s1

I think the role isn't detecting correctly that I disabled flannel?

Steps to Reproduce

Run the role with 3 nodes and the configuration as above.

Expected Result

The role doesn't hang.

Actual Result

Stuck on https://github.com/PyratLabs/ansible-role-k3s/blob/main/tasks/validate/state/nodes.yml

README should be more clear about "jmespath"

The README.md should be more clear about which host needs the dependency:

The control host requires the following Python dependencies:

    jmespath >= 0.9.0

"The control host" could be easily mistaken with the control plane.

Possible race condition for dqlite HA database backend

On a SLES cluster (kube-0, kube-1 and kube-2), with the following config:

---

k3s_control_workers: true,
k3s_use_docker: true,
k3s_flannel_interface: "eth1",
k3s_release_version: "v1.17.3+k3s1",
k3s_control_node: true,
k3s_dqlite_datastore: true,
k3s_use_experimental: true

The task xanmanning.k3s : Ensure secondary masters are started fails; journalctl shows the following error:

kube-1:~ # journalctl -u k3s --no-pager
-- Logs begin at Sat 2020-03-07 15:47:49 UTC, end at Sat 2020-03-07 15:55:16 UTC. --
Mar 07 15:54:03 kube-1 systemd[1]: Starting Lightweight Kubernetes...
Mar 07 15:54:03 kube-1 k3s[9407]: time="2020-03-07T15:54:03Z" level=info msg="Preparing data dir /var/lib/rancher/k3s/data/ca752b211ccbacb1b66df2ec0bc203a9511c0ec045ef5566b31c297958d46a3a"
Mar 07 15:54:05 kube-1 k3s[9407]: time="2020-03-07T15:54:05.191150631Z" level=info msg="found ip 10.10.9.3 from iface eth1"
Mar 07 15:54:05 kube-1 k3s[9407]: time="2020-03-07T15:54:05.191235405Z" level=info msg="Starting k3s v1.17.3+k3s1 (5b17a175)"
Mar 07 15:54:05 kube-1 k3s[9407]: time="2020-03-07T15:54:05.334762194Z" level=info msg="Active TLS secret  (ver=) (count 8): map[listener.cattle.io/cn-10.0.2.15:10.0.2.15 listener.cattle.io/cn-10.10.9.3:10.10.9.3 listener.cattle.io/cn-10.43.0.1:10.43.0.1 listener.cattle.io/cn-127.0.0.1:127.0.0.1 listener.cattle.io/cn-kubernetes:kubernetes listener.cattle.io/cn-kubernetes.default:kubernetes.default listener.cattle.io/cn-kubernetes.default.svc.cluster.local:kubernetes.default.svc.cluster.local listener.cattle.io/cn-localhost:localhost listener.cattle.io/hash:b5b92a0a0b0cf119a6eadc6042e91dca5127949199fa2937431508633920bd15]"
Mar 07 15:54:05 kube-1 k3s[9407]: time="2020-03-07T15:54:05.391842383Z" level=info msg="Joining dqlite cluster as address=10.10.9.3:6443, id=180231"
Mar 07 15:54:05 kube-1 k3s[9407]: time="2020-03-07T15:54:05.394193094Z" level=fatal msg="starting kubernetes: preparing server: post join: a configuration change is already in progress (5)"
Mar 07 15:54:05 kube-1 systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Mar 07 15:54:05 kube-1 systemd[1]: Failed to start Lightweight Kubernetes.
Mar 07 15:54:05 kube-1 systemd[1]: k3s.service: Unit entered failed state.
Mar 07 15:54:05 kube-1 systemd[1]: k3s.service: Failed with result 'exit-code'.

It looks like kube-2 is still joining the cluster as kube-1 attempts to join, causing the service to fail.

Suggested fixes / mitigation

Perform a number of retries on this step equal to the number of play_hosts.

Check missing for size of etcd cluster

This probably should have been in the dqlite check as well, but moving on... Etcd requires a minimum of 3 members for a fault tolerance of 1 and to ensure quorum is established. There should also perhaps be a check to ensure odd numbers of cluster members are used to ensure there will always be a majority during a network partition.

https://etcd.io/docs/v3.3.12/faq/

This change isn't required for MySQL/PostgreSQL backends as a HA database service should effectively provide quorum.

v2 is missing

As Ansible is moving towards collections, I've found in the README that the v1 version of this role is limited to the older Ansible 2.9. For the newest 2.10 it's suggested to use the v2 role (should it be a collection?). However, there's no v2 in Ansible Galaxy. The only v2 I can find is a branch here. How should I use this role with Ansible 2.10?

Token should be moved to file

Normally I'd suggest that nodes in a cluster should not have users logging in to them and should be fully managed using IaC practices, but I know that in real life this is very often not the case.

To avoid normal users being able to extract the k3s Shared secret used for joining a cluster, the token should be stored in a file read-only to the user running k3s (typically root) and be referenced with --token-file in the systemd unit file.

SHA256 checksum parsing exploded

I was running this role on Raspbian, and the checksum parsing threw an error after the file was downloaded.

FAILED! => {"changed": false, "cmd": "set -o pipefail && echo \"eb3796aec9fbd10adf38ecf445decf1e954397d3f1e006af251f1e22b7e720ed  k3s-airgap-images-arm.tar\n30cc1a1608c4b34c8c86f5db54bdb1da6713658981893ee18610ce170b8a12dd  k3s-armhf\n\" | grep -E ' k3s-armhf$' | awk '{ print $1 }'", "delta": "0:00:00.015042", "end": "2019-12-11 13:19:53.189650", "msg": "non-zero return code", "rc": 139, "start": "2019-12-11 13:19:53.174608", "stderr": "/bin/bash: line 2:  8110 Done                    echo \"eb3796aec9fbd10adf38ecf445decf1e954397d3f1e006af251f1e22b7e720ed  k3s-airgap-images-arm.tar\n30cc1a1608c4b34c8c86f5db54bdb1da6713658981893ee18610ce170b8a12dd  k3s-armhf\n\"\n      8111 Broken pipe             | grep -E ' k3s-armhf$'\n      8112 Segmentation fault      | awk '{ print $1 }'", "stderr_lines": ["/bin/bash: line 2:  8110 Done                    echo \"eb3796aec9fbd10adf38ecf445decf1e954397d3f1e006af251f1e22b7e720ed  k3s-airgap-images-arm.tar", "30cc1a1608c4b34c8c86f5db54bdb1da6713658981893ee18610ce170b8a12dd  k3s-armhf", "\"", "      8111 Broken pipe             | grep -E ' k3s-armhf$'", "      8112 Segmentation fault      | awk '{ print $1 }'"], "stdout": "", "stdout_lines": []}

I don't know the root cause, but using shell for this might not be the best option so I decided to fix this with some filters.

OS pre-checks: iptables

Summary

The k3s-killall.sh and k3s-uninstall.sh scripts currently require iptables and iptables-persistence (or an equivalent package) to be installed. There are a few distros that have moved over to nftables, which I think has caused some issues, but for now it looks like k3s just supports iptables.

Some pre-checks for iptables packages being installed should exist to help out people running k3s on distros without these installed by default.

Issue Type

  • Feature Request

User Story

As a Kubernetes Operator
I want checks to against the OS for required iptables packages
So that I can ensure K3S will work as expected.

Additional Information

Ensure directories defined in k3s_server and k3s_agent exist

Summary

A number of configuration options that exist in k3s_server and k3s_agent refer to directory paths; tasks to ensure that these directories exist should also be present in this role.

Perhaps a variable to configure whether or not this role should create them.

Issue Type

  • Feature Request

User Story

As a Kubernetes Operator
I want this role to create any required directories for me
So that I don't have to manually create them.

Using Calico Fails to Complete Install

How do you handle the "Wait for all nodes to be ready" phase when not using Flannel? I believe I need to load calico via kubectl, but the install will fail here prior to getting to a point where I can run the kubectl load command:

- name: Wait for all nodes to be ready
  command: "{{ k3s_install_dir }}/kubectl get nodes"
  changed_when: false
  failed_when: false
  register: kubectl_get_nodes_result
  until: kubectl_get_nodes_result.rc == 0
         and kubectl_get_nodes_result.stdout.find("NotReady") == -1
  retries: 30
  delay: 20
  when: k3s_control_node
