budimanjojo / talhelper

A tool to help create Talos Kubernetes clusters

Home Page: https://budimanjojo.github.io/talhelper

License: BSD 3-Clause "New" or "Revised" License

Go 97.15% Nix 1.40% Dockerfile 0.51% Shell 0.93%
kubernetes talos talosctl gitops k8s-at-home

talhelper's Introduction

Talhelper

A helper tool for creating Talos clusters in your GitOps repository.

It's like Kustomize, but for Talos manifest files, with native SOPS support.

About The Project

The main purpose of this tool is to help create Talos clusters the GitOps way. It was inspired by a Python script written by @bjw-s.

You can use this tool to generate Talos config files with the talhelper genconfig command. You can also use it to generate Talos secrets with the talhelper gensecret command.

To get started, read the documentation.

License

Distributed under the BSD-3 License. See LICENSE for more information.

Sponsors

Thanks to those who are sponsoring or have sponsored my projects. It helps and motivates me to do more!

Acknowledgments

  • Talos <- Obviously, for providing the code and everything
  • bjw-s <- The guy who inspired this tool
  • k8s@home <- Best community of people running Kubernetes at home

talhelper's Issues

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

This repository currently has no open or pending branches.

Detected dependencies

github-actions
.github/workflows/golangci-lint.yaml
  • actions/setup-go v4
  • actions/checkout v3
  • golangci/golangci-lint-action v3
.github/workflows/release.yaml
  • actions/setup-go v4
  • actions/checkout v3
  • goreleaser/goreleaser-action v4
.github/workflows/renovate.yaml
  • actions/checkout v3
  • tibdex/github-app-token v1
  • renovatebot/github-action v39.0.3
.github/workflows/test.yaml
  • actions/setup-go v4
  • actions/checkout v3
gomod
go.mod
  • go 1.21
  • github.com/a8m/envsubst v1.4.2
  • github.com/evanphx/json-patch v5.6.0+incompatible
  • github.com/fatih/color v1.15.0
  • github.com/ghodss/yaml v1.0.0
  • github.com/gookit/validate v1.5.1
  • github.com/joho/godotenv v1.5.1
  • github.com/sabhiram/go-gitignore v0.0.0-20210923224102-525f6e181f06@525f6e181f06
  • github.com/siderolabs/crypto v0.4.1
  • github.com/siderolabs/net v0.4.0
  • github.com/siderolabs/talos/pkg/machinery v1.6.0-alpha.0
  • github.com/spf13/cobra v1.7.0
  • go.mozilla.org/sops/v3 v3.7.3
  • golang.org/x/mod v0.12.0
  • gopkg.in/yaml.v3 v3.0.1
  • sigs.k8s.io/yaml v1.3.0
regex
pkg/config/defaults.go
  • siderolabs/talos v1.5.1
pkg/config/defaults.go
  • siderolabs/talos v1.5.1

filename does not contain clustername

it looks like the clustername is missing in the config filenames?

➜  talos talhelper genconfig
generated config for master1 in ./clusterconfig/-master1.yaml
generated config for master2 in ./clusterconfig/-master2.yaml
generated config for master3 in ./clusterconfig/-master3.yaml
generated config for worker1 in ./clusterconfig/-worker1.yaml
generated client config in ./clusterconfig/talosconfig
generated .gitignore file in ./clusterconfig/.gitignore
➜  talos cat talconfig.yaml 
  clustername: mycluster
  talosVersion: v1.5.2
  endpoint: https://10.0.30.80:6443
  nodes:
    - hostname: master1
      ipAddress: 10.0.30.81
      installDisk: /dev/sda
      controlPlane: true
    - hostname: master2
      ipAddress: 10.0.30.82
      installDisk: /dev/sda
      controlPlane: true
    - hostname: master3
      ipAddress: 10.0.30.83
      installDisk: /dev/sda
      controlPlane: true
    - hostname: worker1
      ipAddress: 10.0.30.84
      installDisk: /dev/nvme1
      controlPlane: false
➜  talos  talhelper --version
talhelper version 1.12.0

Add schematic `extraKernelArgs` into `machine.install.extraKernelArgs` automatically

Right now, the extraKernelArgs added to the schematic are not respected if you apply the configuration to an existing cluster. See siderolabs/talos#8008 for more information.

I think it makes sense for talhelper to add schematic.customization.extraKernelArgs to machine.install.extraKernelArgs automatically, so that the kernel arguments are applied to the machine: you should always expect them to be there if you define them in your talconfig.yaml.
But there's a small problem, because you might already have a patch like this in your talconfig.yaml:

nodes:
  - hostname: host1
    ipAddress: 1.2.3.4
    patches:
      - op: add
        path: /machine/install/extraKernelArgs
        value:
          - talos.logging.kernel=udp://1.2.3.4:1234

This patch will overwrite the list, including the kernel arguments that came from the schematic.
That makes the behavior inconsistent, but I think it is what it is.
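
One way to avoid the overwrite would be to merge the two lists instead of replacing one with the other. The following is only a sketch of that idea, not how talhelper actually handles it; the function name mergeKernelArgs is hypothetical, and the argument values are taken from the examples in this issue and in the schematic discussion.

```go
package main

import "fmt"

// mergeKernelArgs is a hypothetical helper: it appends the schematic kernel
// arguments to the ones already set on the install section, skipping
// duplicates and preserving the original order.
func mergeKernelArgs(installArgs, schematicArgs []string) []string {
	seen := make(map[string]bool, len(installArgs)+len(schematicArgs))
	merged := make([]string, 0, len(installArgs)+len(schematicArgs))
	for _, list := range [][]string{installArgs, schematicArgs} {
		for _, arg := range list {
			if seen[arg] {
				continue
			}
			seen[arg] = true
			merged = append(merged, arg)
		}
	}
	return merged
}

func main() {
	fromPatch := []string{"talos.logging.kernel=udp://1.2.3.4:1234"}
	fromSchematic := []string{"net.ifnames=0"}
	fmt.Println(mergeKernelArgs(fromPatch, fromSchematic))
}
```

Whether the user patch or the schematic should win on conflicts is a design decision this sketch does not settle; it only keeps both sets of arguments.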

[feat] add talosctl health to gencommand

talosctl health has quite a specific syntax, requiring worker and controlplane nodes to be specified separately.

This is a great opportunity to use talhelper gencommand, in my honest opinion!
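
As a rough illustration of what such a generator could do, the sketch below splits the nodes by role and joins their IPs into the --control-plane-nodes and --worker-nodes flags of talosctl health. The Node type and the healthCommand function are hypothetical and not part of talhelper; check talosctl health --help for the exact flags in your Talos version.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical, trimmed-down view of a talconfig.yaml node entry.
type Node struct {
	Hostname     string
	IPAddress    string
	ControlPlane bool
}

// healthCommand builds a single `talosctl health` command, listing
// controlplane and worker IPs separately. It returns an empty string when
// there is no controlplane node to talk to.
func healthCommand(nodes []Node) string {
	var controlPlanes, workers []string
	for _, n := range nodes {
		if n.ControlPlane {
			controlPlanes = append(controlPlanes, n.IPAddress)
		} else {
			workers = append(workers, n.IPAddress)
		}
	}
	if len(controlPlanes) == 0 {
		return ""
	}
	args := []string{
		"talosctl", "health",
		"--nodes", controlPlanes[0], // assumption: run the check via the first controlplane node
		"--control-plane-nodes", strings.Join(controlPlanes, ","),
	}
	if len(workers) > 0 {
		args = append(args, "--worker-nodes", strings.Join(workers, ","))
	}
	return strings.Join(args, " ")
}

func main() {
	nodes := []Node{
		{Hostname: "master1", IPAddress: "10.0.30.81", ControlPlane: true},
		{Hostname: "master2", IPAddress: "10.0.30.82", ControlPlane: true},
		{Hostname: "worker1", IPAddress: "10.0.30.84"},
	}
	fmt.Println(healthCommand(nodes))
}
```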

System extensions example

It would be great to have an example of how to enable system extensions like iscsi. I tried the extensions parameter in the node object but got a deprecation warning in talosctl.

What is the correct way to do that?

Add `nodes` to the generated `talosconfig` file

Ref: #344 (comment)

Aside from adding the controlplane nodes' IP addresses to contexts.<context>.endpoints in the generated talosconfig, maybe talhelper should also add all nodes' IP addresses to contexts.<context>.nodes.

This is something that I personally do in my talosconfig, and I can't remember why I decided not to do it in talhelper in the end.

This is what I found so far about having it inside talosconfig in the official docs (https://www.talos.dev/v1.6/learn-more/talosctl/#endpoints-and-nodes):

The node is the target node on which you wish to perform the API call. While you can configure the target node (or even set of target nodes) inside the ’talosctl’ configuration file, it is recommended not to do so, but to explicitly declare the target node(s) using the -n or --nodes command-line parameter.

Prepare for the deprecation of system extensions through config file

With the deprecation of installing system extensions through the config file, I made a change (the easiest way to get it done) that allows a per-node installer image with nodes[].talosImageURL (6c09821).

But I think I should prepare for Talos v1.6, which will most likely remove machine.install.extensions completely and have everybody use their image-factory to generate the installer URL (I need to confirm this with people from Talos to make sure).

My current thought is to change the default installer image URL from ghcr.io/siderolabs/installer to factory.talos.dev/installer/376567988ad370138ad8b2698212367b8edcb69b5fd68c80be1f2ec7d603b4ba, where the hash ID is derived from the customization that we give it. I will add nodes[].schematic.customization so you can do something like this:

nodes:
  - hostname: node1
    schematic:
      customization:
        extraKernelArgs:
          - net.ifnames=0
        systemExtensions:
          officialExtensions:
            - siderolabs/intel-ucode

and the generated manifest will have machine.install.image set to factory.talos.dev/installer/9e8cc193609699825d61c039c7738d81cf29c7b20f2a91d8f5c540511b9ea0b4:v1.5.4.

You can still provide nodes[].talosImageURL to override it with your own image, in case you don't want to use the image-factory or you build your own image.

This also means that nodes[].extensions will be ignored, so this will be a breaking change. I'll make the program print a clear message when it detects it.

Please provide your feedback below before I start working on this.

Update: I decided not to ignore nodes[].extensions yet, so this is not a breaking change. But the program will show a warning if it's being used.

Bug: talhelper fails to parse `talenv.yaml` if the yaml document delimiter is present

It seems that talhelper refuses to parse a talenv.yaml file if the yaml document delimiter is present:

mike@home-ops-devbox:/workspace/home-ops/infra/home-cluster$ ls -l
total 24
drwxr-xr-x 1 mike mike 4096 Nov 21 11:08 clusterconfig
drwxr-xr-x 1 mike mike 4096 Nov 19 18:38 patches
-rw-r--r-- 1 mike mike 3204 Nov 21 11:06 talconfig.yaml
-rw-r--r-- 1 mike mike   97 Nov 21 11:07 talenv.yaml
-rw-r--r-- 1 mike mike 8970 Nov 19 18:38 talsecret.sops.yaml
-rw-r--r-- 1 mike mike 4841 Nov 19 19:42 talsecret.yaml
mike@home-ops-devbox:/workspace/home-ops/infra/home-cluster$ cat talenv.yaml 
---
clusterName: home-ops
clusterEndpointIP: 10.0.10.10
clusterNetworkDomain: k8s.mirceanton.com
mike@home-ops-devbox:/workspace/home-ops/infra/home-cluster$ talhelper genconfig 
2023/11/21 11:09:28 failed to substitute env: variable ${clusterName} not set

But if I remove the --- at the top of the file:

mike@home-ops-devbox:/workspace/home-ops/infra/home-cluster$ cat talenv.yaml 
clusterName: home-ops
clusterEndpointIP: 10.0.10.10
clusterNetworkDomain: k8s.mirceanton.com
mike@home-ops-devbox:/workspace/home-ops/infra/home-cluster$ talhelper genconfig
generated config for nuc.home-ops.k8s.mirceanton.com in ./clusterconfig/home-ops-nuc.home-ops.k8s.mirceanton.com.yaml
generated config for srv.home-ops.k8s.mirceanton.com in ./clusterconfig/home-ops-srv.home-ops.k8s.mirceanton.com.yaml
generated config for minisforum.home-ops.k8s.mirceanton.com in ./clusterconfig/home-ops-minisforum.home-ops.k8s.mirceanton.com.yaml
generated client config in ./clusterconfig/talosconfig
generated .gitignore file in ./clusterconfig/.gitignore
mike@home-ops-devbox:/workspace/home-ops/infra/home-cluster$ 

This is not really a problem if the file is sops-encrypted, as sops automatically removes that delimiter, but it threw me off a bit initially.
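
A possible workaround on the parsing side would be to drop a leading document start marker before the file is handed to the env parser. The sketch below illustrates that idea only; how talhelper actually reads talenv.yaml is not shown in this issue, and the function name stripDocumentStart is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// stripDocumentStart removes a leading YAML document start marker ("---"),
// ignoring blank lines before it, so the remaining lines can be passed to a
// simple key: value parser. Content without a leading marker is returned
// unchanged.
func stripDocumentStart(content string) string {
	lines := strings.Split(content, "\n")
	for i, line := range lines {
		trimmed := strings.TrimSpace(line)
		if trimmed == "" {
			continue // skip leading blank lines
		}
		if trimmed == "---" {
			return strings.Join(lines[i+1:], "\n")
		}
		break // the first meaningful line is not a marker
	}
	return content
}

func main() {
	env := "---\nclusterName: home-ops\nclusterEndpointIP: 10.0.10.10\n"
	fmt.Print(stripDocumentStart(env))
}
```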

Simplify applying the generated configuration

Currently, the workflow to bootstrap a Talos cluster using talhelper is to generate the per-node configuration files using talhelper genconfig and then manually write the talosctl apply commands, pointing each one at the right config file and the right node.

Given that the talconfig.yaml already contains all the required information in order to apply the config as well, I think that talhelper should have a subcommand which simplifies this process.

This could either be a command that runs the talosctl apply under the hood or a command that generates the talosctl apply command so we can pipe it into bash.

Based on the discussions in issue #211 and in the k8s@home discord, we decided that if this is to be implemented, it will most likely be the 2nd option, allowing us to pipe said output.

Expected workflow:

With the following talconfig.yaml:

---
clusterName: home-cluster
talosVersion: v1.5.5
kubernetesVersion: v1.28.2
nodes:
  - hostname: cp-01.dev.k8s.mirceanton.com
    controlPlane: true
    ipAddress: 10.0.10

I should be able to do something along the lines of:

root@devbox:/workspace# talhelper gencommand --apply
talosctl apply-config --talosconfig ./clusterconfig/talosconfig --nodes 10.0.10.195 --file ./clusterconfig/home-cluster-cp-01.dev.k8s.mirceanton.com.yaml --insecure;

Thus allowing me to do something like this to bootstrap my cluster:

talhelper genconfig;
talhelper gencommand --apply | bash

For a single node this may seem trivial, but for clusters with multiple nodes, it easily becomes a quality of life improvement!
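
To make the expected output concrete, here is a minimal sketch of how such commands could be assembled from the cluster name, the output directory and the node list. The Node type and applyCommands function are hypothetical; the command format simply follows the example output above, including the <clusterName>-<hostname>.yaml file naming.

```go
package main

import "fmt"

// Node is a hypothetical, trimmed-down view of a talconfig.yaml node entry.
type Node struct {
	Hostname  string
	IPAddress string
}

// applyCommands returns one `talosctl apply-config` command per node, each
// pointing at that node's generated config file inside outDir.
func applyCommands(clusterName, outDir string, nodes []Node) []string {
	cmds := make([]string, 0, len(nodes))
	for _, n := range nodes {
		cmds = append(cmds, fmt.Sprintf(
			"talosctl apply-config --talosconfig %s/talosconfig --nodes %s --file %s/%s-%s.yaml --insecure;",
			outDir, n.IPAddress, outDir, clusterName, n.Hostname))
	}
	return cmds
}

func main() {
	nodes := []Node{{Hostname: "cp-01.dev.k8s.mirceanton.com", IPAddress: "10.0.10.195"}}
	for _, cmd := range applyCommands("home-cluster", "./clusterconfig", nodes) {
		fmt.Println(cmd)
	}
}
```

Printing one command per line keeps the output safe to pipe into bash, which is the workflow described above.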

[Feature request] gencommand for (soft) resetting nodes

Hi @budimanjojo,

Still loving your tool very much!
I would like to pitch a feature request: generating commands to reset my nodes. Personally, I like to reset my nodes back to maintenance state so I can redeploy quickly. I do that by wiping only STATE and EPHEMERAL and rebooting the node afterwards.
I choose not to wait for completion and to do an ungraceful reset (since I'm nuking the whole cluster).

Example:
talosctl reset --reboot --system-labels-to-wipe STATE --system-labels-to-wipe EPHEMERAL --graceful=false --wait=false --context talos-dev -n 10.0.88.1

However, I currently need to do it node by node in a script. I would rather use something like:
talhelper gencommand reset

Get the idea? What do you think?
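
For illustration, a generator for this could look roughly like the sketch below. It reuses the exact flags from the example command above; the resetCommands function is hypothetical, and the second IP address in main is made up for the demo.

```go
package main

import (
	"fmt"
	"strings"
)

// resetCommands returns one `talosctl reset` command per node IP, wiping only
// the given system partitions. The flags mirror the example in this issue.
func resetCommands(context string, labels, nodeIPs []string) []string {
	var wipe strings.Builder
	for _, label := range labels {
		wipe.WriteString(" --system-labels-to-wipe " + label)
	}
	cmds := make([]string, 0, len(nodeIPs))
	for _, ip := range nodeIPs {
		cmds = append(cmds, fmt.Sprintf(
			"talosctl reset --reboot%s --graceful=false --wait=false --context %s -n %s",
			wipe.String(), context, ip))
	}
	return cmds
}

func main() {
	labels := []string{"STATE", "EPHEMERAL"}
	ips := []string{"10.0.88.1", "10.0.88.2"} // 10.0.88.2 is a made-up second node
	for _, cmd := range resetCommands("talos-dev", labels, ips) {
		fmt.Println(cmd)
	}
}
```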

talhelper genurl -n only works for ip address

talhelper genurl installer returns a list of URLs with hostnames, but using -n with one of those hostnames doesn't work; it seems to expect the IP of the node. Could it work for the hostname as well?
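
Supporting both forms could be as simple as matching the -n value against either field. This is a sketch under that assumption; the Node type and findNode function are hypothetical, not talhelper's actual lookup code.

```go
package main

import "fmt"

// Node is a hypothetical, trimmed-down view of a talconfig.yaml node entry.
type Node struct {
	Hostname  string
	IPAddress string
}

// findNode returns the first node whose hostname or IP address equals the
// selector, and reports whether a match was found.
func findNode(nodes []Node, selector string) (Node, bool) {
	for _, n := range nodes {
		if n.Hostname == selector || n.IPAddress == selector {
			return n, true
		}
	}
	return Node{}, false
}

func main() {
	nodes := []Node{
		{Hostname: "master1", IPAddress: "10.0.30.81"},
		{Hostname: "worker1", IPAddress: "10.0.30.84"},
	}
	for _, selector := range []string{"worker1", "10.0.30.81", "unknown"} {
		node, ok := findNode(nodes, selector)
		fmt.Println(selector, ok, node.Hostname)
	}
}
```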

Support generating single file for all nodes

Currently, the IP address of each controlplane node is appended to the generated talosconfig file.
And the hostname of each node is used to set machine.network.hostname in the generated node config file.

There's a use case where you want to have one config file for all your nodes, e.g. you're using a DHCP server to manage the hostnames of your nodes and PXE boot to load Talos onto them.
Combined with all your nodes being identical hardware-wise, this is something that has been done by @onedr0p (https://github.com/onedr0p/home-ops/tree/cd103c4543cd177ca6e0d6ba06d4a420cef42834/kubernetes/main/bootstrap).

To support this use case, I'm thinking about adding overrideClientEndpoints (defaulting to an empty slice so as not to break current behavior) so you can provide your own endpoints in the generated talosconfig.
I would also add a per-node ignoreHostname (defaulting to false so as not to break current behavior) so there's no node-specific key in the generated config file.

This will allow you to do something like this:

  • You want to create config files for 6 nodes (3 controlplane nodes with IPs 1.1.1.1-1.1.1.3 and 3 worker nodes with IPs 1.1.1.4-1.1.1.6).
  • All of the nodes are of the same hardware specs.
  • The endpoint is a VIP of 1.1.1.10.

Your talconfig.yaml file should look something like this:

---
clusterName: main
endpoint: https://1.1.1.10:6443
overrideClientEndpoints:
  - 1.1.1.1
  - 1.1.1.2
  - 1.1.1.3
nodes:
  - hostname: controlplane
    ipAddress: 1.1.1.1 # this is needed although not being used. you'll still use this to do other `talhelper` commands like `gencommand`
    controlPlane: true
    ignoreHostname: true
    ...
  - hostname: worker
    ipAddress: 1.1.1.4 # not needed like above
    controlPlane: false
    ignoreHostname: true

This will create a structure like this in the current working directory:

clusterconfig/
  - main-controlplane.yaml
  - main-worker.yaml
  - talosconfig

The talosconfig will look something like this:

context: main
contexts:
  main:
    endpoints:
      - 1.1.1.1
      - 1.1.1.2
      - 1.1.1.3
...

And there will be no machine.network.hostname field in the generated main-controlplane.yaml and main-worker.yaml.

You can then run the talosctl apply-config command against all your nodes.
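
The endpoint selection described in this proposal can be sketched as follows. This is only an illustration of the proposed behavior; the Node type and clientEndpoints function are hypothetical names, not talhelper code.

```go
package main

import "fmt"

// Node is a hypothetical, trimmed-down view of a talconfig.yaml node entry.
type Node struct {
	IPAddress    string
	ControlPlane bool
}

// clientEndpoints returns the endpoints to write into talosconfig: the
// override list when it is non-empty, otherwise the IP address of every
// controlplane node (the current behavior).
func clientEndpoints(override []string, nodes []Node) []string {
	if len(override) > 0 {
		return override
	}
	var endpoints []string
	for _, n := range nodes {
		if n.ControlPlane {
			endpoints = append(endpoints, n.IPAddress)
		}
	}
	return endpoints
}

func main() {
	nodes := []Node{
		{IPAddress: "1.1.1.1", ControlPlane: true},
		{IPAddress: "1.1.1.4"},
	}
	fmt.Println(clientEndpoints(nil, nodes))
	fmt.Println(clientEndpoints([]string{"1.1.1.1", "1.1.1.2", "1.1.1.3"}, nodes))
}
```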

failed to generate talos config: missing kind

I am reorganizing my talhelper configs by splitting the patches into dedicated files that are referenced from the main file, but I am getting the following error: failed to generate talos config: missing kind

No configuration was added; I only moved existing patches into files.

talconfig.yaml

...
  - hostname: k8s-05
    ipAddress: 10.0.20.205
    controlPlane: false
    installDisk: /dev/nvme0n1
    nameservers:
      - 10.0.20.1
    networkInterfaces:
      - interface: eth0
        mtu: 1500
        addresses:
          - 10.0.20.205/24
        routes:
          - network: 0.0.0.0/0
            gateway: 10.0.20.1
    patches:
      - |-
        - op: add
          path: /machine/install/extraKernelArgs
          value:
            - nomodeset
        - "@./intel-ucode-patch.yaml"
...

intel-ucode-patch.yaml

- op: add
  path: /machine/install/extensions
  value:
    - image: ghcr.io/siderolabs/intel-ucode:20221108

VIP possible ?

Heylo, just found this awesome tool and wanted to test this on one of my clusters, but I am failing to apply a VIP configuration with the custom patch method:

talhelper genconfig
2023/02/26 03:51:13 failed to generate talos config: unknown keys found during decoding:
machine:
    network:
        vip:
            ip: 192.168.1.10

with a config of

clusterName: artemis
talosVersion: v1.3.5
kubernetesVersion: v1.26.0
endpoint: https://192.168.1.10:6443
domain: artemis.local
allowSchedulingOnMasters: true
additionalMachineCertSans:
  - 192.168.1.10
additionalApiServerCertSans:
  - artemis.local
clusterPodNets:
  - 10.244.0.0/16
clusterSvcNets:
  - 10.96.0.0/12
nodes:
  - hostname: master1
    ipAddress: 192.168.1.11
    installDisk: /dev/sda
    controlPlane: true
    disableSearchDomain: true
    nameservers:
      - 1.1.1.1
      - 8.8.8.8
    networkInterfaces:
      - interface: eth0
        addresses:
          - 192.168.1.11/16
        routes:
          - network: 0.0.0.0/0
            gateway: 192.168.178.1
        mtu: 1500
        dhcp: true
  - hostname: master2
    ipAddress: 192.168.1.12
    installDisk: /dev/sda
    controlPlane: true
    disableSearchDomain: true
    nameservers:
      - 1.1.1.1
      - 8.8.8.8
    networkInterfaces:
      - interface: eth0
        addresses:
          - 192.168.1.12/16
        routes:
          - network: 0.0.0.0/0
            gateway: 192.168.178.1
        mtu: 1500
        dhcp: true
  - hostname: master3
    ipAddress: 192.168.1.13
    installDisk: /dev/sda
    controlPlane: true
    disableSearchDomain: true
    nameservers:
      - 1.1.1.1
      - 8.8.8.8
    networkInterfaces:
      - interface: eth0
        addresses:
          - 192.168.1.13/16
        routes:
          - network: 0.0.0.0/0
            gateway: 192.168.178.1
        mtu: 1500
        dhcp: true
controlPlane:
  patches:
    - |-
      - op: add
        path: /machine/network/vip
        value:
          ip: 192.168.1.10

I think it's just the validation library complaining and/or missing this feature. I didn't really investigate yet. If I made any obvious errors, please let me know :-)

Config leaks between nodes

Hi! I noticed that the node attribute MachineDisks was applying to nodes that didn't have it configured.

You can test it on the example provided in the repo running go run ./main.go genconfig -c ./example/talconfig.yaml -o ./example/clusterconfig

# home-cluster-kworker1.yaml
  disks:
    - device: /dev/disk/by-id/ata-CT500MX500SSD1_2149E5EC1D9D
      partitions:
        - mountpoint: /var/mnt/sata

Looking at the code, I think this is the problem: nodeInput is reused while iterating over nodes, so anything that overrides the config conditionally keeps the value from a previous node:

if node.InstallDisk != "" {
	nodeInput.Options.InstallDisk = node.InstallDisk
}
if len(node.MachineDisks) > 0 {
	nodeInput.Options.MachineDisks = node.MachineDisks
}
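
The pattern can be reproduced in isolation. The sketch below uses simplified, hypothetical types (Options with plain string disks, not talhelper's real structs) to show the leak when one value is shared across iterations, and how creating a fresh value per node avoids it.

```go
package main

import "fmt"

// Node and Options are simplified stand-ins for the real talhelper types.
type Node struct {
	Hostname     string
	InstallDisk  string
	MachineDisks []string
}

type Options struct {
	InstallDisk  string
	MachineDisks []string
}

// buildLeaky shares one Options value across all iterations, so a field that
// is only set conditionally keeps the value from an earlier node.
func buildLeaky(nodes []Node) map[string]Options {
	out := make(map[string]Options, len(nodes))
	var opts Options // shared across iterations: this is the bug
	for _, node := range nodes {
		if node.InstallDisk != "" {
			opts.InstallDisk = node.InstallDisk
		}
		if len(node.MachineDisks) > 0 {
			opts.MachineDisks = node.MachineDisks
		}
		out[node.Hostname] = opts
	}
	return out
}

// buildFixed starts from a fresh Options value for every node.
func buildFixed(nodes []Node) map[string]Options {
	out := make(map[string]Options, len(nodes))
	for _, node := range nodes {
		var opts Options // fresh per node
		if node.InstallDisk != "" {
			opts.InstallDisk = node.InstallDisk
		}
		if len(node.MachineDisks) > 0 {
			opts.MachineDisks = node.MachineDisks
		}
		out[node.Hostname] = opts
	}
	return out
}

func main() {
	nodes := []Node{
		{Hostname: "kworker0", MachineDisks: []string{"/dev/sdb"}},
		{Hostname: "kworker1"}, // no MachineDisks configured
	}
	fmt.Println("leaky:", buildLeaky(nodes)["kworker1"].MachineDisks)
	fmt.Println("fixed:", buildFixed(nodes)["kworker1"].MachineDisks)
}
```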

Apply patch to all nodes

Currently it's only possible to apply patches to either controlplane nodes or worker nodes. Would it be possible to apply patches to all nodes without having the same code in the talconfig.yaml twice? (Or referencing patch files in 2 places / using YAML anchors)
Talosctl provides this possibility with the option to use a patch for all nodes with the --config-patch option.

Reference: https://www.talos.dev/v1.5/talos-guides/configuration/patching/#configuration-patching-with-talosctl-cli

Gencommand to get kubeconfig

Hi @budimanjojo

Would it be possible to also extend the gencommand for fetching the kubeconfig?
That way we can use talhelper end-2-end from creating config till bootstrapping the cluster.

overridePatches does not work as described

my config looks like this:

talosVersion: v1.6.7
endpoint: https://10.0.30.80:6443
additionalApiServerCertSans:
- home-cluster-dev.local
- k8s-dev.home.foo.bar
#- 10.0.30.80
#  - 10.0.30.81
#  - 10.0.30.82
#  - 10.0.30.83
additionalMachineCertSans:
  - 10.0.30.81
  - 10.0.30.82
  - 10.0.30.83
allowSchedulingOnMasters: true
nodes:
  - hostname: master1
    ipAddress: 10.0.30.81
    installDisk: /dev/sda
    controlPlane: true
    disableSearchDomain: true
    nameservers:
      - 1.1.1.1
      - 8.8.8.8
      - 8.8.4.4
    networkInterfaces:
      - interface: enp0s3
      #   addresses:
      #     - 10.0.30.81/24
      #   routes:
      #     - network: 0.0.0.0/0
      #       gateway: 10.0.30.1
      #   mtu: 1500
        dhcp: true
        vip:
          ip: 10.0.30.80
    schematic:
      customization:
        systemExtensions:
          officialExtensions:
            - siderolabs/gasket-driver
            - siderolabs/intel-ucode
            - siderolabs/iscsi-tools
            - siderolabs/util-linux-tools
            - siderolabs/zfs
  - hostname: master2
    ipAddress: 10.0.30.82
    installDisk: /dev/sda
    controlPlane: true
    disableSearchDomain: true
    nameservers:
      - 1.1.1.1
      - 8.8.8.8
      - 8.8.4.4
    networkInterfaces:
      - interface: enp0s3
        # addresses:
        #   - 10.0.30.82/24
        # routes:
        #   - network: 0.0.0.0/0
        #     gateway: 10.0.30.1
        # mtu: 1500
        dhcp: true
        vip:
          ip: 10.0.30.80
    schematic:
      customization:
        systemExtensions:
          officialExtensions:
            - siderolabs/gasket-driver
            - siderolabs/intel-ucode
            - siderolabs/iscsi-tools
            - siderolabs/util-linux-tools
            - siderolabs/zfs
  - hostname: master3
    ipAddress: 10.0.30.83
    installDisk: /dev/sda
    controlPlane: true
    disableSearchDomain: true
    nameservers:
      - 1.1.1.1
      - 8.8.8.8
      - 8.8.4.4
    networkInterfaces:
      - interface: enp0s3
        # addresses:
        #   - 10.0.30.83/24
        # routes:
        #   - network: 0.0.0.0/0
        #     gateway: 10.0.30.1
        # mtu: 1500
        dhcp: true
        vip:
          ip: 10.0.30.80
    schematic:
      customization:
        systemExtensions:
          officialExtensions:
            - siderolabs/gasket-driver
            - siderolabs/intel-ucode
            - siderolabs/iscsi-tools
            - siderolabs/util-linux-tools
            - siderolabs/zfs
    overridePatches: false
    patches:
      - |-
        - op: add
          path: /machine/kubelet/extraMounts
          value:
            - destination: /var/mnt/tank
              type: bind
              source: /var/mnt/tank
              options:
                - bind
                - rshared
                - rw
  # - hostname: worker1
  #   ipAddress: 10.0.30.84
  #   installDisk: /dev/nvme1
  #   controlPlane: false
controlPlane:
  kernelModules:
    - name: zfs
  patches:
    - |-
      - op: add
        path: /cluster/proxy/extraArgs
        value:
          metrics-bind-address: "0.0.0.0:10249"
      - op: add
        path: /machine/kubelet/extraArgs
        value:
          feature-gates: GracefulNodeShutdown=true
          rotate-server-certificates: "true"
      - op: add
        path: /machine/kubelet/extraMounts
        value:
          - destination: /var/lib/longhorn
            type: bind
            source: /var/lib/longhorn
            options:
              - bind
              - rshared
              - rw
          - destination: /var/lib/local-path-provisioner
            type: bind
            source: /var/lib/local-path-provisioner
            options:
              - bind
              - rshared
              - rw
      - op: add
        path: /cluster/extraManifests
        value:
          - https://raw.githubusercontent.com/alex1989hu/kubelet-serving-cert-approver/main/deploy/standalone-install.yaml
          - https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
worker:
  kernelModules:
    - name: zfs
  patches:
    - |-
      - op: add
        path: /machine/kubelet/extraArgs
        value:
          feature-gates: GracefulNodeShutdown=false
          rotate-server-certificates: "true"
      - op: add
        path: /machine/kubelet/extraMounts
        value:
          - destination: /var/lib/longhorn
            type: bind
            source: /var/lib/longhorn
            options:
              - bind
              - rshared
              - rw
          - destination: /var/lib/local-path-provisioner
            type: bind
            source: /var/lib/local-path-provisioner
            options:
              - bind
              - rshared
              - rw



Based on the docs, I expected master3 to get an extra mount, but its generated config looks like that of all the other nodes.

My first guess was that the path /machine/kubelet/extraMounts is wrong, but /machine/kubelet/extraMounts/- gives me this error message:
2024/03/23 14:25:19 failed to generate talos config: failure applying rfc6902 patches to talos machine config: add operation does not apply: doc is missing path: "/machine/kubelet/extraMounts/-": missing value

talhelper gencommand upgrade isn't SecureBoot compatible / doesn't use Schematic ID

I'm trying to upgrade my cluster to Talos 1.6.2 with talhelper 1.17.
I would like to use the talhelper gencommand upgrade command, which works fine for generating upgrade commands that point to the standard Talos Linux image.

However, I'm using a different schematic ID (97bf8e92fc6bba0f03928b859c08295d7615737b29db06a97be51dc63004e403), which refers to:

customization:
    systemExtensions:
        officialExtensions:
            - siderolabs/i915-ucode
            - siderolabs/intel-ucode

I'm also using SecureBoot ISOs, which should fetch from a different factory URL:
factory.talos.dev/installer-secureboot/97bf8e92fc6bba0f03928b859c08295d7615737b29db06a97be51dc63004e403 (installer-secureboot instead of installer)

Is there a way to look at the talosImageURL referenced in the talconfig.yaml to achieve this?
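
For reference, the difference between the two image references is only the path segment, so building the right one is straightforward once the schematic ID and the SecureBoot choice are known. The sketch below is an illustration only; installerImage is a hypothetical function, and the <id>:<version> format follows the factory URLs quoted in these issues.

```go
package main

import "fmt"

// installerImage builds a factory installer image reference for the given
// schematic ID and Talos version, switching to the "installer-secureboot"
// path when secureBoot is true.
func installerImage(schematicID, talosVersion string, secureBoot bool) string {
	name := "installer"
	if secureBoot {
		name = "installer-secureboot"
	}
	return "factory.talos.dev/" + name + "/" + schematicID + ":" + talosVersion
}

func main() {
	id := "97bf8e92fc6bba0f03928b859c08295d7615737b29db06a97be51dc63004e403"
	fmt.Println(installerImage(id, "v1.6.2", false))
	fmt.Println(installerImage(id, "v1.6.2", true))
}
```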

genconfig generates new certificates each time

When building the cluster initially and then running talhelper genconfig it all works fine.

If you delete the generated talosconfig and run talhelper genconfig again, you get a different ca, crt and key from the same configuration.

Reproduction steps:

  1. Have a talconfig.yaml, talenv.sops.yaml and talsecret.sops.yaml
  2. Run talhelper genconfig --env-file talenv.sops.yaml --secret-file talsecret.sops.yaml --config-file talconfig.yaml
  3. Copy the clusterconfig folder (to /tmp/old in this example)
  4. Delete the clusterconfig folder
  5. Run step 2 again
  6. Run diff clusterconfig/talosconfig /tmp/old/talosconfig

They are different, which makes me lose access to the Talos API.

talhelper genconfig complains that amdgpu-firmware is not a supported talos extension

Defined amdgpu-firmware in my schematic extension block:

schematic: &schematic
  customization:
    systemExtensions:
      officialExtensions:
        - siderolabs/amdgpu-firmware

When talhelper genconfig is executed, the following error is displayed:

There are issues with your talhelper config file:
field: "nodes[0].schematic"
  * "siderolabs/amdgpu-firmware" is not a supported Talos extension

It is listed under Firmware in the Extension Catalog
https://github.com/siderolabs/extensions

Improve validation for version string

Using a beta tag for talosVersion like this:

talosVersion: v1.4.0-beta.1

makes talhelper fail with this:

✖ talhelper genconfig 
There are issues with your talhelper config file:
- TalosVersion field did not pass validation

even though that is a valid Talos version.

The current way of validating the Talos and Kubernetes versions is this function:

func (c Config) IsVersion(version string) bool {

which uses a regex to match the version number.

The appropriate way to validate a Talos version would be to use their validator, but it seems they currently don't have one: they download the image from the registry and error out when it returns an error response code. I don't want talhelper to require an internet connection to validate a version number.

Therefore, my current idea is to improve the version check by using the official semver regex instead, which can be viewed here: https://semver.org/#is-there-a-suggested-regular-expression-regex-to-check-a-semver-string
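
A minimal sketch of that idea is shown below. The regular expression is the numbered-capture-group variant suggested on semver.org, which Go's regexp package accepts as-is. Stripping a leading "v" before matching is an assumption made here because Talos versions are written as v1.4.0-beta.1; the standalone isVersion function is a hypothetical stand-in for the method above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// semverRe is the regular expression suggested on semver.org.
var semverRe = regexp.MustCompile(`^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$`)

// isVersion reports whether version is a valid semantic version, allowing an
// optional leading "v" (an assumption of this sketch).
func isVersion(version string) bool {
	return semverRe.MatchString(strings.TrimPrefix(version, "v"))
}

func main() {
	for _, v := range []string{"v1.4.0-beta.1", "v1.5.2", "v1.4", "latest"} {
		fmt.Println(v, isVersion(v))
	}
}
```

No network access is needed for this check, which matches the constraint stated above.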

Generate per node factory url for `genurl` commands

Right now, talhelper genurl iso and talhelper genurl installer don't actually generate the URL based on what you put in the talconfig.yaml file. It makes more sense for the generated URLs to be derived from the config file. Let's say I have this block inside my talconfig.yaml:

talosVersion: v1.5.5
nodes:
  - hostname: machine1
    schematic:
       customization:
         extraKernelArgs:
           - net.ifnames=0
         systemExtensions:
           officialExtensions:
             - siderolabs/intel-ucode
             - siderolabs/tailscale
  - hostname: machine2

When I run talhelper genurl installer from a path where talconfig.yaml is found, it should return something like this:

machine1: factory.talos.dev/installer/e019fb5f6bf0a685e805a1263323cf8c38feefdc5a4cecbefbd5ba58e4383a62:v1.5.5
machine2: factory.talos.dev/installer/376567988ad370138ad8b2698212367b8edcb69b5fd68c80be1f2ec7d603b4ba:v1.5.5

But when I run talhelper genurl installer where no talconfig.yaml is found, it shouldn't error out. Instead, it should return something like this:

factory.talos.dev/installer/376567988ad370138ad8b2698212367b8edcb69b5fd68c80be1f2ec7d603b4ba:v1.5.5

With this change, the --extension and --kernel-args flags will be ignored when talconfig.yaml is found, because you want to use the ones defined in the config file instead.

This will make upgrading Talos using talosctl upgrade so much easier. For example, to update machine1, you will run:

talosctl upgrade -n <machine1> --image factory.talos.dev/installer/e019fb5f6bf0a685e805a1263323cf8c38feefdc5a4cecbefbd5ba58e4383a62:v1.5.5

where the --image value is taken from the talhelper genurl installer output above.

nodes[*].installDiskSelector.size should not be required

nodes[*].installDiskSelector.size is not actually required in the talos config.

I tried using the size reported by talosctl disks, but that doesn't work (that's an issue with talos) -- it's probably a rounding error. For example, talosctl disks returns a size of 128 GB, but talosctl dry-run -f node1.yaml fails with "no disk found matching provided parameters".

As a workaround, switching the size to <= 128.1 GB makes everything work.

Improve validation for Supported Talos version

Before this commit (6396cd7), the check for a supported Talos version only validated that the MajorMinor of the talosVersion field was in the range listed at https://budimanjojo.github.io/talhelper/latest/reference/supported-version/.

After the commit, the check validates that the exact version number is in https://github.com/budimanjojo/talhelper/blob/master/pkg/config/schemas/talos-extensions.json. This was done because of the addition of Schematic on the Talos side.
This makes sure that the check for the node schematic

func checkNodeSchematic(node Node, idx int, talosVersion string, result *Errors) *Errors {
can be done correctly.

This works as expected. But there is a downside: if Talos releases a new version and talhelper hasn't released an update that includes the latest talos-extensions.json file, or you haven't updated talhelper yet, then you'll get the false error "vx.x.x" is not a supported Talos version.

Falling back to the old behavior is not a good option here, because it would make checking for the correct systemExtensions for that version unrealistic. Right now it will just fall back to "v1.5.5"

if !OfficialExtensions.Contains(talosVersion) {
    // fallback to v1.5.5
    talosVersion = "v1.5.5"
}

But at least that is better than having to wait for me to release a new version.

My plan is to make this a warning instead of an error, so you can still generate the configs. Why should there be any warning or error at all? Because the system extensions you specify might not work properly if the version is not yet included in the talhelper binary you're using (the list is built into the program at release time). For example, v1.6.0 supports 30 system extensions while v1.5.5 only supports 19. I think I should also fall back to the latest stable version instead of v1.5.5 for the schematic validation.
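
One way the proposed behavior could look, as a sketch under assumptions: resolveVersion is a hypothetical helper, and the known list is assumed to be sorted in ascending order so that its last entry is the latest bundled version.

```go
package main

import "fmt"

// resolveVersion returns the version to validate system extensions
// against, plus a warning message. The warning is empty when the
// requested version is known to this build.
func resolveVersion(requested string, known []string) (string, string) {
	for _, v := range known {
		if v == requested {
			return requested, ""
		}
	}
	if len(known) == 0 {
		return requested, fmt.Sprintf("%q cannot be validated: no versions are bundled", requested)
	}
	// Assumption: known is sorted ascending, so the last entry is the latest.
	latest := known[len(known)-1]
	return latest, fmt.Sprintf("%q is not known to this build; validating system extensions against %s instead", requested, latest)
}

func main() {
	known := []string{"v1.5.5", "v1.6.0"} // hypothetical bundled list

	version, warning := resolveVersion("v1.6.1", known)
	if warning != "" {
		fmt.Println("WARNING:", warning)
	}
	fmt.Println("validating against", version)
}
```

Returning the warning as a value rather than an error lets the caller print it and continue generating the configs.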

If anyone has a better idea or a different take on this, feel free to comment here. Thanks in advance.

[feat] Define bootstrap node and allow generating bootstrap command

With 2 out of 3 most important commands done (apply and upgrade respectively), it would be nice if we could also generate the bootstrap command.

What we need for that is to allow a single(!) control plane node to be flagged as bootstrap: true (or something comparable).
Then we could use its IP to generate a bootstrap command as well.
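
A rough sketch of the idea, assuming a hypothetical Bootstrap field on the node definition. The Node struct here is illustrative and does not mirror talhelper's actual config types.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a hypothetical, trimmed-down node definition.
type Node struct {
	Hostname     string
	IPAddress    string
	ControlPlane bool
	Bootstrap    bool // hypothetical flag proposed in this issue
}

// bootstrapCommand returns the talosctl bootstrap command for the single
// control plane node flagged as bootstrap, or an error if the flag is
// missing, duplicated, or set on a worker.
func bootstrapCommand(nodes []Node) (string, error) {
	var chosen *Node
	for i := range nodes {
		n := &nodes[i]
		if !n.Bootstrap {
			continue
		}
		if !n.ControlPlane {
			return "", fmt.Errorf("node %q is flagged bootstrap but is not a control plane node", n.Hostname)
		}
		if chosen != nil {
			return "", fmt.Errorf("nodes %q and %q are both flagged bootstrap", chosen.Hostname, n.Hostname)
		}
		chosen = n
	}
	if chosen == nil {
		return "", errors.New("no control plane node is flagged bootstrap")
	}
	return fmt.Sprintf("talosctl bootstrap --nodes %s", chosen.IPAddress), nil
}

func main() {
	nodes := []Node{
		{Hostname: "cp1", IPAddress: "10.0.0.1", ControlPlane: true, Bootstrap: true},
		{Hostname: "cp2", IPAddress: "10.0.0.2", ControlPlane: true},
	}
	cmd, err := bootstrapCommand(nodes)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cmd) // talosctl bootstrap --nodes 10.0.0.1
}
```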

panic: runtime error: invalid memory address or nil pointer dereference

Version info

talhelper version 1.16.1 (installed via brew)
Ubuntu 22.04.3 LTS (bare metal)

Entered within command line environment (one or the other does not matter):
talhelper genconfig --env-file talenv.sops.yaml --secret-file talsecret.sops.yaml --config-file talconfig.yaml
or
talhelper genconfig

Results in (unfortunately):

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xed39a3]

goroutine 1 [running]:
github.com/budimanjojo/talhelper/pkg/talos.applyNodeOverride(0xc0001409a0, {0x1374bf8, 0xc00048bb40})
        github.com/budimanjojo/talhelper/pkg/talos/nodeconfig.go:99 +0x423
github.com/budimanjojo/talhelper/pkg/talos.GenerateNodeConfig(0xc0001409a0, 0xc000130a80, 0xc00048c3f0?, 0x2f?)
        github.com/budimanjojo/talhelper/pkg/talos/nodeconfig.go:44 +0x134
github.com/budimanjojo/talhelper/pkg/talos.GenerateNodeConfigBytes(0xc0001a0c00?, 0x11baffe?, 0x13?, 0x6c?)
        github.com/budimanjojo/talhelper/pkg/talos/nodeconfig.go:15 +0x13
github.com/budimanjojo/talhelper/pkg/generate.GenerateConfig(0xc0001a0c00, 0x0, {0x11baffe, 0xf}, {0xc000158198?, 0x0?}, {0x11aefb0, 0x5}, 0x0?)
        github.com/budimanjojo/talhelper/pkg/generate/config.go:32 +0x31e
github.com/budimanjojo/talhelper/cmd.glob..func5(0xc00019ca00?, {0xc000114600?, 0x4?, 0x11ae663?})
        github.com/budimanjojo/talhelper/cmd/genconfig.go:46 +0x22c
github.com/spf13/cobra.(*Command).execute(0x1b65900, {0xc0001145a0, 0x6, 0x6})
        github.com/spf13/[email protected]/command.go:987 +0xaa3
github.com/spf13/cobra.(*Command).ExecuteC(0x1b66a40)
        github.com/spf13/[email protected]/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/[email protected]/command.go:1039
github.com/budimanjojo/talhelper/cmd.Execute(...)
        github.com/budimanjojo/talhelper/cmd/root.go:53
main.main()
        github.com/budimanjojo/talhelper/main.go:10 +0x1b

Have a `--debug` flag to show more information

It would be nice to show more information about what the program is doing, for example the current config file being loaded, what environment variables are being set, etc.

This should be controlled by a --debug flag that is disabled by default.

missing core components

Currently, the config generated by talhelper does not contain the controller-manager and scheduler images, and talhelper shows certain configs as "{}". I could not patch it either.

You can reproduce this issue by generating a default config with both talosctl and talhelper and comparing them.

Update documentation with common workflows

The documentation seems fine for getting up and running. What's missing is how to make follow-on changes to the configuration, how to apply them to the servers, etc.

If this belongs in the talosctl documentation instead, then maybe add a reference to it.

Simplify Talos upgrades

Currently, the workflow to upgrade a Talos cluster involves fetching the right installer image from the Talos factory and then making sure to specify it in the talosctl upgrade command.

The genurl subcommand definitely helps in trimming down the process, but I believe this could be improved further. Since the talconfig.yaml already contains the system extensions for each node, the kernel params and the talos version, I propose a new subcommand which would print out the entire talosctl upgrade command.

Similar to #214, I think we should be able to do something like this:

root@talhelper-devbox:/workspace# talhelper gencommand --upgrade
talosctl upgrade --talosconfig ./clusterconfig/talosconfig --nodes 10.0.10.195 --image factory.talos.dev/<image-hash>:<talos-version>;

So that I can simply do this to upgrade my entire cluster:

talhelper gencommand --upgrade | bash
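
A sketch of how such output could be assembled, one command per node. The Node struct and upgradeCommands are hypothetical; the image reference follows the factory.talos.dev/installer/<schematic-id>:<version> form shown in the genurl example earlier on this page.

```go
package main

import "fmt"

// Node is a hypothetical, trimmed-down node definition.
type Node struct {
	IPAddress   string
	SchematicID string // as returned by the image factory for this node
}

// upgradeCommands builds one talosctl upgrade command per node.
func upgradeCommands(talosconfig, talosVersion string, nodes []Node) []string {
	cmds := make([]string, 0, len(nodes))
	for _, n := range nodes {
		image := fmt.Sprintf("factory.talos.dev/installer/%s:%s", n.SchematicID, talosVersion)
		cmds = append(cmds, fmt.Sprintf(
			"talosctl upgrade --talosconfig %s --nodes %s --image %s;",
			talosconfig, n.IPAddress, image))
	}
	return cmds
}

func main() {
	nodes := []Node{{
		IPAddress:   "10.0.10.195",
		SchematicID: "e019fb5f6bf0a685e805a1263323cf8c38feefdc5a4cecbefbd5ba58e4383a62",
	}}
	for _, cmd := range upgradeCommands("./clusterconfig/talosconfig", "v1.5.5", nodes) {
		fmt.Println(cmd)
	}
}
```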

Plan for the upcoming upstream gen secret subcommand

This follows the PR merged upstream in Talos: siderolabs/talos#5870. It's a great idea that I didn't think of when I wrote this tool. I now have some ideas about how this could be implemented. Which approach should I take?

  1. Have talhelper gensecret output YAML-encoded data like talosctl gen secret. talhelper genconfig will read a file called talsecret.yaml, if it exists, to generate the manifests. This preserves the current talenv.yaml and is not a breaking change. The talhelper gensecret --patch-configfile flag will be removed, though.
  2. Have talhelper gensecret output YAML-encoded data like talosctl gen secret. talhelper gensecret --patch-envfile will patch the talenv.yaml file with a new field, secretBundle, that contains the mapping of the secrets. talhelper genconfig will generate the manifests with those secrets if the field is specified, so this shouldn't be a breaking change either. The talhelper gensecret --patch-configfile flag will be removed too.
  3. Ignore it and just continue doing things the current way.

I will begin working on this once the talosctl version with this PR is released and I have decided which route to take.

Renovate Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Repository problems

Renovate tried to run on this repository, but found these problems.

  • WARN: Found renovate config warnings

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

devcontainer
.devcontainer/devcontainer.json
  • mcr.microsoft.com/devcontainers/go 1-1.21-bookworm
  • ghcr.io/devcontainers/features/docker-in-docker 2
  • ghcr.io/devcontainers-contrib/features/mkdocs 2
  • ghcr.io/devcontainers-contrib/features/sops 1
  • ghcr.io/devcontainers-contrib/features/age 1
  • ghcr.io/devcontainers-contrib/features/age-keygen 1
dockerfile
Dockerfile
  • golang 1.22.3-alpine3.18
github-actions
.github/workflows/deploy-docs.yaml
  • actions/checkout v4
  • cachix/install-nix-action v27
  • workflow/nix-shell-action v3.3.2
.github/workflows/docker-build-and-push.yaml
  • actions/checkout v4.1.7@692973e3d937129bcbf40652eb9f2f61becf3332
  • docker/setup-qemu-action v3
  • docker/setup-buildx-action v3
  • docker/login-action v3
  • docker/metadata-action v5
  • docker/build-push-action v5
  • actions/upload-artifact v4
  • actions/download-artifact v4
  • docker/setup-buildx-action v3
  • docker/metadata-action v5
  • docker/login-action v3
.github/workflows/generate-cli-ref-doc.yaml
  • actions/checkout v4
  • actions/setup-go v5
  • tibdex/github-app-token v2
  • peter-evans/create-pull-request v6
  • peter-evans/enable-pull-request-automerge v3
.github/workflows/golangci-lint.yaml
  • actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
  • actions/setup-go v5
  • golangci/golangci-lint-action v6
.github/workflows/release.yaml
  • actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
  • actions/setup-go v5
  • goreleaser/goreleaser-action v5
.github/workflows/renovate.yaml
  • actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
  • tibdex/github-app-token v2
  • renovatebot/github-action v40.1.12
.github/workflows/test.yaml
  • actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
  • actions/setup-go v5
.github/workflows/update-extensions-schema.yaml
  • actions/checkout v4
  • actions/setup-go v5
  • actions/cache v4
  • tibdex/github-app-token v2
  • peter-evans/create-pull-request v6
  • peter-evans/enable-pull-request-automerge v3
.github/workflows/update-flake.yaml
  • actions/checkout v4
  • cachix/install-nix-action v27
  • workflow/nix-shell-action v3.3.2
  • tibdex/github-app-token v2
  • peter-evans/create-pull-request v6
.github/workflows/update-json-schema.yaml
  • actions/checkout v4
  • actions/setup-go v5
  • tibdex/github-app-token v2
  • peter-evans/create-pull-request v6
  • peter-evans/enable-pull-request-automerge v3
gomod
go.mod
  • go 1.22.3
  • github.com/a8m/envsubst v1.4.2
  • github.com/evanphx/json-patch v5.9.0+incompatible
  • github.com/fatih/color v1.17.0
  • github.com/getsops/sops/v3 v3.9.0
  • github.com/gookit/validate v1.5.2
  • github.com/hashicorp/go-multierror v1.1.1
  • github.com/hexops/gotextdiff v1.0.3
  • github.com/invopop/jsonschema v0.12.0
  • github.com/joho/godotenv v1.5.1
  • github.com/sabhiram/go-gitignore v0.0.0-20210923224102-525f6e181f06@525f6e181f06
  • github.com/siderolabs/image-factory v0.4.1
  • github.com/siderolabs/net v0.4.0
  • github.com/siderolabs/talos/pkg/machinery v1.8.0-alpha.0.0.20240514132626-b86edc6776f7
  • github.com/spf13/cobra v1.8.1
  • golang.org/x/mod v0.18.0
  • gopkg.in/yaml.v3 v3.0.1
  • sigs.k8s.io/yaml v1.4.0
hack/tsehelper/go.mod
  • go 1.22.3
  • github.com/budimanjojo/talhelper/v3 v3.0.2
  • github.com/google/go-containerregistry v0.19.2
  • github.com/sirupsen/logrus v1.9.3
  • gopkg.in/yaml.v3 v3.0.1
regex
pkg/config/defaults.go
  • siderolabs/talos v1.7.5
pkg/config/defaults.go
  • siderolabs/talos v1.7.5

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

This repository currently has no open or pending branches.

Detected dependencies

github-actions
.github/workflows/golangci-lint.yaml
  • actions/setup-go v3
  • actions/checkout v3
  • golangci/golangci-lint-action v3
.github/workflows/release.yaml
  • actions/setup-go v3
  • actions/checkout v3
  • goreleaser/goreleaser-action v3
.github/workflows/test.yaml
  • actions/setup-go v3
  • actions/checkout v3
gomod
go.mod
  • go 1.19
  • github.com/a8m/envsubst v1.3.0
  • github.com/evanphx/json-patch v5.6.0+incompatible
  • github.com/fatih/color v1.13.0
  • github.com/ghodss/yaml v1.0.0
  • github.com/gookit/validate v1.4.2
  • github.com/joho/godotenv v1.4.0
  • github.com/sabhiram/go-gitignore v0.0.0-20210923224102-525f6e181f06@525f6e181f06
  • github.com/spf13/cobra v1.5.0
  • github.com/talos-systems/crypto v0.3.6
  • github.com/talos-systems/net v0.3.2
  • github.com/talos-systems/talos/pkg/machinery v1.1.2
  • go.mozilla.org/sops/v3 v3.7.3
  • gopkg.in/yaml.v3 v3.0.1
  • sigs.k8s.io/yaml v1.3.0
regex
pkg/config/defaults.go
  • siderolabs/talos v1.1.2

  • Check this box to trigger a request for Renovate to run again on this repository

Upgrading to 1.3.X removes AESCBCEncryptionSecret

Per the release docs, the AESCBC key should remain on clusters that already have it configured (so they can decrypt existing secrets).

Changing the version in talconfig.yaml, running genconfig, and diff (talosctl apply-config --dry-run) yields:

-               ClusterAESCBCEncryptionSecret:    "<REDACTED FOR THE INTERNET>",
+               ClusterAESCBCEncryptionSecret:    "",
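
The expected behavior could be sketched like this: when regenerating for a newer version, carry over an AESCBC key that the cluster already has instead of blanking it. The Secrets struct and carryOverSecrets are hypothetical and do not reflect the real secrets bundle type in Talos or talhelper.

```go
package main

import "fmt"

// Secrets is a hypothetical, trimmed-down view of the cluster secrets.
type Secrets struct {
	AESCBCEncryptionSecret    string
	SecretboxEncryptionSecret string
}

// carryOverSecrets keeps an existing AESCBC key in the newly generated
// secrets so that a version bump never removes a key the cluster still
// needs to decrypt existing data.
func carryOverSecrets(existing, generated Secrets) Secrets {
	out := generated
	if existing.AESCBCEncryptionSecret != "" {
		out.AESCBCEncryptionSecret = existing.AESCBCEncryptionSecret
	}
	return out
}

func main() {
	existing := Secrets{AESCBCEncryptionSecret: "old-key"} // placeholder value
	generated := Secrets{SecretboxEncryptionSecret: "new-key"}

	merged := carryOverSecrets(existing, generated)
	fmt.Println(merged.AESCBCEncryptionSecret != "") // true
}
```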

Provide Docker Image

I think it would be helpful to also distribute this in the form of a docker image.

[feat] add talosctl upgrade-k8s to gencommand

Currently we can define the Kubernetes version in talconfig.
That's nice, but it is not automatically respected or updated when running talosctl upgrade-k8s.

It would be nice if gencommand could output talosctl upgrade-k8s, including the version referenced in talconfig. Otherwise there will be awkward mismatches between what is running in the cluster and what talconfig says.
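
A small sketch of what the generated command could look like. upgradeK8sCommand is a hypothetical helper; it assumes the version in talconfig carries a leading "v" and strips it, since the talosctl upgrade-k8s --to flag accepts a plain version such as 1.28.3.

```go
package main

import (
	"fmt"
	"strings"
)

// upgradeK8sCommand builds a talosctl upgrade-k8s command that targets
// one control plane node and the Kubernetes version from the config.
func upgradeK8sCommand(controlPlaneIP, kubernetesVersion string) string {
	// Assumption: the config stores "v1.28.3"; pass "1.28.3" to --to.
	to := strings.TrimPrefix(kubernetesVersion, "v")
	return fmt.Sprintf("talosctl upgrade-k8s --nodes %s --to %s", controlPlaneIP, to)
}

func main() {
	fmt.Println(upgradeK8sCommand("10.0.10.195", "v1.28.3"))
	// talosctl upgrade-k8s --nodes 10.0.10.195 --to 1.28.3
}
```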

[bug] images do not get "registered" with factory

It seems that before images can be downloaded from the factory, they first need to be registered by making a GET request to the factory.

It would be best to implement this behavior wherever it's relevant, but the most logical place would likely be "genconfig", as that ensures the config built from said images references images that actually exist.

Maybe it would even be best to fetch the image strings from the factory directly, to ensure they are properly registered?

One might hit this when the order of extensions does not match the website's order (which would always trigger this), when one is the first to use a certain kernel argument or a certain meta value, or when one is the first to pull a certain image at all.


The get request is documented here:
https://github.com/siderolabs/image-factory


Issues you hit when an image is not "registered" are shown here:

siderolabs/image-factory#61

Though thoroughly underdocumented.
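
A sketch of the registration step, assuming it corresponds to the schematic-creation request in the image-factory README: a POST to /schematics with the schematic as the request body, answered with a JSON object containing an id field. The helper names, the content type, and the example schematic are assumptions, and running main requires network access to the factory.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
)

// parseSchematicID extracts the schematic ID from the factory's JSON reply.
func parseSchematicID(body []byte) (string, error) {
	var resp struct {
		ID string `json:"id"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	if resp.ID == "" {
		return "", errors.New("factory response did not contain a schematic ID")
	}
	return resp.ID, nil
}

// registerSchematic submits the schematic to the factory and returns the
// ID that the factory reports for it.
func registerSchematic(factoryURL string, schematicYAML []byte) (string, error) {
	// Assumption: the factory accepts a YAML body with this content type.
	res, err := http.Post(factoryURL+"/schematics", "application/yaml", bytes.NewReader(schematicYAML))
	if err != nil {
		return "", err
	}
	defer res.Body.Close()

	body, err := io.ReadAll(res.Body)
	if err != nil {
		return "", err
	}
	if res.StatusCode < 200 || res.StatusCode > 299 {
		return "", fmt.Errorf("factory returned %s", res.Status)
	}
	return parseSchematicID(body)
}

func main() {
	// Example schematic with a single official extension.
	schematic := []byte("customization:\n  systemExtensions:\n    officialExtensions:\n      - siderolabs/tailscale\n")

	id, err := registerSchematic("https://factory.talos.dev", schematic)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("schematic ID:", id)
}
```

Using the ID returned by the factory, rather than one computed locally, would also guarantee that the image string in the generated config refers to a schematic the factory knows about.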

Machine files support

Hi @budimanjojo !

I was trying to set up Tailscale as a Talos extension and it needs machine/files support. I was wondering if you could add that in, pretty please?

machine:
  install:
    extensions:
      - image: ghcr.io/siderolabs/tailscale:1.44.0
  files:
    - content: |
        TS_AUTHKEY=<your auth key>
      permissions: 0o644
      path: /var/etc/tailscale/auth.env
      op: create

Action Required: Fix Renovate Configuration

There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.

Location: .github/renovate.json
Error type: Invalid JSON (parsing failed)
Message: Syntax error: expecting String near } ], }
