deliveryhero / helm-charts Goto Github PK

View Code? Open in Web Editor NEW

466.0 466.0 271.0 2.64 MB

Helm Charts ⛵ @ Delivery Hero ⭐

License: Apache License 2.0

Smarty 31.54% Shell 7.14% Open Policy Agent 4.30% Python 12.64% Mustache 44.16% Dockerfile 0.21%

helm-charts's People

Contributors

Stargazers

Watchers

Forkers

ronnymaas nreymundo fernandosiq-dh lukas-reining cnmcavoy tle15 flmarins wintera-asto ergton invidian pavlozabudskyi zabudskyi atkei simonhargos hongbo-miao ppalucha alyragab yanetleon-qz carlindesautels tthietanen jeff-l-tsai avoinsystems michalbcz en-en balak-cv elmariofredo scott45 constantin07 joshuastern js-timbirkett bilalcaliskan linzhengen hvvit alex-mozejko hemanthmudalaiah yurrriq gbavoz1 androna-xm pankajpreet matthew-beckett brunoterkaly satoruitaya mattykuzyk anatoliistepaniuk youngwookim doytsujin muratyetimov offerpad yosri-alkishawi avneet-bansal hochuenw-dd tomikiss devholic choseh rj7 invia-de zaza-dh yuenm18 shulima spirosoik alekc-forks ghazgkull cyberw sudhiratcloudera denisgolius heojay ds-evo ckavili arano-kai bangpound com6056 nujragan gelarpambudi piyushbhatewara pierophp runlight bcthuringer pcfens izzie88 kjayantmenon chotiwat blend xirdigh hixio-mh ithings-ch maxisam fivesheep alexrogalskiy aiceball gabegorelick cognifloyd digihide therealdwright lander2k2 timbo-desosa-volusion giffgaff vtatarin ashishwakchaure georgevazj mikebranstein

helm-charts's Issues

Using locust chart without config map

Hey folks 👋 !

Trying to gauge interest for an option to point to a locust file within the container. I'm working on a setup with this locust chart that makes it a bit complicated to use a config map and I would rather just point to a file inside a custom made image.

Currently the only way to do this would be to

override the image for both worker and master
change the args for both image to point to the path where the file is

the problem with that is that I would lose all the templated values currently set in the locust command.
Before just making a PR I wanted to get your opinion on an option for a local_locust_file here -> https://github.com/deliveryhero/helm-charts/blob/master/stable/locust/templates/worker-deployment.yaml#L51

This combined with a custom image would allow me to completely bypass the configmap and point to whatever I have in the container.

[stable/locust] ValueError: invalid literal for int() with base 10: ''

I tried to install to minikube the locust with the following command as you proposed :
helm install my-release deliveryhero/locust

but as soon as i start swarming i see these logs in my master node

Traceback (most recent call last): File "/usr/local/lib/python3.8/site-packages/gevent/pywsgi.py", line 999, in handle_one_response self.run_application() File "/usr/local/lib/python3.8/site-packages/gevent/pywsgi.py", line 945, in run_application self.result = self.application(self.environ, self.start_response) File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2464, in __call__ return self.wsgi_app(environ, start_response) File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2450, in wsgi_app response = self.handle_exception(e) File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1867, in handle_exception reraise(exc_type, exc_value, tb) File "/usr/local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise raise value File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app response = self.full_dispatch_request() File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request rv = self.handle_user_exception(e) File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception reraise(exc_type, exc_value, tb) File "/usr/local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise raise value File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request rv = self.dispatch_request() File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request return self.view_functions[rule.endpoint](**req.view_args) File "/usr/local/lib/python3.8/site-packages/locust/web.py", line 366, in wrapper return view_func(*args, **kwargs) File "/usr/local/lib/python3.8/site-packages/locust/web.py", line 148, in swarm user_count = int(request.form["user_count"]) ValueError: invalid literal for int() with base 10: '' 2021-04-23T10:13:43Z {'REMOTE_ADDR': '::ffff:127.0.0.1', 'REMOTE_PORT': '59420', 'HTTP_HOST': 'localhost:8089', (hidden keys: 33)} failed with ValueError

How could i solve this?

Latest Version of Locust uses venv

Hello,

The latest version of locust uses venv so its not possible to run the pip install package step if they are required. I had to update the locustcmd value to /opt/venv/bin/locust. But when starting the container it fails. We need an option to specify parameters for pip too.

Error on container on start:

ERROR: Can not perform a '--user' install. User site-packages are not visible in this virtualenv.

[stable/Cachet] Unable to set imagePullSecret via Helm Chart

imagePullSecret is not a valid entry in values.yaml for Cachet's Helm chart

503 Service Temporarily Unavailable on https://charts.deliveryhero.io/

This is breaking us. Please fix!

memory and cpu limit/request needed for node-problem-detector chart

b64enc function passed multiple arguments

Hi @max-rocket-internet. This is related to my previous PR. I noticed that an error is thrown because the b64enc method is being passed multiple arguments, when really the results of the cat function should be piped to it. I made the change, created the PR and checks are passing:

#201

Thanks

[Node-problem-detector] Streaming of warnings in namespace events

I tried deploying node-problem-detector using helm chart and a custom values.yaml file but I got these warnings in the events

`87s Warning FailedMount pod/node-problem-detector-crcn4 MountVolume.SetUp failed for volume "custom-config" : failed to sync configmap cache: timed out waiting for the condition

87s Warning FailedMount pod/node-problem-detector-crcn4 MountVolume.SetUp failed for volume "node-problem-detector-token-wqv6s" : failed to sync secret cache: timed out waiting for the condition

87s Warning FailedMount pod/node-problem-detector-zvqfc MountVolume.SetUp failed for volume "custom-config" : failed to sync configmap cache: timed out waiting for the condition

85s Warning FailedToUpdateEndpoint endpoints/node-problem-detector Failed to update endpoint node-problem-detector/node-problem-detector: Operation cannot be fulfilled on endpoints "node-problem-detector": the object has been modified; please apply your changes to the latest version and try again`

For the first three events I think this happened because the service account doesn't have permission to mount secrets or configMaps but I don't understand the reason behind the last event.

Distribute workers evenly across nodes

Hi there I am testing with cluster that starts with 100 nodes and with 200 workers. I noticed the workers do not evenly distribute across the nodes so I made change below to work deployment's template.spec. What do you think about incorporating this change as a parameter like worker.topologySpreadConstraints.enabled with default value false?

worker-deployment.yaml 
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              type: worker

[stable/locust] Is that a problem?

Thank you for your wonderful job.
I have a problem when I deploy this helm-locust:

My Environment

AWS EKS a cluster with two nodes
EC2 instance type node: m5a.4xlarge
Kubernetes version: 1.8
ALB: aws-alb-controller

What I wan to ask

Here is the log of the slave, I do not know waht's wrong with my setting

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/locust/util/exception_handler.py", line 13, in wrapper
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/locust/rpc/zmqrpc.py", line 22, in send
    raise RPCError("ZMQ sent failure") from e
locust.exception.RPCError: ZMQ sent failure
[2021-04-16 04:55:45,511] locust-worker-f968676b4-s7jsf/INFO/locust.util.exception_handler: Exception found on retry 3: -- retry after 5s
[2021-04-16 04:55:45,512] locust-worker-f968676b4-s7jsf/ERROR/locust.util.exception_handler: ZMQ sent failure
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/locust/rpc/zmqrpc.py", line 20, in send
    self.socket.send(msg.serialize(), zmq.NOBLOCK)
  File "/usr/local/lib/python3.8/site-packages/zmq/green/core.py", line 193, in send
    msg = super(_Socket, self).send(data, flags, copy, track, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/zmq/sugar/socket.py", line 491, in send
    return super(Socket, self).send(data, flags=flags, copy=copy, track=track)
  File "zmq/backend/cython/socket.pyx", line 720, in zmq.backend.cython.socket.Socket.send
  File "zmq/backend/cython/socket.pyx", line 767, in zmq.backend.cython.socket.Socket.send
  File "zmq/backend/cython/socket.pyx", line 247, in zmq.backend.cython.socket._send_copy
  File "zmq/backend/cython/socket.pyx", line 242, in zmq.backend.cython.socket._send_copy
  File "zmq/backend/cython/checkrc.pxd", line 22, in zmq.backend.cython.checkrc._check_rc
zmq.error.Again: Resource temporarily unavailable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/locust/util/exception_handler.py", line 13, in wrapper
    return function(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/locust/rpc/zmqrpc.py", line 22, in send
    raise RPCError("ZMQ sent failure") from e
locust.exception.RPCError: ZMQ sent failure
[2021-04-16 04:55:48,432] locust-worker-f968676b4-s7jsf/INFO/locust.util.exception_handler: Retry failed after 3 times.
[2021-04-16 04:55:48,432] locust-worker-f968676b4-s7jsf/ERROR/locust.runners: RPCError found when sending heartbeat: ZMQ sent failure
[2021-04-16 04:55:48,432] locust-worker-f968676b4-s7jsf/INFO/locust.runners: Reset connection to master
[2021-04-16 04:55:48,433] locust-worker-f968676b4-s7jsf/ERROR/locust.runners: RPCError found when receiving from master: ZMQ network broken
[2021-04-16 04:55:50,513] locust-worker-f968676b4-s7jsf/INFO/locust.util.exception_handler: Retry failed after 3 times.
[2021-04-16 04:55:50,513] locust-worker-f968676b4-s7jsf/ERROR/locust.runners: Temporary connection lost to master server: ZMQ sent failure, will retry later.

Here is detail of my node and locust-master/locust-worker

ADINESS GATES	NODE
locust-master-5f97c65754-lhd9l	ip-192-168-128-51.ap-northeast-1.compute.internal
locust-worker-f968676b4-s7jsf	ip-192-168-224-255.ap-northeast-1.compute.internal

Detail about my Cluster

root@4ba2d7112384:/src/eks/ansible_code_deploy/locust-cluster/playbook# k get all,cm
NAME                                 READY   STATUS    RESTARTS   AGE
pod/locust-master-5f97c65754-lhd9l   1/1     Running   0          4h45m
pod/locust-worker-f968676b4-s7jsf    1/1     Running   0          4h45m

NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                        AGE
service/kubernetes   ClusterIP   10.100.0.1       <none>        443/TCP                                        5h25m
service/locust       NodePort    10.100.221.232   <none>        5557:30798/TCP,5558:31057/TCP,8089:30481/TCP   4h45m

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/locust-master   1/1     1            1           4h45m
deployment.apps/locust-worker   1/1     1            1           4h45m

NAME                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/locust-master-5f97c65754   1         1         1       4h45m
replicaset.apps/locust-worker-f968676b4    1         1         1       4h45m

NAME                                         REFERENCE                  TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/locust   Deployment/locust-worker   <unknown>/40%   1         10        1          4h45m

NAME                                                             SERVICE-NAME   SERVICE-PORT   TARGET-TYPE   AGE
targetgroupbinding.elbv2.k8s.aws/k8s-default-locust-94b94d80e9   locust         8089           ip            4h34m

NAME                               DATA   AGE
configmap/locust-config            1      4h45m
configmap/locust-lib               0      4h45m
configmap/my-loadtest-locustfile   1      4h45m

Add namespace support

The chart as built today does not allow a user to decide the namespace the chart should be deployed in.

This should be added as part of the metadata to support different namespace then the default namespace

[stable/listmonk] run upgrade job

https://github.com/deliveryhero/helm-charts/blob/master/stable/listmonk/templates/init-db.yaml#L23

2022/04/07 06:22:47 upgrade.go:102: there are 1 pending database upgrade(s): [v2.1.0]. The last upgrade was v2.0.0. Backup the database and run listmonk --upgrade

when upgrading the image tag this happens, I would suggest to also run the init script with another listmonk --upgrade command as it, per doc, should be idempotent.

[stable/locust] Passing arguments to the master/worker docker-entrypoint.sh scripts

I have args that I want to pass to the docker-entrypoint.sh scripts ran on both the master and worker pods.

Can either the configmap-config.yaml or the worker/master-deployment.yaml files be updated to support this?

Or is the intent that the command value can be overridden somehow.

Extra command args for master + Using basic auth causes failure for readinessProbe

Hello,

I noticed a couple items when using this chart, let me know if they should be broken up into two separate issues, but for now...

in the values.yaml, when I have args_include_default: true, and I want to include extra arguments in args: [], it ends up removing the default arguments when I really expected it to add to them. I can see in the template how the if-condition is causing that, but wondering now if that is intended behaviour, or instead should be changed so that args are added above the default args.
When I enabled basic authentication on the UI, which I performed just by passing a web-auth arg to master, the deployment failed due to
Readiness probe failed: HTTP probe failed with statuscode: 401

Now, of course one can enable basic authentication on a readiness probe, but where our chart is basically an umbrella chart pointing to this one, I don't have as much control. It would be nice if we could enable basic auth more concretely through the values.yaml file, and the template know when it is enabled and adjust the readinessProbe accordingly. Let me know if there's another way to get this working.

Thanks

After adding "-headless", workers cannot connect to locust master.

Hi,

I have deployed this helm chart in AKS cluster. And it works very well. Here is my deployment command in Azure cloud shell:

helm install my-loadtest-release ./ \
  --set worker.hpa.enabled=true \
  --set worker.replicas=3 \
  --set service.type=LoadBalancer

(Current directory is /helm-charts/stable/locust, and it worked very well with the webpage-mode.)

However, after I simply added "--headless" in master-deployment.yaml, the workers cannot connect to locust master. Here is where I changed:

{{ else if .Values.master.args_include_default }}
        args:
          - --master
          - --headless
          - --locustfile=/mnt/locust/{{ .Values.loadtest.locust_locustfile }}
          - --host={{ .Values.loadtest.locust_host }}
          - --loglevel={{ .Values.master.logLevel }}

I just added one line "- --headless", and all connection from worker to master failed:

What I have tried:

I have tried combination of "--run-time", "--spawn-rate", "--run-time", "--expected-workers", all end up with same results.
I have tried changing service name from {{ template "locust.fullname" . }} to {{ template "locust.fullname" . }}-master, same with master's name. Still the same.
I have tried to manully bind master-port using "--master-bind-port" in master-deployment.yaml, and "--master-host", "--master-port" in worker-deployment.yaml" (In all kinds of combinations). Still the same.
I have tried to use Environment variables to do all mentioned changes, and get the same results.

I think it is neccessary to be able to use headless mode for automated load test. So much appreciated if anyone gets curious and help with this issue. Thank you!

My working environments:

Resource: An AKS (Azure Kubernetes Service) Cluster
Command portal: Azure Cloud Shell

Support topologySpreadConstraints in overprovisioner deployments

https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/

Mount a file as a volume mount

I have a file which is greater than 3MB. I need this file as an input for my locust. I need a placeholder to mount this file but not as a config map as the size is big.

Worker not connecting to the master

Hi, I am facing the issue below when trying to deploy the chart on Kubernetes (EKS). Any idea on how I can resolve this

[2021-05-03 10:05:38,099] locust-master-694777b9c5-sq44k/ERROR/locust.runners: RPCError found when receiving from client: ZMQ interrupted message
[2021-05-03 10:05:38,434] locust-master-694777b9c5-sq44k/INFO/locust.runners: Reset connection to worker
[2021-05-03 10:05:39,435] locust-master-694777b9c5-sq44k/INFO/locust.runners: Reset connection to worker
[2021-05-03 10:05:40,436] locust-master-694777b9c5-sq44k/INFO/locust.runners: Reset connection to worker
[2021-05-03 10:05:41,437] locust-master-694777b9c5-sq44k/INFO/locust.runners: Reset connection to worker
[2021-05-03 10:05:42,439] locust-master-694777b9c5-sq44k/INFO/locust.runners: Reset connection to worker

[stable/node-problem-detector] "permission denied" when using script in custom_monitor_definitions

With this values.yaml to add a custom plugin :

values.yaml

settings:
  prometheus_address: 0.0.0.0
  prometheus_port: 20257

  custom_monitor_definitions:
    drainme.sh: |
      #!/bin/bash
      set -euo pipefail

      echo "Checking commands..."
      for cmd in curl jq
      do
        if ! command -v $cmd &> /dev/null
        then
          echo "installing $cmd..."
          apt update -qq >/dev/null 2>&1 && apt install -y $cmd -qq >/dev/null 2>&1
        fi
      done

      # Point to the internal API server hostname
      APISERVER=https://kubernetes.default.svc

      # Path to ServiceAccount token
      SERVICEACCOUNT=/var/run/secrets/kubernetes.io/serviceaccount

      # Read this Pod's namespace
      NAMESPACE=$(cat ${SERVICEACCOUNT}/namespace)

      # Read the ServiceAccount bearer token
      TOKEN=$(cat ${SERVICEACCOUNT}/token)

      # Reference the internal certificate authority (CA)
      CACERT=${SERVICEACCOUNT}/ca.crt

      # Call node API with NODE_NAME
      echo "Checking current node = $NODE_NAME..."
      drainme=$(curl -s --cacert ${CACERT} --header "Authorization: Bearer ${TOKEN}" -X GET ${APISERVER}/api/v1/nodes/${NODE_NAME} | jq -r '.metadata.labels.drainme')

      if [[ "$drainme" == "true" ]]
      then
        echo "Drain requested"
        exit 1
      fi
      echo "No drain needed"
      exit 0
    drainme.json: |
      {
        "plugin": "custom",
        "pluginConfig": {
          "invoke_interval": "10s",
          "timeout": "3m",
          "max_output_length": 80,
          "concurrency": 1
        },
        "source": "drainme-custom-plugin-monitor",
        "conditions": [
          {
            "type": "DrainRequest",
            "reason": "NoDrain",
            "message": "No drain"
          }
        ],
        "rules": [
          {
            "type": "permanent",
            "condition": "DrainRequest",
            "reason": "DrainMe",
            "path": "/custom-config/drainme.sh"
          }
        ]
      }
  custom_plugin_monitors:
  - /custom-config/drainme.json

and this installation process :

helm template npd deliveryhero/node-problem-detector \
      --version  \
      --namespace "2.2.1" \
      --set image.tag="v0.8.10" \
      --values values.yaml \
| kubectl apply -f -

the resulting NPD pods logs contain error message because the custom plugin script has no execution permission :
Error in starting plugin "/custom-config/drainme.sh": error - fork/exec /custom-config/drainme.sh: permission denied

Full pod log

I0505 18:39:35.316142       1 log_monitor.go:79] Finish parsing log monitor config file /config/kernel-monitor.json: {WatcherConfig:{Plugin:kmsg PluginConfig:map[] LogPath:/dev/kmsg Lookback:5m Delay:} BufferSize:10 Source:kernel-monitor DefaultConditions:[{Type:KernelDeadlock Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:KernelHasNoDeadlock Message:kernel has no deadlock} {Type:ReadonlyFilesystem Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:FilesystemIsNotReadOnly Message:Filesystem is not read-only}] Rules:[{Type:temporary Condition: Reason:OOMKilling Pattern:Killed process \d+ (.+) total-vm:\d+kB, anon-rss:\d+kB, file-rss:\d+kB.*} {Type:temporary Condition: Reason:TaskHung Pattern:task [\S ]+:\w+ blocked for more than \w+ seconds\.} {Type:temporary Condition: Reason:UnregisterNetDevice Pattern:unregister_netdevice: waiting for \w+ to become free. Usage count = \d+} {Type:temporary Condition: Reason:KernelOops Pattern:BUG: unable to handle kernel NULL pointer dereference at .*} {Type:temporary Condition: Reason:KernelOops Pattern:divide error: 0000 \[#\d+\] SMP} {Type:temporary Condition: Reason:Ext4Error Pattern:EXT4-fs error .*} {Type:temporary Condition: Reason:Ext4Warning Pattern:EXT4-fs warning .*} {Type:temporary Condition: Reason:IOError Pattern:Buffer I/O error .*} {Type:temporary Condition: Reason:MemoryReadError Pattern:CE memory read error .*} {Type:permanent Condition:KernelDeadlock Reason:AUFSUmountHung Pattern:task umount\.aufs:\w+ blocked for more than \w+ seconds\.} {Type:permanent Condition:KernelDeadlock Reason:DockerHung Pattern:task docker:\w+ blocked for more than \w+ seconds\.} {Type:permanent Condition:ReadonlyFilesystem Reason:FilesystemIsReadOnly Pattern:Remounting filesystem read-only}] EnableMetricsReporting:0xc00046071e}
I0505 18:39:35.316349       1 log_watchers.go:40] Use log watcher of plugin "kmsg"
I0505 18:39:35.316630       1 log_monitor.go:79] Finish parsing log monitor config file /config/docker-monitor.json: {WatcherConfig:{Plugin:journald PluginConfig:map[source:dockerd] LogPath:/var/log/journal Lookback:5m Delay:} BufferSize:10 Source:docker-monitor DefaultConditions:[{Type:CorruptDockerOverlay2 Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:NoCorruptDockerOverlay2 Message:docker overlay2 is functioning properly}] Rules:[{Type:temporary Condition: Reason:CorruptDockerImage Pattern:Error trying v2 registry: failed to register layer: rename /var/lib/docker/image/(.+) /var/lib/docker/image/(.+): directory not empty.*} {Type:permanent Condition:CorruptDockerOverlay2 Reason:CorruptDockerOverlay2 Pattern:returned error: readlink /var/lib/docker/overlay2.*: invalid argument.*} {Type:temporary Condition: Reason:DockerContainerStartupFailure Pattern:OCI runtime start failed: container process is already dead: unknown}] EnableMetricsReporting:0xc00046109a}
I0505 18:39:35.316664       1 log_watchers.go:40] Use log watcher of plugin "journald"
I0505 18:39:35.316943       1 custom_plugin_monitor.go:81] Finish parsing custom plugin monitor config file /custom-config/drainme.json: {Plugin:custom PluginGlobalConfig:{InvokeIntervalString:0xc0003ecda0 TimeoutString:0xc0003ecdb0 InvokeInterval:10s Timeout:3m0s MaxOutputLength:0xc000461740 Concurrency:0xc000461750 EnableMessageChangeBasedConditionUpdate:0x223ebfd} Source:drainme-custom-plugin-monitor DefaultConditions:[{Type:DrainRequest Status: Transition:0001-01-01 00:00:00 +0000 UTC Reason:NoDrain Message:No drain}] Rules:[0xc000673490] EnableMetricsReporting:0x219a95b}
I0505 18:39:35.318331       1 k8s_exporter.go:54] Waiting for kube-apiserver to be ready (timeout 5m0s)...
I0505 18:39:35.411141       1 node_problem_detector.go:63] K8s exporter started.
I0505 18:39:35.411244       1 node_problem_detector.go:67] Prometheus exporter started.
I0505 18:39:35.411257       1 log_monitor.go:111] Start log monitor /config/kernel-monitor.json
I0505 18:39:35.411306       1 log_monitor.go:111] Start log monitor /config/docker-monitor.json
I0505 18:39:35.414268       1 log_watcher.go:80] Start watching journald
I0505 18:39:35.414296       1 custom_plugin_monitor.go:112] Start custom plugin monitor /custom-config/drainme.json
I0505 18:39:35.414311       1 problem_detector.go:76] Problem detector started
I0505 18:39:35.411712       1 log_monitor.go:235] Initialize condition generated: [{Type:KernelDeadlock Status:False Transition:2022-05-05 18:39:35.411686514 +0000 UTC m=+0.171292006 Reason:KernelHasNoDeadlock Message:kernel has no deadlock} {Type:ReadonlyFilesystem Status:False Transition:2022-05-05 18:39:35.411686668 +0000 UTC m=+0.171292135 Reason:FilesystemIsNotReadOnly Message:Filesystem is not read-only}]
I0505 18:39:35.414754       1 custom_plugin_monitor.go:296] Initialize condition generated: [{Type:DrainRequest Status:False Transition:2022-05-05 18:39:35.414740692 +0000 UTC m=+0.174346186 Reason:NoDrain Message:No drain}]
I0505 18:39:35.414817       1 log_monitor.go:235] Initialize condition generated: [{Type:CorruptDockerOverlay2 Status:False Transition:2022-05-05 18:39:35.414806985 +0000 UTC m=+0.174412467 Reason:NoCorruptDockerOverlay2 Message:docker overlay2 is functioning properly}]
E0505 18:39:35.501582       1 plugin.go:164] Error in starting plugin "/custom-config/drainme.sh": error - fork/exec /custom-config/drainme.sh: permission denied
I0505 18:39:35.501697       1 custom_plugin_monitor.go:276] New status generated: &{Source:drainme-custom-plugin-monitor Events:[{Severity:info Timestamp:2022-05-05 18:39:35.501637077 +0000 UTC m=+0.261242614 Reason:NoDrain Message:Node condition DrainRequest is now: Unknown, reason: NoDrain}] Conditions:[{Type:DrainRequest Status:Unknown Transition:2022-05-05 18:39:35.501637077 +0000 UTC m=+0.261242614 Reason:NoDrain Message:Error in starting plugin. Please check the error log}]}
E0505 18:39:45.418820       1 plugin.go:164] Error in starting plugin "/custom-config/drainme.sh": error - fork/exec /custom-config/drainme.sh: permission denied
E0505 18:39:55.418852       1 plugin.go:164] Error in starting plugin "/custom-config/drainme.sh": error - fork/exec /custom-config/drainme.sh: permission denied
[...]

If I edit the daemonset manifest to add defaultMode: 0755, it is working well.

- name: custom-config
     configMap:
       name: npd-node-problem-detector-custom-config
       defaultMode: 0755

How the chart is supposed to handle this permission without this defaultMode please ?
Is is possible to add this execution permission on the plugin script with the chart custom plugin configuration ?

Created wrongly

Source for images

Hi,
I'm trying to find the source code for the image 940776968316.dkr.ecr.eu-west-1.amazonaws.com/deliveryhero/kubecost-reports-exporter.
Is that open sourced anywhere?
Thanks!

locust: installing extensions?

This chart will also create configmaps for storing the locust files in Kubernetes, this way there is no need to build custom docker images.

How can I install extensions for a deployed locust instance?

Reference(s):

[stable/cluster-overprovisioner] Definitions of the PodDisruptionBudget in the Helm Chart

I think the PodDisruptionBudget is necessary to evict cluster-overprovisioner pods normally.
So, I would like to define the PodDisruptionBudget in the Helm Chart and add a parameter which controls if Helm Chart creates PodDisruptionBudget or not.

[node-problem-detector] Should mount /dev/kmsg by default

kernel-monitor.json references /dev/kmsg:
https://github.com/kubernetes/node-problem-detector/blob/e7fe0b20dc8279130d41218d12802d85d6730728/config/kernel-monitor.json#L3

However, it is not mounted by default to the container.

Our workaround is to add the following to values.yaml:

extraVolumes:
  - name: kmsg
    hostPath:
      path: /dev/kmsg

extraVolumeMounts:
  - name: kmsg
    mountPath: /dev/kmsg
    readOnly: true

Getting "chart not found" error when deploying cluster-overprovisioner chart through terraform

I am trying to update a cluster-overprovisioner helm release so that it pulls the chart from deliveryhero. The current release is configured to pull an old chart version "0.2.6" from "https://kubernetes-charts.storage.googleapis.com". I have changed the helm_release resource so that it pulls the overprovisioner chart from deliveryhero instead:

resource "helm_release" "splunk_overprovisioner" {
  name          = "splunk-nodes-overprovisioner"
  chart         = "deliveryhero/cluster-overprovisioner"
  repository    = "https://charts.deliveryhero.io/"
  version.         = "0.4.2"
  namespace     = kubernetes_namespace.splunk.metadata[0].name
  force_update  = true
  recreate_pods = true
  values = [
    <<END_OF_VALUES

# This is set to the same values as ../k8s/splunk*/values.yaml
resources:
  requests:
    cpu: 2
    memory: 2Gi

tolerations:
- key: splunk
  operator: Equal
  value: "true"
  effect: NoSchedule

replicaCount: 1

END_OF_VALUES
    ,
  ]
}

Terraform returns an error during apply:
Error: chart "deliveryhero/cluster-overprovisioner" version "0.4.2" not found in https://charts.deliveryhero.io/ repository

If I manually add the repo and search via CLI, I see the chart and version are the same as what I entered in the helm_resource:

$ helm3 repo add deliveryhero https://charts.deliveryhero.io/
$ helm3 search repo deliveryhero
NAME                                             	CHART VERSION	APP VERSION	DESCRIPTION
deliveryhero/cluster-overprovisioner             	0.4.2        	1.0        	This chart provide a buffer for cluster autosca...

Terraform version 0.12.26
Terraform Helm provider version: 1.3.0

[pg-repack-scheduler] Keep Looping

The logic on .sh file keeps looping around the same table and won't go to the next. It keeps adding J to the pg_repack command for infinite loop.

[stable/k8s-resources] allow Secret type other than Opaque

I'm using k8s-resources to create kubernetes secrets for access to a private registry. The Secrets created default to the type Opaque but they need to be of type kubernetes.io/dockerconfigjson.

I have submitted pull request 318 to address this issue.

[stable/wiremock] avoid deprecated ingress api

avoid deprecated API in:
https://github.com/deliveryhero/helm-charts/blob/master/stable/wiremock/templates/ingress.yaml#L4
and move to versions supported by recent k8s versions.

[Node-problem-detector] helm template --namespace <NAMESPACE> does not set metadata.namespace on daemonset resource yaml

Issue

I'm using the helm template command to generate the resource yamls for node-problem-detector.
But helm template --namespace <NAMESPACE> ... does not set metadata.namespace field on the daemonset yaml due to which daemonset pods are getting installed in the default namespace.

To reproduce the issue:

1.  helm repo add deliveryhero https://charts.deliveryhero.io/ && helm fetch deliveryhero/node-problem-detector --untar=true
2. cd node-problem-detector/
3. helm template npd . --namespace npd > node-problem-detector.yaml
4. Inspect the manifests. Daemonset yaml doesn't have a namespace field

Reason

This is because the daemonset template yaml is missing metadata.namespace configuration field.

FIX

To fix this, we just need to add the metadata.namespace configuration field in the resource yaml.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: {{ include "node-problem-detector.fullname" . }}
  namespace: {{ .Release.Namespace }}
 . . .

Not able to deploy cachet any version on k8s

I am trying to deploy the cachet 2.3.15+ version on k8s but getting the following error for deployment component

can't create temp file '/var/www/html/.envXXXXXX': Read-only file system

Could some one help is there any solution for this as we have tried mounting the file system, init container etc but nothing worked out.

Concurrent request upper bound

Hi there

Thank you for this contribution to the community. I would like to scale to 10 million concurrent requests from a single Kubernetes cluster. Curious if you all have data on concurrent request upper bound or typical Kubernetes configs you use with this chart

Cheers,
Stew

Could not find any locustfile! Ensure file ends in '.py'

Hello!

I follow the instructions in README.
Created configmaps from my local files (validated that they were created successfully):
kubectl create configmap my-loadtest-locustfile --from-file locustfile.py
kubectl create configmap my-loadtest-lib --from-file lib

This is the content of locustfile.py:

from locust import HttpUser, task, between

class QuickstartUser(HttpUser):
    wait_time = between(1, 2.5)

    @task
    def call_a(self):
        self.client.get("/a")

The contents or lib folder were copied from the example.

Then I install Helm chart with the command from README:

helm install locust deliveryhero/locust \
   --set loadtest.name=my-loadtest \
   --set loadtest.locust_locustfile_configmap=my-loadtest-locustfile \
   --set loadtest.locust_lib_configmap=my-loadtest-lib

After this I see the following error in master and worker pods:
Could not find any locustfile! Ensure file ends in '.py' and see --help for available options.

Please help :-)
I am just the README steps and it is not working...

[stable/node-problem-detector] Policy v1beta1 depreciation fail

This commit 0f7f891 broke on EKS v1.224 where policy v1 does not contain PSP.

[stable/locust] Installation gets stuck at ContainerCreating

I tried to install the Locust from stable/locust. However, The installation gets stuck every time with pods getting stuck in ContainerCreating state.

 ~/helm-chart | master ?2  k logs -f locust-release-master-5d46dbd476-q4lkp                                                                                                          ok | 4s | k8-staging kube
Error from server (BadRequest): container "locust" in pod "locust-release-master-5d46dbd476-q4lkp" is waiting to start: ContainerCreating

  Normal   Scheduled    29m                   default-scheduler  Successfully assigned default/locust-release-master-5d46dbd476-q4lkp to gke-k8-staging-k8-staging-pool-9b8dd468-16k0
  Warning  FailedMount  24m                   kubelet            Unable to attach or mount volumes: unmounted volumes=[locustfile lib], unattached volumes=[config locust-release-master-token-75pk7 locustfile lib]: timed out waiting for the condition
  Warning  FailedMount  19m (x13 over 29m)    kubelet            MountVolume.SetUp failed for volume "locustfile" : configmap "my-loadtest-locustfile" not found
  Warning  FailedMount  13m (x4 over 27m)     kubelet            Unable to attach or mount volumes: unmounted volumes=[locustfile lib], unattached volumes=[locustfile lib config locust-release-master-token-75pk7]: timed out waiting for the condition
  Warning  FailedMount  9m7s (x2 over 22m)    kubelet            Unable to attach or mount volumes: unmounted volumes=[locustfile lib], unattached volumes=[locust-release-master-token-75pk7 locustfile lib config]: timed out waiting for the condition
  Warning  FailedMount  2m44s (x21 over 29m)  kubelet            MountVolume.SetUp failed for volume "lib" : configmap "my-loadtest-lib" not found

Initially I tried creating configMap before running helm install. Later just installed chart without configmap(assuming it should take the default locustfile). Both the time installation failed.

P.S: I'm using GKE cluster.

Maybe update the default locust version?

Latest release is now 2.1 :) @max-rocket-internet

[stable/locust] Using an Ingress with the v0.9.10 returns 404

Hello,

I'm installing this chart on one of our clusters. Unfortunately, when enabling the ingress the ingress controller returns the standard NGINX 404 Not Found error page, rather than routing correctly to the service. Removing the asterisk from the path fixes it. I was hoping there was an ingress.hosts[].path variable I could set in the values file, but it is hardcoded in the template.

The path was changed from / to /* in #127, and I see it was changed in order to fix a bug related to static files.

I'm pretty new to helm and I'm not sure if this is a problem with the chart or with the cluster. We definitely have used asterisks in paths before, on other clusters. This cluster is new to me, but the thing I know is different from the others is that it has 2 Ingress Controllers. (I also had to set the kubernetes.io/ingress.class annotation for it to work at all.)

For now, I can just install v0.9.9 and it works fine. But I was hoping it could be made configurable, or perhaps just provide some pointers to determine whether it's a cluster problem.

If you decide to make it configurable, I'm happy to submit a PR.

Please let me know if you need additional information.

Thank you.

[FEATURE] Add support for ingressClassName in locust helm chart

Is your feature request related to a problem? Please describe.
kubernetes.io/ingress.class annotation is progressively replaced with ingressClassName in the ingress as of v1.18. See this related blog post.

Describe the solution you'd like
Support for ingressClassName in helm chart

See related issue and merged code

Prometheus Integration with Node-Problem-Dectector which IP to assign for Prometheus Exporter?

Hi Guys please throw some light here...I am trying to integrate Prometheus with Node-Problem-Dectector and want to know which IP to assign to Prometheus Exporter in values.yaml ?

Applying Prometheus Node Exporter IP gives error that IP Cannot be binded. May I know what IP to assign for Prometheus Exporter. Please Help Guys.

Helm repo yields 404 for http head requests

Hey guys,
Background: trying to setup Artifactory as a proxy for https://charts.deliveryhero.io/

However HTTP HEAD requests for https://charts.deliveryhero.io/ and https://charts.deliveryhero.io/index.yaml yield 404.
Normal GET method works fine obviously. Might be blocked by CloudFlare.

This seems to make the repo not being recognised properly by Artifactory. I'd appreciate help enabling it.

Thanks,
Bart

$ curl https://charts.deliveryhero.io/
<!DOCTYPE html>
<html>
<head>
<title>Welcome to ChartMuseum!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to ChartMuseum!</h1>
<p>If you see this page, the ChartMuseum web server is successfully installed and
working.</p>

<p>For online documentation and support please refer to the
<a href="https://github.com/helm/chartmuseum">GitHub project</a>.<br/>

<p><em>Thank you for using ChartMuseum.</em></p>
</body>
</html>

$ curl -I https://charts.deliveryhero.io/
HTTP/2 404
date: Tue, 22 Jun 2021 15:09:23 GMT
content-type: application/json; charset=utf-8
content-length: 22
access-control-allow-credentials: true
access-control-allow-headers: DNT,X-CustomHeader,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization
access-control-allow-methods: GET, PUT, POST, DELETE, PATCH, OPTIONS
access-control-allow-origin: https://requests.syslogistics.io
x-request-id: 493555779594cc422da0dad290eef56c
cf-cache-status: DYNAMIC
cf-request-id: 0ad5dde85e0000cc7ff98b9000000001
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 663665ba2eedcc7f-WAW

$ curl -I https://charts.deliveryhero.io/index.yaml
HTTP/2 404
date: Tue, 22 Jun 2021 15:09:30 GMT
content-type: application/json; charset=utf-8
content-length: 22
access-control-allow-credentials: true
access-control-allow-headers: DNT,X-CustomHeader,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization
access-control-allow-methods: GET, PUT, POST, DELETE, PATCH, OPTIONS
access-control-allow-origin: https://requests.syslogistics.io
x-request-id: b26b802303966d4527a5207cfa4cbba4
cf-cache-status: DYNAMIC
cf-request-id: 0ad5de05140000ffd0ed32a000000001
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 663665e81f75ffd0-WAW

Latest chart releases sometimes aren't availabe in repo

I'm seeing strange behavior with this chart repo where latest chart releases are available for some time, but then gone.

For example at the moment of writing I don't see latest node-problem-detector release

% helm pull deliveryhero/node-problem-detector --version 2.2.3
Error: chart "node-problem-detector" matching 2.2.3 not found in deliveryhero index. (try 'helm repo update'): no chart version found for node-problem-detector-2.2.3

In the same time I can successfully download previous version 2.2.2.

Also observing similar behavior with cluster-overprovisioner.

[stable/locust] WebAuth Basic64 encoding is incorrect

When master.auth.enabled=true, master-deployment.yaml template generates base64 {username}:{password} string here.
https://github.com/deliveryhero/helm-charts/blob/master/stable/locust/templates/master-deployment.yaml#L138

It uses cat function, but it inserts white spaces between strings.
https://masterminds.github.io/sprig/strings.html#cat

Given

master
  auth:
    enabled: true
    username: "myuser"
    password: "mypassword"

BAD

value: Basic {{ cat .Values.master.auth.username ":" .Values.master.auth.password | b64enc }}
  -> it generates base64enc from "myuser : mypassword" ( white spaced )

GOOD

value: Basic {{ printf "%s:%s" .Values.master.auth.username .Values.master.auth.password | b64enc }}
  -> it generates base64enc from "myuser:mypassword" ( no white spaced )

Cannot use helm chart with CLI parameters

I am trying to test various values from CLI using the following command:

helm install overprovisioner deliveryhero/cluster-overprovisioner --set deployments[0].resources.requests.cpu="2000m"

and I am getting the following error:

Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: [unknown object type "nil" in Deployment.metadata.labels.app.cluster-overprovisioner/deployment, unknown object type "nil" in Deployment.spec.selector.matchLabels.app.cluster-overprovisioner/deployment, unknown object type "nil" in Deployment.spec.template.metadata.labels.app.cluster-overprovisioner/deployment]

A command without any parameters work well:

helm install overprovisioner deliveryhero/cluster-overprovisioner

Any hint what is wrong ?

[stable/locust] Artifacthub.io's Image Security Rating is F for this chart

This might be a locustio/locust issue since this chart is using the canonical locustio docker image.
https://artifacthub.io/packages/helm/deliveryhero/locust

It is a cause for concern when using this chart, but the other locust charts on the site don't show the "image security rating".

[cluster-overprovisioner] Helm chart doesn't honor namespace

The cluster-overprovisioner helm chart doesn't honor the namespace passed in to helm. This is because the chart is missing the typical metadata configuration on its kubernetes resources.

To reproduce this problem:

helm fetch deliveryhero/cluster-overprovisioner --version 0.7.2 --untar=true
helm template my-release cluster-overprovisioner --namespace overprovisioner > cluster-overprovisioner.yaml
Inspect the manifests output to the cluster-overprovisioner.yaml file. Note that none of them are configured to use the "overprovisioner" namespace passed to helm in step 2.

The fix for this is pretty simple. This chart's resources should be configured to honor the Helm release namespace. Like so:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "{{ $fullname }}-{{ .name }}"
  namespace: {{ .Release.Namespace }}
  ...

Wiremock documentation outdated

Wiremock documentation seems outdated: https://artifacthub.io/packages/helm/deliveryhero/wiremock

For example:

helm install my-wiremock deliveryhero/wiremock \ --set consumer=my-consumer --set "consumer.stubs.my-service1-stubs=/mnt/my-service1-stubs" \ --set "consumer.stubs.my-service2-stubs=/mnt/my-service2-stubs"

There is no consumer, its consumer.name as per Default Values:

`consumer:
name: example

environment: {}

args_include_default: true

args: []

stubs: {}

initContainer: []

initVolume: []`

Even the files and mapping setting is incorrect. I need this to be updated. Unable to figure out how to load the files and mappings.

Please help.

[stable/locust]: failure to install when auth is enabled

Locust chart version: 0.20.2
When auth is enabled by -

master:
  auth:
    enabled: true
    username: "abc"
    password: "abc"

we are getting an error

Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: ValidationError(Deployment.spec.template.spec.containers[0].args[1]): invalid type for io.k8s.api.core
.v1.Container.args: got "map", expected "string"

@yosri2005

deliveryhero / helm-charts Goto Github PK

helm-charts's People

Contributors

Stargazers

Watchers

Forkers

helm-charts's Issues

My Environment

What I wan to ask

Detail about my Cluster

Issue

Reason

FIX

BAD

GOOD

Recommend Projects

Recommend Topics

Recommend Org

Jobs