
An operator for running distributed k6 tests.

License: Apache License 2.0



(Diagram: data flow)

k6 Operator

grafana/k6-operator is a Kubernetes operator for running distributed k6 tests in your cluster.

Read also the complete tutorial to learn more about how to use this project.

Setup

Prerequisites

The minimal prerequisite for k6-operator is a Kubernetes cluster and access to it with kubectl.

Deploying the operator

Bundle deployment

The easiest way to install the operator is with the bundle:

curl https://raw.githubusercontent.com/grafana/k6-operator/main/bundle.yaml | kubectl apply -f -

The bundle includes default manifests for the k6-operator, including the k6-operator-system namespace and the k6-operator Deployment with the latest tagged Docker image. Customizations can be made on top of this manifest as needed, e.g. with kustomize.
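
For example, a minimal kustomize overlay could download the bundle and layer an image-tag change on top of it. This is only a sketch under the assumption that pinning the image tag is the customization you want; the tag value is illustrative:

# download the bundle next to your kustomization.yaml
curl -Lo bundle.yaml https://raw.githubusercontent.com/grafana/k6-operator/main/bundle.yaml

# kustomization.yaml
resources:
  - bundle.yaml

images:
  # pin the operator image to a specific tag instead of "latest"
  - name: ghcr.io/grafana/k6-operator
    newTag: controller-v0.0.14

# then apply the result:
#   kustomize build . | kubectl apply -f -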

Deployment with Helm

Helm releases of k6-operator are published together with other Grafana Helm charts and can be installed with the following commands:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install k6-operator grafana/k6-operator

Passing additional configuration can be done with values.yaml (example can be found here):

helm install k6-operator grafana/k6-operator -f values.yaml

Complete list of options available for Helm can be found here.

Makefile deployment

In order to install the operator with the Makefile, additional tooling must be installed first: go, kustomize, and kubectl.

A more manual, low-level way to install the operator is by running the command below:

make deploy

This method may be more useful for development of k6-operator, depending on specifics of the setup.

Installing the CRD

The k6-operator includes custom resources called TestRun and PrivateLoadZone. These are installed automatically when you deploy the operator or apply the bundle, but if you want to install them yourself, you can run the command below:

make install

⚠️ The K6 CRD has been superseded by the TestRun CRD and will be deprecated in the future.

Usage

Samples are available in config/samples and e2e/, both for TestRun and for PrivateLoadZone.

Adding test scripts

The operator uses ConfigMaps, volume claims, and LocalFile to serve test scripts to the jobs. To upload your own test script via a ConfigMap, run the following command:

ConfigMap

kubectl create configmap my-test --from-file /path/to/my/test.js

Note: a single ConfigMap is limited to 1048576 bytes (1 MiB). If you need a larger test file, you'll need to use a volumeClaim or a LocalFile instead.

VolumeClaim

There is a sample available in config/samples/k6_v1alpha1_k6_with_volumeClaim.yaml showing how to run a test script from a volumeClaim.

If you have a PVC with the name stress-test-volumeClaim containing your script and any other supporting file(s), you can pass it to the test like this:

spec:
  parallelism: 2
  script:
    volumeClaim:
      name: "stress-test-volumeClaim"
      file: "test.js"

Note: the pods expect to find the script files in the /test/ folder. If a volumeClaim fails, this is the first place to check: the initializer pod does not generate any logs, and when it can't find the file, it terminates with an error. A missing file may therefore not be obvious, so it makes sense to check manually. See #143 for potential improvements.
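
For reference, the PVC referenced above could be created with a manifest like the following minimal sketch; the access mode, storage size, and storage class are placeholders to adjust for your cluster (note also that Kubernetes object names must be lowercase, so you may need to adapt the name in both places):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: stress-test-volumeClaim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

The script files then need to be copied into the volume before the TestRun is applied, for example with kubectl cp through a temporary pod that mounts the claim.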

Example directory structure while using volumeClaim
├── test
│   ├── requests
│   │   ├── stress-test.js
│   ├── test.js

In the above example, test.js imports a function from stress-test.js and they would look like this:

// test.js
import stressTest from "./requests/stress-test.js";

export const options = {
  vus: 50,
  duration: '10s'
};

export default function () {
  stressTest();
}
// stress-test.js
import { sleep, check } from 'k6';
import http from 'k6/http';


export default () => {
  const res = http.get('https://test-api.k6.io');
  check(res, {
    'status is 200': () => res.status === 200,
  });
  sleep(1);
};

LocalFile

There is a sample available in config/samples/k6_v1alpha1_k6_with_localfile.yaml showing how to run a test script that is baked into the Docker image.

Note: use this option only if usage of volumeClaim is restricted in your cluster; otherwise, prefer volumeClaim.
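
A hedged sketch of what such a TestRun might look like, assuming the script's localFile field takes a path to a script already present inside the runner image (check the sample file above for the exact schema of your operator version; the image name is a placeholder):

apiVersion: k6.io/v1alpha1
kind: TestRun
metadata:
  name: k6-local-file-sample
spec:
  parallelism: 2
  script:
    # assumption: localFile points at a script baked into the runner image
    localFile: /test/test.js
  runner:
    # custom image that contains /test/test.js
    image: my-registry/k6-with-script:latest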

Executing tests

Tests are executed by applying the custom resource TestRun to a cluster where the operator is running. The properties of a test run are few, but allow you to control some key aspects of a distributed execution.

# k6-resource.yml

apiVersion: k6.io/v1alpha1
kind: TestRun
metadata:
  name: k6-sample
spec:
  parallelism: 4
  script:
    configMap:
      name: k6-test
      file: test.js
  separate: false
  runner:
    image: <custom-image>
    metadata:
      labels:
        cool-label: foo
      annotations:
        cool-annotation: bar
    securityContext:
      runAsUser: 1000
      runAsGroup: 1000
      runAsNonRoot: true
    resources:
      limits:
        cpu: 200m
        memory: 1000Mi
      requests:
        cpu: 100m
        memory: 500Mi
  starter:
    image: <custom-image>
    metadata:
      labels:
        cool-label: foo
      annotations:
        cool-annotation: bar
    securityContext:
      runAsUser: 2000
      runAsGroup: 2000
      runAsNonRoot: true

The test configuration is applied using

kubectl apply -f /path/to/your/k6-resource.yml

Parallelism

How many instances of k6 you want to create. Each instance will be assigned an equal execution segment. For instance, if your test script is configured to run 200 VUs and parallelism is set to 4, as in the example above, the operator will create four k6 jobs, each running 50 VUs to achieve the desired VU count.
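
Under the hood, each instance is started with k6's execution segment flags; with parallelism: 4, the generated runner commands look roughly like this (a simplified sketch; the real commands include further flags such as --address and --paused):

k6 run --execution-segment=0:1/4   --execution-segment-sequence=0,1/4,2/4,3/4,1 /test/test.js
k6 run --execution-segment=1/4:2/4 --execution-segment-sequence=0,1/4,2/4,3/4,1 /test/test.js
k6 run --execution-segment=2/4:3/4 --execution-segment-sequence=0,1/4,2/4,3/4,1 /test/test.js
k6 run --execution-segment=3/4:1   --execution-segment-sequence=0,1/4,2/4,3/4,1 /test/test.js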

Script

The name of the config map that includes our test script. In the example in the adding test scripts section, this is set to my-test.

Separate

Toggles whether the jobs created need to be distributed across different nodes. This is useful if you're running a test with a really high VU count and want to make sure the resources of each node won't become a bottleneck.

Serviceaccount

If you want to use a custom Service Account, you'll need to pass it to both the starter and runner objects:

apiVersion: k6.io/v1alpha1
kind: TestRun
metadata:
  name: <test-name>
spec:
  script:
    configMap:
      name: "<configmap>"
  runner:
    serviceAccountName: <service-account>
  starter:
    serviceAccountName: <service-account>

Runner

Defines options for the test runner pods (an example follows the list below). This includes:

  • passing resource limits and requests
  • passing in labels and annotations
  • passing in affinity and anti-affinity
  • passing in a custom image
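
For instance, a runner section combining several of these options might look like the following sketch; the label values and anti-affinity rule are illustrative placeholders, not settings the operator requires:

runner:
  image: grafana/k6:latest
  metadata:
    labels:
      team: loadtest
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                team: loadtest
            topologyKey: kubernetes.io/hostname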

Starter

Defines options for the starter pod. This includes:

  • passing in custom image
  • passing in labels and annotations

k6 outputs

k6 Cloud output

k6 supports output to its Cloud with the k6 run --out cloud script.js command. This feature is also available in k6-operator for subscribed users. Note that it supports only parallelism: 20 or less.

To use this option in k6-operator, set the argument in yaml:

# ...
  script:
    configMap:
      name: "<configmap>"
  arguments: --out cloud
# ...

Then, if you installed the operator with the bundle, create a secret with the following command:

kubectl -n k6-operator-system create secret generic my-cloud-token \
    --from-literal=token=<COPY YOUR TOKEN HERE> && kubectl -n k6-operator-system label secret my-cloud-token "k6cloud=token"

Alternatively, if you installed the operator with the Makefile, you can uncomment the cloud output section in config/default/kustomization.yaml and paste your Cloud token there:

# Uncomment this section if you need cloud output and copy-paste your token
secretGenerator:
- name: cloud-token
  literals:
  - token=<copy-paste-token-string-here>
  options:
    annotations:
      kubernetes.io/service-account.name: k6-operator-controller
    labels:
      k6cloud: token

And re-run make deploy.

This is sufficient to run k6 with the Cloud output and default values of projectID and name. For non-default values, extended script options can be used like this:

export let options = {
  // ...
  ext: {
    loadimpact: {
      name: 'Configured k6-operator test',
      projectID: 1234567,
    }
  }
};

Cleaning up between test runs

After completing a test run, you need to clean up the test jobs created. This is done by running the following command:

kubectl delete -f /path/to/your/k6-resource.yml

Multi-file tests

In case your k6 script is split between more than one JS file, you can simply create a configmap with several data entries like this:

kubectl create configmap scenarios-test --from-file test.js --from-file utils.js
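
Both entries end up in the /test/ folder of the runner pods, so test.js can import utils.js with a relative path. A minimal sketch, where the helper name is purely illustrative:

// utils.js
export function randomUser() {
  return { id: Math.floor(Math.random() * 100000) };
}

// test.js
import { randomUser } from "./utils.js";

export default function () {
  const user = randomUser();
  console.log(`running iteration as user ${user.id}`);
}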

If there are too many files to specify manually, pointing kubectl at a folder might be an option:

kubectl create configmap scenarios-test --from-file=./test

Alternatively, you can create an archive with k6:

k6 archive test.js [args]

The above command will create an archive.tar in your current folder, unless the -O option is used to change the name of the output archive. Then it is possible to put that archive into a ConfigMap, similarly to a plain JS script:

kubectl create configmap scenarios-test --from-file=archive.tar

When using an archive, it must additionally be specified in your TestRun YAML:

# ...
spec:
  parallelism: 1
  script:
    configMap:
      name: "crocodile-stress-test"
      file: "archive.tar" # <-- change here

In other words, the file option must be the correct entrypoint for k6 run.

Using extensions

By default, the operator will use grafana/k6:latest as the container image for the test jobs. If you want to use extensions built with xk6, you'll need to create your own image and override the image property on the TestRun Kubernetes resource.

For example, create a Dockerfile with the following content:

# Build the k6 binary with the extension
FROM golang:1.20 as builder

RUN go install go.k6.io/xk6/cmd/xk6@latest
# For our example, we'll add support for output of test metrics to InfluxDB v2.
# Feel free to add other extensions using the '--with ...'.
RUN xk6 build \
    --with github.com/grafana/xk6-output-influxdb@latest \
    --output /k6

# Use the operator's base image and override the k6 binary
FROM grafana/k6:latest
COPY --from=builder /k6 /usr/bin/k6

Build the image based on this Dockerfile by executing:

docker build -t k6-extended:local .

Once the build is completed, push the resulting k6-extended:local image to an image repository accessible to your Kubernetes cluster. We can now use it as follows:

# k6-resource-with-extensions.yml

apiVersion: k6.io/v1alpha1
kind: TestRun
metadata:
  name: k6-sample-with-extensions
spec:
  parallelism: 4
  script:
    configMap:
      name: crocodile-stress-test
      file: test.js
  runner:
    image: k6-extended:local
    env:
      - name: K6_OUT
        value: xk6-influxdb=http://influxdb.somewhere:8086/demo

Note that we are overriding the default image with k6-extended:local and providing the test runner with the environment variable used by our included extension.

Scheduling Tests

While the k6 operator doesn't support scheduling k6 tests directly, the recommended approach is to use a Kubernetes CronJob. The CronJob runs on a schedule and performs a delete followed by an apply of the TestRun object.

Running these tests requires a little more setup; the basic steps are:

  1. Create a configmap of the JS test files (covered above)
  2. Create a configmap of the YAML for the k6 job
  3. Create a service account that lets k6 objects be created and deleted
  4. Create a cron job that deletes and then applies the YAML

Add a configMapGenerator to the kustomization.yaml:

configMapGenerator:
  - name: <test-name>-config
    files:
      - <test-name>.yaml

Then we are going to create a service account for the cron job to use:

This is required to allow the cron job to actually delete and create the k6 objects.

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: k6-<namespace>
rules:
  - apiGroups:
      - k6.io
    resources:
      - testruns
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: k6-<namespace>
roleRef:
  kind: Role
  name: k6-<namespace>
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: k6-<namespace>
    namespace: <namespace>
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: k6-<namespace>

We're going to create a cron job:

# snapshotter.yml
apiVersion: batch/v1 # use batch/v1beta1 only on clusters older than Kubernetes 1.21
kind: CronJob
metadata:
  name: <test-name>-cron
spec:
  schedule: "<cron-schedule>"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: k6-<namespace>
          containers:
            - name: kubectl
              image: bitnami/kubectl
              volumeMounts:
                - name: k6-yaml
                  mountPath: /tmp/
              command:
                - /bin/bash
              args:
                - -c
                - "kubectl delete -f /tmp/<test-name>.yaml; kubectl apply -f /tmp/<test-name>.yaml"
          restartPolicy: OnFailure
          volumes:
            - name: k6-yaml
              configMap:
                name: <test-name>-config

Namespaced deployment

By default, k6-operator watches TestRun and PrivateLoadZone custom resources in all namespaces. But it is possible to configure k6-operator to watch only a specific namespace by setting the WATCH_NAMESPACE environment variable on the operator's Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: k6-operator-controller-manager
  namespace: k6-operator-system
spec:
  template:
    spec:
      containers:
        - name: manager
          image: ghcr.io/grafana/k6-operator:controller-v0.0.14
          env:
            - name: WATCH_NAMESPACE
              value: "some-ns"
# ...

Uninstallation

You can remove all of the resources created by the operator with the bundle:

curl https://raw.githubusercontent.com/grafana/k6-operator/main/bundle.yaml | kubectl delete -f -

Or with the make command:

make delete

Developing Locally

Pre-Requisites

Run Tests

Test Setup

  • make test-setup (only needs to be run once)

Run Unit Tests

  • make test

Run e2e Tests

See also


k6-operator's Issues

Add notification support

The operator should be able to send notifications upon starting and finishing test runs.
To get a lot of options with little work, we should use https://github.com/containrrr/shoutrrr.

Configuration of notifications should be done by deploying a notification CR.
Some kind of draft proposal of how this could look:

apiVersion: k6.io/v1alpha1
kind: K6Notification
metadata:
  name: k6-notification
spec:
  start:
    - some://shoutrrr/url
  finish:
    - some://shoutrrr/url
  error:
    - some://shoutrrr/url

arguments option issue

Hey guys, first of all thank you so much for your work!

I'm trying to execute the operator with the following spec:

apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
spec:
  parallelism: 4
  script: aligator-stress-test
  arguments: --out kafka=brokers=dockerhost:9092,topic=load-output,format=json

however, when running the apply I'm getting the error

time="2021-04-23T20:02:25Z" level=error msg="unknown flag: --out kafka"

Could anyone help me out with this issue?

Error from server (NotFound): error when creating "STDIN": namespaces "k6-operator-system" not found

Hello, was following the nice blog over at https://k6.io/blog/running-distributed-tests-on-k8s/ and stumbled upon the following issue:

karl@Karls-MacBook-Pro operator (main) $ make deploy
go: creating new go.mod: module tmp
go: downloading sigs.k8s.io/controller-tools v0.3.0
go: found sigs.k8s.io/controller-tools/cmd/controller-gen in sigs.k8s.io/controller-tools v0.3.0
go: downloading github.com/spf13/cobra v0.0.5
go: downloading k8s.io/apimachinery v0.18.2
go: downloading github.com/gobuffalo/flect v0.2.0
go: downloading k8s.io/apiextensions-apiserver v0.18.2
go: downloading k8s.io/api v0.18.2
go: downloading gopkg.in/yaml.v3 v3.0.0-20190905181640-827449938966
go: downloading golang.org/x/tools v0.0.0-20190920225731-5eefd052ad72
go: downloading github.com/fatih/color v1.7.0
go: downloading github.com/inconshreveable/mousetrap v1.0.0
go: downloading github.com/mattn/go-isatty v0.0.8
go: downloading github.com/mattn/go-colorable v0.1.2
/Users/karl/go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
cd config/manager && /usr/local/bin/kustomize edit set image controller=ghcr.io/k6io/operator:latest
/usr/local/bin/kustomize build config/default | kubectl apply -f -
namespace/system created
customresourcedefinition.apiextensions.k8s.io/k6s.k6.io created
clusterrole.rbac.authorization.k8s.io/k6-operator-manager-role created
clusterrole.rbac.authorization.k8s.io/k6-operator-metrics-reader created
clusterrole.rbac.authorization.k8s.io/k6-operator-proxy-role created
clusterrolebinding.rbac.authorization.k8s.io/k6-operator-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/k6-operator-proxy-rolebinding created
Error from server (NotFound): error when creating "STDIN": namespaces "k6-operator-system" not found
Error from server (NotFound): error when creating "STDIN": namespaces "k6-operator-system" not found
Error from server (NotFound): error when creating "STDIN": namespaces "k6-operator-system" not found
Error from server (NotFound): error when creating "STDIN": namespaces "k6-operator-system" not found
make: *** [Makefile:54: deploy] Error 1

Any ideas?

Service account passing issue. - Not possible to pull custom image.

When trying to apply the Custom Resource below with serviceAccount or serviceAccountName defined (for a custom image pull), there is a validation issue.

Yaml CR-k6-test-exec:

kind: K6
metadata:
  namespace: k6-operator-system
  name: k6-staging-1-vu-test
spec:
  serviceAccount: k6-pull
  parallelism: 1
  script: 
    configMap: 
        name: load-test-staging 
        file: k6-test-order-flow.js
  runner:
    image: <Private Repo>/devops/k6/runner:latest

kubectl apply -f CR-k6-test-exec -n k6-operator-system

Results:

error: error validating "CR-k6-test-exec": error validating data: ValidationError(K6.spec): unknown field "serviceAccount" in io.k6.v1alpha1.K6.spec; if you choose to ignore these errors, turn validation off with --validate=false

Bug: Two k6 Jobs called the same thing

If two k6 jobs are called the same thing, even in different namespaces, the k6 controller gets confused and does something like:

2021-07-13T15:08:22.864Z        INFO    controllers.K6  Waiting for pods to get ready   {"k6": "big-brother/k6-test"}
2021-07-13T15:08:22.865Z        INFO    controllers.K6  5/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:08:27.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:08:32.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:08:37.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:08:42.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:08:47.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:08:52.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:08:57.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:09:02.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:09:07.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:09:12.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:09:17.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:09:22.866Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}
2021-07-13T15:09:22.867Z        INFO    controllers.K6  9/4 pods ready  {"k6": "big-brother/k6-test"}

k6_start.go needs to be modified to poll for labels and namespaces as well to prevent issues
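
A hedged sketch of the kind of scoped lookup being suggested, using the controller-runtime client; the label keys are illustrative assumptions, not necessarily the labels the operator actually sets:

// assumes: corev1 "k8s.io/api/core/v1", client "sigs.k8s.io/controller-runtime/pkg/client",
// and a reconciler r that embeds client.Client.
// List only the runner pods belonging to this particular k6 resource, scoped by
// namespace and labels, so an identically named resource elsewhere is ignored.
var pods corev1.PodList
if err := r.List(ctx, &pods,
	client.InNamespace(k6.Namespace),
	client.MatchingLabels(map[string]string{
		"app":   "k6",
		"k6_cr": k6.Name, // illustrative label key
	}),
); err != nil {
	return err
}
ready := 0
for _, pod := range pods.Items {
	if pod.Status.Phase == corev1.PodRunning {
		ready++
	}
}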

Check and confirm the installation and execution on Windows

Currently k6-operator has only a tutorial in a blog post and a README for options. It has been suggested that a quick, step-by-step guide could be of use too.

Also to consider: move README content to a separate docs folder / .md files, to simplify options lookup, etc.


July 2023 update

There's now a new step-by-step guide in k6-docs here 🎉
This guide is OS-agnostic, so some details could still be added. The most in-demand are likely the specifics of a Windows setup. It'd be good to confirm that the installation executes correctly on Windows and what the differences between Windows and Linux/macOS are.

Add a way to configure the resource requirements for scheduling

@dgzlopes suggested that the k6 CR should offer a way to configure the resource requirements for scheduling a job on a node. This would allow multiple jobs to be scheduled to the same node without risking resource exhaustion.

I think this is a really good suggestion that should be shortlisted on the "roadmap".

Handle k6 exit codes

Hi, I'm executing load tests in my Kubernetes cluster, but I have a problem when tests fail.

I need the tests to be executed only once; whether they succeed or fail, they should not be executed again.
Currently, if the tests run OK they are not executed again, but if a test threshold fails, a starter container is automatically created and launches another pod to try to run the test again.

I leave my config files here. I tried to set abortOnFail on the thresholds and to use the abortTest() function, but the problem persists.
I think it is k6-operator behaviour; maybe you can help me.

This is my test file.

apiVersion: v1
kind: ConfigMap
metadata:
  name: k6-test
  namespace: k6-operator-system
data:
  test.js: |
    import http from 'k6/http';
    import { Rate } from 'k6/metrics';
    import { check, sleep, abortTest } from 'k6';

    const failRate = new Rate('failed_requests');

    export let options = {
      stages: [
        { target: 1, duration: '1s' },
        { target: 0, duration: '1s' },
      ],
      thresholds: {
        failed_requests: [{threshold: 'rate<=0', abortOnFail: true}],
        http_req_duration: [{threshold: 'p(95)<1', abortOnFail: true}],
      },
    };

    export default function () {
      const result = http.get('http://test/login/');
      check(result, {
        'http response status code is 200': result.status === 500,
      });
      failRate.add(result.status !== 200);
      sleep(1);
      abortTest();
    }

And this is my k6 definition.

apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
  namespace: k6-operator-system
spec:
  parallelism: 1
  script:
    configMap:
      name: k6-test
      file: test.js
  arguments: --out influxdb=http://influxdb.influxdb:8086/test
  scuttle:
    enabled: "false"

I hope you can help me, thanks!

Create a guide for migration from one instance setup to multiple instance setup

Users of k6-operator can face migration issues where they need to re-configure both their setups and k6 scripts for distributed execution. Currently this is fully the user's responsibility, but a straightforward guide on how to wrangle k6 executors into obedience without a full understanding of their logic would be beneficial too. A related issue was raised in #90.

In general, this can be solved by re-computing the number of VUs: e.g. if one was running X independent instances with Y VUs each, for a multi-instance setup they'll need X * Y VUs and parallelism set to X. This may be seen as basic arithmetic; however, a specific guide can both help people move between 1 instance and N instances more easily and serve as an additional introduction to executors and execution segments.
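
For example, suppose you previously ran 3 independent k6 instances with 100 VUs each. The script would be configured for the total of 300 VUs and the TestRun would set parallelism to 3; a minimal sketch, where the ConfigMap name is a placeholder:

// options in the test script: total VUs across all runners
export const options = {
  vus: 300, // 3 former instances x 100 VUs each
  duration: '10m',
};

# corresponding TestRun spec: the operator splits the 300 VUs into 3 equal execution segments
spec:
  parallelism: 3
  script:
    configMap:
      name: migrated-test
      file: test.js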

Related issue: #94

Make `--quiet` a default argument

@dgzlopes suggested that to make it easier to analyze the output of a test spanning multiple nodes, --quiet should be set as a default argument. This makes sense given the limitation that we lack result aggregation.

Update the relevant docs and articles once this has been changed.

How to share custom data along with the script?

For example, the script needs 100k user IDs, so how to:

  • add the data while invoking the script
  • distribute the data between jobs, so that for 4 parallel runs, 25K unique users go to each

Too long: must have at most 1048576 bytes

Hi guys,

I am trying to create the ConfigMap to store the test, but there seems to be a size limitation of 1 MB.

Do you know what alternative can be used in this case?

Thanks a lot.

k6.spec.separate should append podAntiAffinity to Affinity and not overwrite it

Hello

I've been using k6.spec.separate = true to define a podAntiAffinity with other k6 runners, however I'd like to add my own affinity using k6.spec.runner.affinity, but this logic in the operator overwrites it:

if k6.Spec.Separate {
    job.Spec.Template.Spec.Affinity = newAntiAffinity()
}

Because of this, I have to set k6.spec.separate = false. While I can replicate the antiAffinity in my custom affinity block, it would be better UX to have it appended to a podAntiAffinity if there is one defined already.

Thanks
Joe
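
A hedged sketch of the merge behaviour being requested, assuming newAntiAffinity() returns a *corev1.Affinity with required anti-affinity terms, as the snippet above suggests:

if k6.Spec.Separate {
	anti := newAntiAffinity()
	existing := job.Spec.Template.Spec.Affinity
	if existing == nil {
		// no user-provided affinity: keep the current behaviour
		job.Spec.Template.Spec.Affinity = anti
	} else if existing.PodAntiAffinity == nil {
		// user affinity present, but no pod anti-affinity: attach the operator's
		existing.PodAntiAffinity = anti.PodAntiAffinity
	} else {
		// both present: append the operator's terms instead of overwriting
		existing.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution = append(
			existing.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution,
			anti.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution...,
		)
	}
}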

Add nodeSelectors and/or nodeAffinity/anti-affinity

The Kubernetes scheduler can be constrained to place a pod on particular nodes using a few different options. In your pod specification, there are several ways to declare that a pod should be dedicated to specific nodes; one of them is nodeSelector and another is node affinity/anti-affinity.

I would like to constrain where I can place k6 runners. This would open up a lot of fancy possibilities.

For example:

  • Only use nodes from certain zones (and use Separate to spread the runners).
  • Run my test on some type of instance (e.g. M4).
  • Run my test on an isolated node pool of my cluster.

Documentation need OR feature request: Where to provide custom runner image

As per the title, it would be good to be able to specify somewhere in the k6-operator config which images to use.
k6 currently allows porting tests from Postman, JMeter, etc. For the Postman example, I need some JavaScript libraries to execute the scripts. It would be nice if those libraries could be added on top of ghcr.io/grafana/operator:latest and the custom image used as a runner.

Allow the K6 CRD to set environment variables for the runners.

I did not get the -e argument working.

I got the following errors:

time="2021-06-04T14:51:39Z" level=error msg="Couldn't flush a batch" error="write udp 127.0.0.1:36848->127.0.0.1:8125: write: connection refused" output=statsd

While using the following manifest:

apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
spec:
  parallelism: 4
  script: crocodile-stress-test
  arguments: -e K6_STATSD_ADDR=graphite-1622816495:8125 --out statsd

This created a job with the following command:

k6 run --quiet --execution-segment=0:1/4 --execution-segment-sequence=0,1/4,2/4,3/4,1 -e K6_STATSD_ADDR=graphite-1622816495:8125 --out statsd /test/test.js --address=0.0.0.0:6565 --paused

          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: /test/test.js
     output: statsd (localhost:8125)

But this doesn't work locally either, unless I run the command as follows:

$ K6_STATSD_ADDR=graphite-1622816495:8125 k6 run --quiet --execution-segment=0:1/4 --execution-segment-sequence=0,1/4,2/4,3/4,1 -e K6_STATSD_ADDR=graphite-1622816495:8125 --out statsd ./test.js --address=0.0.0.0:6565 --paused

          /\      |‾‾| /‾‾/   /‾‾/   
     /\  /  \     |  |/  /   /  /    
    /  \/    \    |     (   /   ‾‾\  
   /          \   |  |\  \ |  (‾)  | 
  / __________ \  |__| \__\ \_____/ .io

ERRO[0000] The moduleSpecifier "./test.js" couldn't be found on local disk. Make sure that you've specified the right path to the file. If you're running k6 using the Docker image make sure you have mounted the local directory (-v /local/path/:/inside/docker/path) containing your script and modules so that they're accessible by k6 from inside of the container, see https://k6.io/docs/using-k6/modules#using-local-modules-with-docker.

Originally posted by @mikaelkundert in #15 (comment)

Using -e won't work, just as @mikaelkundert points out. What -e does is make the variable available to the k6 script as an environment variable. However, the statsd output is initiated outside of the actual test script. Instead, we should add support for providing env vars through the CRD.
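
For reference, the extensions example earlier in this README shows that the runner section accepts an env list; a sketch of how the statsd address could be provided that way (the address value is a placeholder taken from the manifest above):

# ...
  script:
    configMap:
      name: crocodile-stress-test
      file: test.js
  arguments: --out statsd
  runner:
    env:
      - name: K6_STATSD_ADDR
        value: "graphite-1622816495:8125"
# ...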

[Feature Request] Allow disabling of execution segments when parallelism > 1

Great operator, thanks for making it!

I would like some fine-grain control on the runners that are created. I noticed that when parallelism is > 1, the number of VUs is shared across all the workers. My number of VUs is somewhat dynamic, and I currently have runners quitting prematurely because there is no work to do.

It would be great if I could prevent this from happening.

Allow for scheduled jobs

Allow jobs to have a crontab schedule on them that causes a cronjob to be created instead of just a batch job.

This would allow for ongoing tests rather than just manually kicked off testing.

starter container error when using istio

When I run k6-operator with Istio, the starter always changes to an error state.


starter pod log: (screenshot omitted)

CR file:

apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample-with-extensions
spec:
  parallelism: 4
  script:
    configMap:
      name: crocodile-stress-test
      file: test.js
  arguments: --out prometheus=namespace=k6&port=9090
  ports:
  - containerPort: 9090
    name: metric
  runner:
    image: extension-image:0.0.2
    metadata:
      annotations:
        prometheus.io/scrape: "true"
  scuttle:
    waitForEnvoyTimeout: "20"

I can't understand why this is happening. I would like to ask for help if there is any solution.

Add support to ship results to k6 cloud

Here we're running a lot of k6 instances in parallel, but all of them are part of the same test.

I would like to ship the results from all these instances as a unique test run on the cloud.

Design and implement a config for distributed execution setup

Currently k6-operator supports only the simplest, straightforward distributed setup, where 100% of the execution is spread in equal chunks of 100% / N across N runners. But k6 is far more versatile, which is fully utilized by k6 Cloud via its distribution parameter. It would be good to include this versatility in k6-operator as well.

The setup of k6-operator already allows passing all the necessary options; the main caveat is how best to configure the distribution. Sample options to consider:

  1. N fraction numbers:

     parallelism: 4
     distribution: 1/3, 1/6, 1/6, 1/3

  2. N percentages:

     parallelism: 4
     distribution: 0.33, 0.17, 0.17, 0.33

  3. Since the number of runners can be large, a more generalized approach to defining the distribution may be necessary. E.g., for 30 runners with 1/60 load each and 10 runners with 1/20 load each:

     parallelism: 40
     distribution: 30:1/60, 10:1/20

  4. Something else?

No matter which config option is chosen, validation will be needed to check that:
a) the format of the distribution is correct, e.g. exactly N numbers;
b) the final distribution adds up to 1.

Using an outdated schema for operator-sdk

/Users/hans.knecht/go/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
operator-sdk generate kustomize manifests -q
WARN[0000] Config version 3-alpha has been stabilized as 3, and 3-alpha is no longer supported. Run `operator-sdk alpha config-3alpha-to-3` to upgrade your PROJECT config file to version 3

I don't know enough about it to feel comfortable running that upgrade.

The operator specifically requires kustomize version 3.8.1 to work

Ok I reverted to 3.8.1 and it seemed to not throw the kitchen sink at me.

For anyone else, you can use this to install a specific version of kustomize:
https://github.com/kubernetes-sigs/kustomize/blob/master/hack/install_kustomize.sh

Then run:
./install_kustomize.sh [WANTED_VERSION]

The downloaded bin file should be moved to the appropriate place.

In all honesty though, this should probably be fixed somehow.

Originally posted by @basickarl in #19 (comment)

K6 can't launch jobs when using LinkerD

When launching jobs in a namespace that has LinkerD enabled, the starter job is unable to contact the downstream pods to trigger them. If I run the jobs in a namespace without LinkerD, everything works as expected.

starter pod logs

% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to 100.64.110.231 port 6565: Connection refused
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to 100.64.110.195 port 6565: Connection refused
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to 100.64.110.179 port 6565: Connection refused
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to 100.64.110.113 port 6565: Connection refused

Starter pod configuration
https://gist.github.com/MarkSRobinson/702ae2c4ebb8f2a02509c72c30c80531

Latest build requires Istio/Envoy to be installed on the cluster.

Hey!
I wanted to start using your operator, but unfortunately running any test with the latest release v0.0.7rc2 fails when connecting to Envoy.

2021-09-03T13:32:35Z scuttle: Scuttle 1.3.5 starting up, pid 1
2021-09-03T13:32:35Z scuttle: Logging is now enabled
2021-09-03T13:32:35Z scuttle: Blocking until Envoy starts
2021-09-03T13:32:35Z scuttle: Polling Envoy (1), error: internal_service: dial tcp 127.0.0.1:15000: connect: connection refused
2021-09-03T13:32:35Z scuttle: Polling Envoy (2), error: internal_service: dial tcp 127.0.0.1:15000: connect: connection refused
2021-09-03T13:32:37Z scuttle: Polling Envoy (3), error: internal_service: dial tcp 127.0.0.1:15000: connect: connection refused
2021-09-03T13:32:38Z scuttle: Polling Envoy (4), error: internal_service: dial tcp 127.0.0.1:15000: connect: connection refused
2021-09-03T13:32:39Z scuttle: Polling Envoy (5), error: internal_service: dial tcp 127.0.0.1:15000: connect: connection refused
2021-09-03T13:32:42Z scuttle: Polling Envoy (6), error: internal_service: dial tcp 127.0.0.1:15000: connect: connection refused

Running your operator with ghcr.io/k6io/operator:v0.0.6 image fixes the issue and it can operate as expected. I am running another service mesh and I am not likely to migrate to Istio.

Thanks,
Szymon

Could not deploy the operator in cluster - error: unable to recognize "STDIN": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"

Actual Error:

(screenshot of the error omitted)

kubectl version:

  • Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:45:37Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"darwin/amd64"}
  • Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:32:41Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}

kubectl api-resources has the following CRD in the list: (screenshot omitted)

Sending k6 results to datadog?

When I'm testing k6 locally through docker-compose, I am able to see the results being populated in the Datadog web dashboard. However, I'm struggling to convert this behavior to the k8s operator. Below is the configuration I've got so far, with Datadog deployed as a Helm chart in a namespace called monitors and the k6 operator deployed at version 0.0.6.

docker-compose.yml

  api-smoke-test:
    image: loadimpact/k6
    entrypoint: k6 run --out statsd index.js
    depends_on:
      - core
      - api-gateway
    links:
      - datadog
    working_dir: /test/
    environment:
      - K6_STATSD_ADDR=datadog:8125
    volumes:
      - $PWD:/test/

  datadog:
    image: datadog/agent:latest
    ports:
      - 8125
    environment:
      - DD_API_KEY=${DD_API_KEY}
      - DD_SITE=datadoghq.com
      - DD_DOGSTATSD_NON_LOCAL_TRAFFIC=1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /proc/:/host/proc/:ro
      - /sys/fs/cgroup:/host/sys/fs/cgroup:ro

datadog-values.yaml

datadog:
  dogstatsd:
    port: 8125
    useHostPort: true
    nonLocalTraffic: true

resource.yml

---
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
spec:
  parallelism: 4
  arguments: --out statsd
  script: 
    configMap: 
      name: k6-test
      file: test.js
  ports:
    - containerPort: 8125

configmap.yml

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: k6-test
data:
  test.js: |
    import http from "k6/http";
    import { check, group, fail } from "k6";
    
    // add new endpoint url suffixes here to expand smoke test
    // these endpoints should match exactly the keys in the setup
    let ERR_MSG = "API smoke test failed for endpoint:"

Current output of kubectl apply -f resource.yml

❯ kubectl logs k6-sample-1-v2tdg

          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

time="2021-08-12T22:20:02Z" level=warning msg="Executor 'default' is disabled for segment 0:1/4 due to lack of work!"
  execution: local
     script: /test/test.js
     output: statsd (localhost:8125)

  scenarios: (25.00%) 1 scenario, 0 max VUs, 0s max duration (incl. graceful stop):
           * default: 1 iterations for each of 0 VUs (maxDuration: 10m0s, gracefulStop: 30s)

time="2021-08-12T22:20:05Z" level=warning msg="No script iterations finished, consider making the test duration longer"

     vus.......: 0 min=0 max=0
     vus_max...: 0 min=0 max=0

time="2021-08-12T22:20:05Z" level=error msg="Couldn't flush a batch" error="write udp 127.0.0.1:52620->127.0.0.1:8125: write: connection refused" output=statsd

I believe what I'm missing is either the K6_STATSD_ADDR or the DD_AGENT_HOST environment variable (or both), which can be set with the code below. However, I'm not certain how to add these env vars to the k6-sample pods.

env:
- name: DD_AGENT_HOST
  valueFrom:
    fieldRef:
      fieldPath: status.hostIP

Any ideas or helpful advice on how I can accomplish this?

k6-operator fails creating custom resource on k8 1.22

Ref: https://k6.io/blog/running-distributed-tests-on-k8s/

$ make deploy
...
namespace/k6-operator-system created
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/k6s.k6.io configured
...

apiextensions.k8s.io/v1beta1 is no longer available in k8 1.22.

apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-sample
spec:
  parallelism: 4
  script: crocodile-stress-test

Custom resource creation fails even though the script passed in is already a string.

$ kubectl apply -f ./custom-resource.yml
The K6 "k6-sample" is invalid: spec.script: Invalid value: "string":
spec.script in body must be of type object: "string"
$ kubectl version --short
Client Version: v1.22.2
Server Version: v1.22.2

Support for Custom Annotations/Labels on Starter and Runner

Currently, if running k6 in an Istio environment, the starter fails because the curl command is sent too quickly. The Istio/Envoy proxy hasn't started at that point, and there is no retry built in.

While adding a retry is probably a reasonable option (one for a separate issue), being able to add custom annotations like:

annotations:
  proxy.istio.io/config: '{ "holdApplicationUntilProxyStarts": true }'

or:

sidecar.istio.io/inject: "false"

would be really useful.

I'm going to take a crack at this, but broadly I think it can be thought of as creating the struct/spec for a starter and a runner object, and then accepting annotations/labels for each of them, which solves the issue.

How to run external script

How can an external script be run, for example one stored in git or in a Docker image? Especially for scripts that consist of multiple files.

Create K8s Service Account for k6 in namespace

There are a whole lot of permissions that are assigned to the default service account in the k6 namespace. It is best practice to create a service account specifically for the operator. See here as an example. Should be a pretty small addition.

Switch from REST API to k6 CLI in starter job

The starter currently relies on the REST API of k6 to start the runners. It's workable, but the more "officially supported" way is to use the CLI, e.g. k6 resume. Investigate and switch to using the k6 image and a CLI command in the starter pod instead of the curl container.

The initiating pod in #86 can be used as an additional baseline here and/or for the refactor.

leader-election-role role refers to non-existent configmaps/status resource

The leader-election-role at https://github.com/k6io/operator/blob/main/config/rbac/leader_election_role.yaml#L19 refers to a resource configmaps/status, but as far as I know configmaps don't have a status (https://v1-20.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#configmap-v1-core). Perhaps I'm misunderstanding something though?

I removed that item from the rules array and the project appears to run just fine.

Support for Service Meshes

Running k6 with istio enabled on the namespace breaks k6 currently.

Recognizing there is no great solution to istio sidecars and Jobs yet, I think the best option is to allow overriding of the containers that run the jobs so that something like: https://github.com/redboxllc/scuttle can be used as a wrapper.

[Improvement Request] Refactor kustomize to a helm chart template

Hi everyone,

As this project is starting to grow, I propose we move the distribution of this operator to a Helm chart installation. Currently, to enable or disable features, we need to comment or uncomment some portions of the kustomize YAMLs. This is very error-prone and hard to automate. By creating a Helm chart installation, these changes could be enabled or disabled just by changing a single values.yaml file.

Also, using a helm chart and release tags it would be possible to use a single command to install everything. What do you guys think?

Support Istio

Currently the starter job doesn't support Istio. This means the test job pods launch, the starter pod launches, and then it immediately errors, either because it can't communicate with the test job pods, because it is waiting for the Istio sidecar, etc.

Currently you have to set up a peer authentication policy to open up the specific port that the starter job uses, ensure that there is no sidecar, and manage the TLS/HTTP settings.

Instead, I'd like to propose that k6 operator starter image simply uses a tool like: https://github.com/redboxllc/scuttle

Scuttle provides a wrapper around the shell command to wait for istio sidecar to launch and then quit afterwards properly.

This would involve a small modification to the containers/curl.go, from:

		Command: []string{
			"sh",
			"-c",
			strings.Join(parts, ";"),
		},

to:

		Command: []string{
			"scuttle",
			"sh",
			"-c",
			strings.Join(parts, ";"),
		},

And then it would require the building of a second docker image:

FROM radial/busyboxplus:curl

COPY --from=redboxoss/scuttle:latest /scuttle /bin/scuttle

and then obviously using that image. This could be put behind an istio flag, if desired, but I don't think there is any side effect to using scuttle in a non-istio environment; from the docs:

This application, if provided an ENVOY_ADMIN_API environment variable, will poll indefinitely with backoff, waiting for envoy to report itself as live, implying it has loaded cluster configuration (for example from an ADS server). Only then will it execute the command provided as an argument.
