kubevirt / cluster-network-addons-operator

Deploy additional networking components on top of your Kubernetes cluster

License: Apache License 2.0

Makefile 1.05% Dockerfile 0.22% Shell 11.99% Go 86.75%

cluster-network-addons-operator's Introduction

KubeVirt


KubeVirt is a virtual machine management add-on for Kubernetes. The aim is to provide a common ground for virtualization solutions on top of Kubernetes.

Introduction

Virtualization extension for Kubernetes

At its core, KubeVirt extends Kubernetes by adding additional virtualization resource types (especially the VM type) through Kubernetes's Custom Resource Definitions API. By using this mechanism, the Kubernetes API can be used to manage these VM resources alongside all other resources Kubernetes provides.

The resources themselves are not enough to launch virtual machines. For this to happen, functionality and business logic need to be added to the cluster. This functionality is not added to Kubernetes itself; it is delivered by running additional controllers and agents on an existing cluster.

The necessary controllers and agents are provided by KubeVirt.

As of today, KubeVirt can be used to declaratively (a minimal example follows the list):

  • Create a predefined VM
  • Schedule a VM on a Kubernetes cluster
  • Launch a VM
  • Stop a VM
  • Delete a VM
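
A minimal sketch of that declarative workflow, assuming the kubevirt.io/v1alpha3 API and a demo containerDisk image (the name and image here are illustrative):

cat <<EOF | kubectl create -f -
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  name: testvmi
spec:
  domain:
    devices:
      disks:
      - name: containerdisk
        disk:
          bus: virtio
    resources:
      requests:
        memory: 64M
  volumes:
  - name: containerdisk
    containerDisk:
      image: kubevirt/cirros-container-disk-demo
EOF

# Deleting the resource stops and removes the VM just as declaratively:
kubectl delete vmi testvmi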

To start using KubeVirt

Try our quickstart at kubevirt.io.

See our user documentation at kubevirt.io/docs.

Once you have the basics, you can learn more about how to run KubeVirt and its newest features by taking a look at:

To start developing KubeVirt

To set up a development environment please read our Getting Started Guide. To learn how to contribute, please read our contribution guide.

You can learn more about how KubeVirt is designed (and why it is that way), and learn more about the major components by taking a look at our developer documentation:

Useful links

The KubeVirt SIG-release repo is responsible for information regarding upcoming and previous releases.

Community

If you have had enough of code and want to talk to people, you have a couple of options:

Related resources

Submitting patches

When sending patches to the project, the submitter is required to certify that they have the legal right to submit the code. This is achieved by adding a line

Signed-off-by: Real Name <[email protected]>

to the bottom of every commit message. Existence of such a line certifies that the submitter has complied with the Developer's Certificate of Origin 1.1 (as defined in the file docs/developer-certificate-of-origin).

This line can be automatically added to a commit in the correct format by using the '-s' option to 'git commit'.
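
For example, assuming git is configured with your real name and email:

git commit -s -m "Describe the change"
# appends a trailer of the form:
#   Signed-off-by: Your Name <you@example.com>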

License

KubeVirt is distributed under the Apache License, Version 2.0.

Copyright 2016

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


cluster-network-addons-operator's People

Contributors

alonakaplan, alonsadan, andreatp, assafad, atul9, avlitman, bcrochet, booxter, davidvossel, dependabot[bot], djzager, eddev, erkanerol, fedepaol, github-actions[bot], irosenzw, kubevirt-bot, machadovilaca, maiqueb, nunnatsa, ormergi, oshoval, phoracek, pitik3u, qinqon, ramlavi, rhrazdil, schseba, sjpotter, tiraboschi


cluster-network-addons-operator's Issues

release v0.3.0

Let's use this issue to track what is needed for the next stable release.

I want #29 in there. It will include kubemacpool and SR-IOV, which were not shipped in v0.2.0. @SchSeba @booxter are you aware of any must-have fixes? The main use of a new release would be to ship the addons operator in hyperconverged-cluster-operator.

Add option to read image registry for components from operator

In order to make it easier to use the operator with a custom registry, make it possible to use the operator's registry as the registry for components too.

In case USE_OPERATOR_REGISTRY_FOR_COMPONENTS is set to true, read the operator image registry and use it as the base registry for the components. In that case, components' images should be set to "$OPERATOR_REGISTRY/name-of-the-image:tag_of_the_image".
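
A rough sketch of how that could look in Go; the OPERATOR_IMAGE variable and the helper names are illustrative, not the actual implementation:

package components

import (
	"fmt"
	"os"
	"strings"
)

// registryOf returns the registry/repository prefix of an image reference,
// e.g. "quay.io/kubevirt/cluster-network-addons-operator:latest" -> "quay.io/kubevirt".
func registryOf(image string) string {
	if i := strings.LastIndex(image, "/"); i != -1 {
		return image[:i]
	}
	return ""
}

// componentImage rewrites a component's default image to live under the
// operator's own registry when USE_OPERATOR_REGISTRY_FOR_COMPONENTS=true.
func componentImage(defaultImage string) string {
	if os.Getenv("USE_OPERATOR_REGISTRY_FOR_COMPONENTS") != "true" {
		return defaultImage
	}
	operatorRegistry := registryOf(os.Getenv("OPERATOR_IMAGE")) // hypothetical env var holding the operator's own image
	nameAndTag := defaultImage[strings.LastIndex(defaultImage, "/")+1:]
	return fmt.Sprintf("%s/%s", operatorRegistry, nameAndTag)
}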

Fix kubemacpool RBAC

The :latest kubemacpool image has extended RBAC needs. That needs to be reflected in this repo.

kubemacpool is not completely removed

There is an issue caused by the kubemacpool component after it is deprovisioned.

I started a local cluster and deployed kubemacpool there:

make cluster-up cluster-sync
cat <<EOF | ./cluster/kubectl.sh create -f -
apiVersion: networkaddonsoperator.network.kubevirt.io/v1alpha1
kind: NetworkAddonsConfig
metadata:
  name: cluster
spec:
  kubeMacPool:
   startPoolRange: "02:00:00:00:00:00"
   endPoolRange: "FD:FF:FF:FF:FF:FF"
EOF

Then I removed the NetworkAddonsConfig object, so kubemacpool was removed:

./cluster/kubectl.sh delete networkaddonsconfig cluster

Finally I wanted to create a Pod:

cat <<EOF | ./cluster/kubectl.sh create -f -
apiVersion: v1
kind: Pod
metadata:
  name: samplepod
spec:
  containers:
  - name: samplepod
    command: ["/bin/sh", "-c", "sleep 99999"]
    image: alpine
EOF

That however failed with:

Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling admission webhook "mutatepods.example.com": Post https://kubemacpool-service.kubemacpool-system.svc:443/mutate-pods?timeout=30s: service "kubemacpool-service" not found                                          

@SchSeba I see kubemacpool creates MutatingWebhookConfiguration "mutating-webhook-configuration". We need to set its owner reference to something that is created by manifests, maybe the Service? Not sure if we can do something like https://github.com/kubevirt/cluster-network-addons-operator/blob/master/pkg/controller/networkaddonsconfig/controller.go#L177 on regular objects (not a controller). Maybe https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/ helps.
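
A sketch of setting such an owner reference explicitly (API version and names are assumptions, not the repo's code). Note that Kubernetes garbage collection only honors cluster-scoped owners for cluster-scoped dependents such as a MutatingWebhookConfiguration, so a namespaced Service might not work as the owner:

package kubemacpool

import (
	admissionregistrationv1beta1 "k8s.io/api/admissionregistration/v1beta1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// setWebhookOwner makes the webhook configuration a dependent of the given
// Service so it would be garbage-collected together with it.
func setWebhookOwner(webhook *admissionregistrationv1beta1.MutatingWebhookConfiguration, owner *corev1.Service) {
	webhook.OwnerReferences = append(webhook.OwnerReferences, metav1.OwnerReference{
		APIVersion: "v1",
		Kind:       "Service",
		Name:       owner.Name,
		UID:        owner.UID,
	})
}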

BTW we should rename it to something kubemacpool-specific.

Apply objects in dry run first

Not sure if it will be possible (due to missing namespaces). It should prevent us from partial updates where half of the components are successfully redeployed and the other half gets stuck.
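
One option worth trying is server-side dry run, available in recent Kubernetes releases (newer kubectl versions use --dry-run=server instead of --server-dry-run; the manifest file name below is just a placeholder):

# validate all rendered objects on the server first, then apply for real
./cluster/kubectl.sh apply --server-dry-run -f objects.yaml && \
./cluster/kubectl.sh apply -f objects.yaml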

Support upgrade rollback

In case the system stopped working after an upgrade, we should make a rollback possible. This rollback would only be allowed if no changes other than the version bump were done.

controller logging refactoring

Let's try to make logging code and log output cleaner.

We can improve the way we do logging in the main controller https://github.com/kubevirt/cluster-network-addons-operator/blob/master/pkg/controller/networkaddonsconfig/networkaddonsconfig_controller.go.

Lines like:

log.Printf("could not apply (%s) %s/%s: %v", obj.GroupVersionKind(), obj.GetNamespace(), obj.GetName(), err)
err = errors.Wrapf(err, "could not apply (%s) %s/%s", obj.GroupVersionKind(), obj.GetNamespace(), obj.GetName())

Should be:

err = errors.Wrapf(err, "could not apply (%s) %s/%s", obj.GroupVersionKind(), obj.GetNamespace(), obj.GetName())
log.Printf(err.Error())

Or even better, since those errors are all raised to the Reconcile method, we can log them once there (or maybe they are already logged via return reconcile.Result{}, err and we just make a mess in the log output?).

Finally, we don't use Wrapf properly:

// this should not pass err twice
err = errors.Wrapf(err, "failed to retrieve previously applied configuration: %v", err)
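
The fix is simply to drop the duplicated argument (or use Wrap, since there is nothing left to format):

err = errors.Wrap(err, "failed to retrieve previously applied configuration")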

Rename NetworkAddonsConfig(s) to NetworkAddon(s)

Let's save a little bit of space and time (pronouncing it) and rename the NetworkAddonsConfig object to NetworkAddons. That would mimic the KubeVirt and CDI operators with their KubeVirt and CDI objects.

On the other hand, it does not follow OpenShift operator naming. But I would not mind that too much, since we don't really follow the OpenShift operator concept at all (with service objects, OpenShift operator state reporting, etc.).

Opinions @booxter @SchSeba?

kubemacpool validation doesn't work

I applied invalid kubemacpool attributes, but no error was raised:

apiVersion: networkaddonsoperator.network.kubevirt.io/v1alpha1
kind: NetworkAddonsConfig
metadata:      
  creationTimestamp: 2019-04-19T16:54:04Z
  generation: 1
  name: cluster
  resourceVersion: "1313"
  selfLink: /apis/networkaddonsoperator.network.kubevirt.io/v1alpha1/networkaddonsconfigs/cluster
  uid: bd69b15e-62c3-11e9-9d38-525500d15501
spec:
  imagePullPolicy: Always
  kubeMacPool:
    rangeStart: this:aint:right
  linuxBridge: {}
  multus: {}
  sriov: {}
status:
  conditions:
  - lastProbeTime: 2019-04-19T16:55:50Z
    lastTransitionTime: 2019-04-19T16:55:50Z
    status: "False"
    type: Progressing
  - lastProbeTime: 2019-04-19T16:55:50Z
    lastTransitionTime: 2019-04-19T16:55:50Z
    status: "True"
    type: Ready

Support advanced operator upgrade

Improve operator upgrades. One of the missing features (in #65) that comes to mind is the possibility to remove objects that are not needed anymore, e.g. when the previous version deployed a ConfigMap as part of component X, but in the new version the ConfigMap is not part of X's manifests anymore.

Add openshift scc

We need to check if we are running under OpenShift and add SCC configuration for every component.

Example

Warning  FailedCreate  4m (x22 over 44m)  daemonset-controller  Error creating: pods "kube-multus-ds-amd64-" is forbidden: unable to validate against any security context constraint: [provider restricted: .spec.securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.containers[0].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used]
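
As a manual workaround until the operator ships proper SCC configuration, the components' service accounts can be added to a sufficiently permissive SCC; the namespace and service account names below are only examples:

oc adm policy add-scc-to-user privileged -n multus -z multus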

Update to OpenShift Network object

Currently we check the existing NetworkConfig object (created by cluster-network-operator) for existing Multus configuration. With the latest openshift/cluster-network-operator, this object has been renamed to Network. Sync with this change.

Catch up on openshift network operator

Support basic operator upgrades

We should be able to upgrade the operator and deployed components to a newer version. In the first implementation, all we should support is upgrading image names and adding new objects to the Kubernetes API. Removal of old objects won't be supported in the initial implementation.

We may need to resolve #34 to fully support upgrades.

correct version, provider and name

I have just tried to deploy cna-operator following README - Deploy using OLM and using this image.
In the Web UI, I see that:

  • There is no 'Operator' in the name in Operators Catalog. I don't know if there is any spec for operators naming and if it should be there or not.
  • version 0.0.0
  • "provided by KubeVirt project". Should it be "Red Hat", maybe?

NetworkAttachmentDefinition CRD and SR-IOV NetworkAttachment CR race

When deploying both the Multus and SR-IOV components at the same time, there is a race where we fail with:

01:58:16  		  conditions:
01:58:16  		  - type: Failing
01:58:16  		    status: "True"
01:58:16  		    lastprobetime: "2019-05-31T23:53:18Z"
01:58:16  		    lasttransitiontime: "2019-05-31T23:47:21Z"
01:58:16  		    reason: FailedToApply
01:58:16  		    message: 'could not apply (k8s.cni.cncf.io/v1, Kind=NetworkAttachmentDefinition)
01:58:16  		      sriov/sriov-network: could not retrieve existing (k8s.cni.cncf.io/v1, Kind=NetworkAttachmentDefinition)
01:58:16  		      sriov/sriov-network: no matches for kind "NetworkAttachmentDefinition" in version
01:58:16  		      "k8s.cni.cncf.io/v1"'

We should improve the code and apply namespaces and CRDs before the rest of the objects.
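
A rough sketch of the ordering change, assuming the rendered objects are held as unstructured.Unstructured (the helper name is illustrative):

package apply

import "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"

// sortForApply returns the objects with Namespaces and CRDs first, so that
// namespaced custom resources created in the same reconcile pass already
// find their kinds registered.
func sortForApply(objs []*unstructured.Unstructured) []*unstructured.Unstructured {
	isEarly := func(obj *unstructured.Unstructured) bool {
		kind := obj.GetKind()
		return kind == "Namespace" || kind == "CustomResourceDefinition"
	}
	early := []*unstructured.Unstructured{}
	late := []*unstructured.Unstructured{}
	for _, obj := range objs {
		if isEarly(obj) {
			early = append(early, obj)
		} else {
			late = append(late, obj)
		}
	}
	return append(early, late...)
}

Even with this ordering, it may be necessary to wait for the CRD to report an Established condition before applying the CRs.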

Apply the same fix to https://github.com/openshift/cluster-network-operator/blob/master/pkg/controller/operconfig/operconfig_controller.go. However, first make sure this is a common problem, not only OCP 3.11 specific.

nmstate is not working

The nmstate DaemonSet log is full of:

E0516 00:15:47.519995       1 reflector.go:205] github.com/nmstate/kubernetes-nmstate/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1.NodeNetworkState: the server could not find the requested resource (get nodenetworkstates.nmstate.io)                                                                                                                                                                 

Deploy SR-IOV bits

This issue is to track SR-IOV deployment: CRD, CNI, DP. (Maybe also leveraging other components like kubernetes-nmstate for VF enablement / VFIO configuration.)

The implementation should consider that OpenShift has its own plans to deploy SR-IOV components, and those plans may not be compatible with KubeVirt expectations (the shipped versions are too old for KubeVirt).

OpenShift patch with SR-IOV network type: openshift/cluster-network-operator#84

Unit test coverage for state_manager

The state manager contains a lot of non-trivial logic. It should get test coverage before we start improving it to better reflect the operator state.

Status reporting of pods doesn't work

Changes from the OLM integration #42 broke status reporting #29. I am the one at fault for not testing status reporting after the rebase. It seems the problem is in the strict RBAC rules.
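
If that is indeed the cause, the operator's role is probably missing read access to the resources whose status we report. A hypothetical fragment of the missing rules (the exact resources and verbs are a guess, not the verified fix):

- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments", "daemonsets", "replicasets"]
  verbs: ["get", "list", "watch"]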

development documentation in README.md is incorrect

It says

generate sources (requires operator-sdk installed on your host)
operator-sdk generate k8s

but if one tries that, one gets

$ operator-sdk generate k8s
FATA[0000] Must run command in project root dir: project structure requires ./build/Dockerfile

If one moves ./build/operator/Dockerfile to ./build/, it "seems" to work, but I don't know if that's correct.

KubeMacPool does not stay ready

When the KubeMacPool component is enabled, the status reports a Progressing condition and then successfully turns Ready. However, after some time (sometimes in less than a minute), it triggers the operator to turn Progressing/Failing again.

tests/e2e/deployment_test.go from #118 should include skipped tests that are reproducing the issue.

This issue might be a cause of https://bugzilla.redhat.com/show_bug.cgi?id=1712851, since it caused other components that were deployed at the same time to fail to create their resources.

make OLM deployment easier

Currently, the marketplace container image for this operator is not hosted anywhere. The steps needed to deploy the operator are written in the README, but they are a little too complicated.

We should:

  • Keep OperatorGroup, CatalogSource and Subscription generated under deploy/ and in the release (a sketch of these objects follows this list)
  • Build registry image and keep it in quay, for both master and versioned
  • Describe oc way to deploy operator as well as via web UI
  • Deploy released SVC, not master
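
A sketch of the three OLM objects mentioned above; the names, namespaces and registry image are placeholders, and older OLM versions may expect operators.coreos.com/v1alpha2 for the OperatorGroup:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: cluster-network-addons
  namespace: openshift-operator-lifecycle-manager
spec:
  sourceType: grpc
  image: quay.io/kubevirt/cluster-network-addons-registry:latest
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: cluster-network-addons
  namespace: cluster-network-addons
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cluster-network-addons-operator
  namespace: cluster-network-addons
spec:
  channel: alpha
  name: cluster-network-addons
  source: cluster-network-addons
  sourceNamespace: openshift-operator-lifecycle-manager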

Make operator manifests vendorable

It should be possible to import our manifests into a different project in the form of native Kubernetes objects.

We have two options to do that.

A) Generate Go modules from manifests
B) Generate manifests from Go files

Add finalizer

A finalizer should prevent the NetworkAddonsConfig object from being removed before all components are successfully removed.
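
A rough sketch of the reconcile-side logic, assuming a recent controller-runtime with the controllerutil finalizer helpers; the finalizer name, the import path of the API package and the removeAllComponents helper are placeholders, not the repo's actual code:

package networkaddonsconfig

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

	opv1alpha1 "github.com/kubevirt/cluster-network-addons-operator/pkg/apis/networkaddonsoperator/v1alpha1"
)

const finalizerName = "networkaddonsoperator.network.kubevirt.io/finalizer" // placeholder

// removeAllComponents is a stand-in for the existing component teardown logic.
func removeAllComponents(ctx context.Context, c client.Client) error { return nil }

func reconcileFinalizer(ctx context.Context, c client.Client, config *opv1alpha1.NetworkAddonsConfig) error {
	if config.DeletionTimestamp.IsZero() {
		// Not being deleted: make sure the finalizer is present.
		if !controllerutil.ContainsFinalizer(config, finalizerName) {
			controllerutil.AddFinalizer(config, finalizerName)
			return c.Update(ctx, config)
		}
		return nil
	}
	// Being deleted: tear down components first, only then drop the finalizer,
	// so deletion of the config is blocked until cleanup succeeds.
	if err := removeAllComponents(ctx, c); err != nil {
		return err
	}
	controllerutil.RemoveFinalizer(config, finalizerName)
	return c.Update(ctx, config)
}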

Improve functional tests performance

Functional tests #118 take way too long to finish. Investigate what the bottleneck is. Do we download all needed images every time? Is it the instantiation of the daemon sets?

fighting over bridge and plugins path on OpenShift 4

The network operator on OpenShift 4 is deploying binaries of CNI plugins, including bridge and tuning. Our addons operator is trying to deploy a different version of bridge and tuning to the same path. Therefore, there is a race for the spot: whoever is slower wins.

We have to tackle this issue, maybe by adding a kubevirt- prefix to our binaries.

deployed components should have fixed version

We should use images with a specific version instead of :latest. One reason is that we cannot be sure our manifests will work with whatever is shipped in the image. Another is to make sure all components work well together. Finally, it would make upgrades more obvious: upgrade from components:x to components:y.

@SchSeba @booxter does it make sense and is it doable?

  • SR-IOV DP
  • SR-IOV CNI
  • Linux bridge CNI
  • Linux bridge marker
  • Multus
  • kubemacpool
  • nmstate

TODO: Keep a table of matching image/component versions per release in the README
