kubevirt / hyperconverged-cluster-operator

Operator pattern for managing multi-operator products

License: Apache License 2.0

Dockerfile 0.35% Shell 13.47% Go 84.64% Python 1.06% Makefile 0.48%
hyperconverged virtualization kubernetes hco kubernetes-operator openshift operator

hyperconverged-cluster-operator's Introduction


Hyperconverged Cluster Operator

A unified operator that deploys and controls KubeVirt and several adjacent operators.

This operator is typically installed from the Operator Lifecycle Manager (OLM), and creates operator CustomResources (CRs) for its underlying operators as can be seen in the diagram below. Use it to obtain an opinionated deployment of KubeVirt and its helper operators.

In the HCO components doc you can get an up-to-date overview of the involved components.

Installing HCO using kustomize (OpenShift OLM only)

To install the default community HyperConverged Cluster Operator, along with its underlying components, run:

$ curl -L https://api.github.com/repos/kubevirt/hyperconverged-cluster-operator/tarball/main | \
tar --strip-components=1 -xvzf - kubevirt-hyperconverged-cluster-operator-*/deploy/kustomize

$ ./deploy/kustomize/deploy_kustomize.sh

The deployment is complete when the HCO custom resource reports its condition as Available.
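To wait for that condition from the command line, something like the following can be used (a sketch; it assumes the HyperConverged resource is named kubevirt-hyperconverged and lives in the kubevirt-hyperconverged namespace, which is the community default):

$ kubectl wait hyperconverged/kubevirt-hyperconverged -n kubevirt-hyperconverged --for=condition=Available --timeout=15m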

For more explanation and advanced options for HCO deployment using kustomize, refer to kustomize deployment documentation.

Installing Unreleased Bundle Using A Custom Catalog Source

The Hyperconverged Cluster Operator publishes its latest bundle to quay.io/kubevirt before publishing tagged, stable releases to OperatorHub.io.
The latest bundle is quay.io/kubevirt/hyperconverged-cluster-bundle:1.13.0-unstable. It is built and pushed on every merge to the main branch and contains the most up-to-date manifests, which point to the most recent application images, hyperconverged-cluster-operator and hyperconverged-cluster-webhook, built together with the bundle from the current code on the main branch.
The unreleased bundle can be consumed on a cluster by creating a CatalogSource pointing to the index image that contains that bundle: quay.io/kubevirt/hyperconverged-cluster-index:1.13.0-unstable.

Make the bundle available in the cluster's package manifests by adding the following CatalogSource:

cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: hco-unstable-catalog-source
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/kubevirt/hyperconverged-cluster-index:1.13.0-unstable
  displayName: Kubevirt Hyperconverged Cluster Operator
  publisher: Kubevirt Project
EOF
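After the CatalogSource is applied, the bundle should appear in the cluster's package manifests within a few minutes. A quick way to check (assuming the package name community-kubevirt-hyperconverged that is used in the Subscription below):

$ oc get packagemanifests -n openshift-marketplace | grep community-kubevirt-hyperconverged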

Then create a namespace, an OperatorGroup and a Subscription to deploy HCO via OLM:

cat <<EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
    name: kubevirt-hyperconverged
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
    name: kubevirt-hyperconverged-group
    namespace: kubevirt-hyperconverged
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
    name: hco-operatorhub
    namespace: kubevirt-hyperconverged
spec:
    source: hco-unstable-catalog-source
    sourceNamespace: openshift-marketplace
    name: community-kubevirt-hyperconverged
    channel: "candidate-v1.13"
EOF
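One way to follow the installation that OLM performs for this Subscription (a sketch, assuming the namespace created above):

$ oc get subscription,installplan,csv -n kubevirt-hyperconverged
$ oc get pods -n kubevirt-hyperconverged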

Then, create the HyperConverged custom resource to complete the installation.
Further information about the HyperConverged CR and its possible configuration options can be found in the Cluster Configuration doc.
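A minimal HyperConverged CR looks roughly like the following sketch; the empty spec leaves all configuration at its defaults, and the name kubevirt-hyperconverged and the namespace are the values the community deployment expects:

cat <<EOF | oc apply -f -
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: kubevirt-hyperconverged
spec: {}
EOF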

Using the HCO without OLM or Marketplace

Run the following script to apply the HCO operator:

$ curl https://raw.githubusercontent.com/kubevirt/hyperconverged-cluster-operator/main/deploy/deploy.sh | bash
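A quick sanity check after the script finishes (a sketch, assuming the script deploys into the kubevirt-hyperconverged namespace):

$ kubectl get pods -n kubevirt-hyperconverged
$ kubectl get hyperconverged -n kubevirt-hyperconverged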

Developer Workflow (using OLM)

Build the HCO container using the Makefile recipes make container-build and make container-push, with the variables IMAGE_REGISTRY, REGISTRY_NAMESPACE, and IMAGE_TAG to direct its location.

To use the HCO's container, we'll use a registry image to serve metadata to OLM. Build and push the HCO's registry image.

# e.g. quay.io, docker.io
export IMAGE_REGISTRY=<image_registry>
export REGISTRY_NAMESPACE=<container_org>
export IMAGE_TAG=example

# build the container images and push them to registry
make container-build container-push

export HCO_OPERATOR_IMAGE=$IMAGE_REGISTRY/$REGISTRY_NAMESPACE/hyperconverged-cluster-operator:$IMAGE_TAG
export HCO_WEBHOOK_IMAGE=$IMAGE_REGISTRY/$REGISTRY_NAMESPACE/hyperconverged-cluster-webhook:$IMAGE_TAG
export ARTIFACTS_SERVER_IMAGE=$IMAGE_REGISTRY/$REGISTRY_NAMESPACE/virt-artifacts-server:$IMAGE_TAG

# Image to be used in CSV manifests
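# Note: CSV_VERSION must also be set (e.g. to 1.12.0, matching the manifests directory used below)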
HCO_OPERATOR_IMAGE=$HCO_OPERATOR_IMAGE CSV_VERSION=$CSV_VERSION make build-manifests
sed -i "s|+WEBHOOK_IMAGE_TO_REPLACE+|${HCO_WEBHOOK_IMAGE}|g" deploy/index-image/community-kubevirt-hyperconverged/1.12.0/manifests/kubevirt-hyperconverged-operator.v1.12.0.clusterserviceversion.yaml
sed -i "s|+ARTIFACTS_SERVER_IMAGE_TO_REPLACE+|${ARTIFACTS_SERVER_IMAGE}|g" deploy/index-image/community-kubevirt-hyperconverged/1.12.0/manifests/kubevirt-hyperconverged-operator.v1.12.0.clusterserviceversion.yaml

Create the namespace for the HCO.

$ kubectl create ns kubevirt-hyperconverged

For the next set of commands, we will use the operator-sdk CLI tool. The commands below will create a bundle image, push it, and finally use that bundle image to install the HyperConverged Cluster Operator on the OLM-enabled cluster.

operator-sdk generate bundle --input-dir deploy/index-image/ --output-dir _out/bundle
operator-sdk bundle validate _out/bundle  # optional
podman build -f deploy/index-image/bundle.Dockerfile -t $IMAGE_REGISTRY/$REGISTRY_NAMESPACE/hyperconverged-cluster-index:$IMAGE_TAG
podman push $IMAGE_REGISTRY/$REGISTRY_NAMESPACE/hyperconverged-cluster-index:$IMAGE_TAG
operator-sdk bundle validate $IMAGE_REGISTRY/$REGISTRY_NAMESPACE/hyperconverged-cluster-index:$IMAGE_TAG
operator-sdk run bundle $IMAGE_REGISTRY/$REGISTRY_NAMESPACE/hyperconverged-cluster-index:$IMAGE_TAG
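To confirm that OLM installed the operator (a sketch; the exact CSV name and namespace depend on the bundle and on where operator-sdk run bundle was pointed):

$ kubectl get csv -n kubevirt-hyperconverged
$ kubectl get pods -n kubevirt-hyperconverged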

Create an HCO CustomResource, which creates the component CRs, launching KubeVirt, CDI (Containerized Data Importer), the network addons, VM import, TTO (Tekton Tasks Operator) and the SSP (Scheduling, Scale and Performance) operator.

$ kubectl create -f deploy/hco.cr.yaml -n kubevirt-hyperconverged

Create a Cluster & Launch the HCO

  1. Choose the provider
# For a k8s cluster:
$ export KUBEVIRT_PROVIDER="k8s-1.17"
# For an okd cluster:
$ export KUBEVIRT_PROVIDER="okd-4.1"
  2. Navigate to the project's directory
$ cd <path>/hyperconverged-cluster-operator
  3. Remove an old cluster
$ make cluster-down
  4. Create a new cluster
$ make cluster-up
  5. Clean the previous HCO deployment and re-deploy HCO
    (When making a change, execute only this command - no need to repeat the previous steps)
$ make cluster-sync

Command-Line Tool

Use ./cluster/kubectl.sh as the command-line tool.

For example:

$ ./cluster/kubectl.sh get pods --all-namespaces

Deploying HCO on top of external provider

In order to use HCO on top of an external provider, e.g. CRC, use:

export KUBEVIRT_PROVIDER=external
export IMAGE_REGISTRY=<container image repository, such as quay.io, default: quay.io>
export REGISTRY_NAMESPACE=<your org under IMAGE_REGISTRY, i.e your_name if you use quay.io/your_name, default: kubevirt>
make cluster-sync

The oc binary should exist, and the cluster should be reachable via oc commands.
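Before running make cluster-sync against an external provider, it is worth confirming that oc can actually reach the cluster, for example:

$ oc whoami
$ oc get nodes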

hyperconverged-cluster-operator's People

Contributors

andreyod, arachmani, assafad, avlitman, dankenigsberg, dependabot[bot], dharmit, djzager, ffromani, github-actions[bot], hco-bot, jcanocan, jean-edouard, ksimon1, lmilbaum, lveyde, machadovilaca, mareklibra, maya-r, nunnatsa, orenc1, oshoval, phoracek, pkliczewski, platform-engineering-bot, rmohr, rwsu, slintes, tiraboschi, zcahana


hyperconverged-cluster-operator's Issues

docker build results in "panic: runtime error: invalid memory address or nil pointer dereference"

I attempted the "Launching the HCO through OLM" flow and hit upon this error at the docker build command:

[rwsu@localhost converged]$ docker build --no-cache -t docker.io/$HCO_DOCKER_ORG/hco-registry:example -f Dockerfile .
Sending build context to Docker daemon 138.2 kB
Step 1/5 : FROM quay.io/openshift/origin-operator-registry
Trying to pull repository quay.io/openshift/origin-operator-registry ... 
sha256:1f04ce4e147c4a25cc9bb8140aaa9fae869a1c5536f16659fb0ba70d2c95809f: Pulling from quay.io/openshift/origin-operator-registry
c2340472a0fa: Pull complete 
6e55351c18ff: Pull complete 
b2d2704dda6c: Pull complete 
a2c10be042b9: Pull complete 
2fb22ae1422d: Pull complete 
Digest: sha256:1f04ce4e147c4a25cc9bb8140aaa9fae869a1c5536f16659fb0ba70d2c95809f
Status: Downloaded newer image for quay.io/openshift/origin-operator-registry:latest
 ---> ef8b579a36cf
Step 2/5 : COPY olm-catalog /registry
 ---> 1bfa961b7e49
Removing intermediate container 084e7d059d2f
Step 3/5 : RUN initializer --manifests /registry --output bundles.db
 ---> Running in bfe375b7a6d5
time="2019-05-06T23:37:34Z" level=info msg="validating manifests" dir=/registry
validate /registry/kubevirt-hyperconverged/0.0.1/cdi-operator.crb.yaml
validate /registry/kubevirt-hyperconverged/0.0.1/cdi.crd.yaml
validate /registry/kubevirt-hyperconverged/0.0.1/cna-operator.crb.yaml
validate /registry/kubevirt-hyperconverged/0.0.1/cna.crd.yaml
validate /registry/kubevirt-hyperconverged/0.0.1/common-template-bundles.crd.yaml
validate /registry/kubevirt-hyperconverged/0.0.1/hco.crd.yaml
validate /registry/kubevirt-hyperconverged/0.0.1/kubevirt-hyperconverged-operator.v0.0.1.clusterserviceversion.yaml
validate /registry/kubevirt-hyperconverged/0.0.1/kubevirt.crd.yaml
validate /registry/kubevirt-hyperconverged/0.0.1/kwebui.crd.yaml
validate /registry/kubevirt-hyperconverged/0.0.1/node-labeller-bundles.crd.yaml
validate /registry/kubevirt-hyperconverged/0.0.1/node-maintenance.crd.yaml
validate /registry/kubevirt-hyperconverged/0.0.1/template-validator.crd.yaml
validate /registry/kubevirt-hyperconverged/kubevirt-hyperconverged.package.yaml
time="2019-05-06T23:37:34Z" level=info msg="loading Bundles" dir=/registry
time="2019-05-06T23:37:34Z" level=info msg=directory dir=/registry file=registry load=bundles
time="2019-05-06T23:37:34Z" level=info msg="found csv, loading bundle" dir=/registry file=bundles.db load=bundles
time="2019-05-06T23:37:34Z" level=info msg="could not decode contents of file /registry/bundles.db into package: error converting YAML to JSON: yaml: control characters are not allowed" dir=/registry file=bundles.db load=bundles
time="2019-05-06T23:37:34Z" level=info msg=directory dir=/registry file=kubevirt-hyperconverged load=bundles
time="2019-05-06T23:37:34Z" level=info msg=directory dir=/registry file=0.0.1 load=bundles
time="2019-05-06T23:37:34Z" level=info msg="found csv, loading bundle" dir=/registry file=cdi-operator.crb.yaml load=bundles
time="2019-05-06T23:37:34Z" level=info msg="found csv, loading bundle" dir=/registry file=cdi.crd.yaml load=bundles
time="2019-05-06T23:37:34Z" level=info msg="could not decode contents of file /registry/kubevirt-hyperconverged/0.0.1/cdi.crd.yaml into package: error unmarshaling JSON: v1alpha1 is not in dotted-tri format" dir=/registry file=cdi.crd.yaml load=bundles
time="2019-05-06T23:37:34Z" level=info msg="found csv, loading bundle" dir=/registry file=cna-operator.crb.yaml load=bundles
time="2019-05-06T23:37:34Z" level=info msg="found csv, loading bundle" dir=/registry file=cna.crd.yaml load=bundles
time="2019-05-06T23:37:34Z" level=info msg="could not decode contents of file /registry/kubevirt-hyperconverged/0.0.1/cna.crd.yaml into package: error unmarshaling JSON: v1alpha1 is not in dotted-tri format" dir=/registry file=cna.crd.yaml load=bundles
time="2019-05-06T23:37:34Z" level=info msg="found csv, loading bundle" dir=/registry file=common-template-bundles.crd.yaml load=bundles
time="2019-05-06T23:37:34Z" level=info msg="could not decode contents of file /registry/kubevirt-hyperconverged/0.0.1/common-template-bundles.crd.yaml into package: error unmarshaling JSON: v1 is not in dotted-tri format" dir=/registry file=common-template-bundles.crd.yaml load=bundles
time="2019-05-06T23:37:34Z" level=info msg="found csv, loading bundle" dir=/registry file=hco.crd.yaml load=bundles
time="2019-05-06T23:37:34Z" level=info msg="could not decode contents of file /registry/kubevirt-hyperconverged/0.0.1/hco.crd.yaml into package: error unmarshaling JSON: v1alpha1 is not in dotted-tri format" dir=/registry file=hco.crd.yaml load=bundles
time="2019-05-06T23:37:34Z" level=info msg="found csv, loading bundle" dir=/registry file=kubevirt-hyperconverged-operator.v0.0.1.clusterserviceversion.yaml load=bundles
time="2019-05-06T23:37:34Z" level=info msg="loading bundle file" dir=/registry file=cdi-operator.crb.yaml load=bundle
time="2019-05-06T23:37:34Z" level=info msg="found csv, loading bundle" dir=/registry file=cdi-operator.crb.yaml load=bundle
time="2019-05-06T23:37:34Z" level=info msg="could not decode contents of file /registry/kubevirt-hyperconverged/0.0.1/cdi-operator.crb.yaml into file: error unmarshaling JSON: Object 'Kind' is missing in 'null'" dir=/registry file=cdi-operator.crb.yaml load=bundle
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xc96165]

goroutine 1 [running]:
github.com/operator-framework/operator-registry/pkg/registry.(*Bundle).Size(...)
	/go/src/github.com/operator-framework/operator-registry/pkg/registry/bundle.go:69
github.com/operator-framework/operator-registry/pkg/sqlite.(*DirectoryLoader).LoadBundleWalkFunc(0xc4204c6160, 0xc42008e070, 0x6a, 0x10abfa0, 0xc420344270, 0x0, 0x0, 0xc42008e0e0, 0xc4203442a8)
	/go/src/github.com/operator-framework/operator-registry/pkg/sqlite/directory.go:113 +0x7d5
github.com/operator-framework/operator-registry/pkg/sqlite.(*DirectoryLoader).LoadBundleWalkFunc-fm(0xc42008e070, 0x6a, 0x10abfa0, 0xc420344270, 0x0, 0x0, 0x42, 0xc4205d1858)
	/go/src/github.com/operator-framework/operator-registry/pkg/sqlite/directory.go:50 +0x69
path/filepath.walk(0xc42008e070, 0x6a, 0x10abfa0, 0xc420344270, 0xc4205d1c00, 0x0, 0x0)
	/usr/local/go/src/path/filepath/path.go:357 +0x402
path/filepath.walk(0xc420037e60, 0x27, 0x10abfa0, 0xc420279450, 0xc4204f1c00, 0x0, 0x0)
	/usr/local/go/src/path/filepath/path.go:381 +0x2c2
path/filepath.walk(0xc420037d40, 0x21, 0x10abfa0, 0xc4202792b0, 0xc4204f1c00, 0x0, 0x0)
	/usr/local/go/src/path/filepath/path.go:381 +0x2c2
path/filepath.walk(0x7ffee6fbdedb, 0x9, 0x10abfa0, 0xc420279040, 0xc4204f1c00, 0x0, 0x0)
	/usr/local/go/src/path/filepath/path.go:381 +0x2c2
path/filepath.Walk(0x7ffee6fbdedb, 0x9, 0xc420541c00, 0x1, 0xc420101800)
	/usr/local/go/src/path/filepath/path.go:403 +0x106
github.com/operator-framework/operator-registry/pkg/sqlite.(*DirectoryLoader).Populate(0xc4204c6160, 0xc4204c6160, 0xc420106238)
	/go/src/github.com/operator-framework/operator-registry/pkg/sqlite/directory.go:50 +0x16e
main.runCmdFunc(0xc4200e1680, 0xc4204f2700, 0x0, 0x4, 0x0, 0x0)
	/go/src/github.com/operator-framework/operator-registry/cmd/initializer/main.go:54 +0x1e6
github.com/operator-framework/operator-registry/vendor/github.com/spf13/cobra.(*Command).execute(0xc4200e1680, 0xc420030150, 0x4, 0x4, 0xc4200e1680, 0xc420030150)
	/go/src/github.com/operator-framework/operator-registry/vendor/github.com/spf13/cobra/command.go:762 +0x468
github.com/operator-framework/operator-registry/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc4200e1680, 0xff988e, 0x35, 0xc4203d9550)
	/go/src/github.com/operator-framework/operator-registry/vendor/github.com/spf13/cobra/command.go:852 +0x30a
github.com/operator-framework/operator-registry/vendor/github.com/spf13/cobra.(*Command).Execute(0xc4200e1680, 0xfca11a, 0x5)
	/go/src/github.com/operator-framework/operator-registry/vendor/github.com/spf13/cobra/command.go:800 +0x2b
main.main()
	/go/src/github.com/operator-framework/operator-registry/cmd/initializer/main.go:32 +0x22b
The command '/bin/sh -c initializer --manifests /registry --output bundles.db' returned a non-zero code: 2

Component CR reconciliation

The HCO expects to own the lifecycle of all component CRs and nothing else. It should handle these scenarios:

  • Create a CR that doesn't exist
  • Smash a modified CR back to the HCO's configuration
  • Replace an already existing CR with an HCO-owned CR (owner reference)
  • Prevent component operator CR creation

No resources created for network-addons using HCO on minikube

Steps followed using the info at https://github.com/kubevirt/hyperconverged-cluster-operator
Minikube version:

root@shegde$ minikube version
minikube version: v0.35.0

After creating the HCO, I'm observing the resources below:

~/minikube/hco-network-operator/hyperconverged-cluster-operator  root@shegde$ kubectl get networkaddonsconfig --all-namespaces
No resources found.
~/minikube/hco-network-operator/hyperconverged-cluster-operator  root@shegde$ kubectl get kubevirt --all-namespaces
NAMESPACE   NAME                              AGE   PHASE
kubevirt    kubevirt-hyperconverged-cluster   23h   Deployed
~/minikube/hco-network-operator/hyperconverged-cluster-operator  root@shegde$ kubectl get cdi --all-namespaces
NAME                         AGE
cdi-hyperconverged-cluster   23h

The expected behaviour is for pods for multus and linux-bridge to be created, but they are missing.

~/minikube/hco-network-operator/hyperconverged-cluster-operator  root@shegde$ kubectl get pods --all-namespaces
NAMESPACE                         NAME                                               READY   STATUS    RESTARTS   AGE
cdi                               cdi-apiserver-79b8756b98-mlz28                     1/1     Running   0          24h
cdi                               cdi-deployment-55d9f74857-kw55b                    1/1     Running   0          24h
cdi                               cdi-operator-7887f66fb6-cdq5x                      1/1     Running   0          24h
cdi                               cdi-uploadproxy-c6bbbc648-c9xkl                    1/1     Running   0          24h
cluster-network-addons-operator   cluster-network-addons-operator-6578659dfd-pnn57   1/1     Running   0          24h
kube-system                       coredns-86c58d9df4-5sllw                           1/1     Running   0          24h
kube-system                       coredns-86c58d9df4-qvzvp                           1/1     Running   0          24h
kube-system                       etcd-minikube                                      1/1     Running   0          24h
kube-system                       kube-addon-manager-minikube                        1/1     Running   0          24h
kube-system                       kube-apiserver-minikube                            1/1     Running   0          24h
kube-system                       kube-controller-manager-minikube                   1/1     Running   0          24h
kube-system                       kube-proxy-rl9zv                                   1/1     Running   0          24h
kube-system                       kube-scheduler-minikube                            1/1     Running   0          24h
kube-system                       storage-provisioner                                1/1     Running   0          24h
kubevirt                          hyperconverged-cluster-operator-57b4bc9c5f-njm26   1/1     Running   6          24h
kubevirt                          virt-api-649859444c-9x7jh                          1/1     Running   1          24h
kubevirt                          virt-api-649859444c-mqjq9                          1/1     Running   0          24h
kubevirt                          virt-controller-7f49b8f77c-8kdct                   1/1     Running   0          24h
kubevirt                          virt-controller-7f49b8f77c-mgp6r                   1/1     Running   0          24h
kubevirt                          virt-handler-wpdfr                                 1/1     Running   0          24h
kubevirt                          virt-operator-6c5db798d4-t2v5w                     1/1     Running   0          24h

I'll be happy to provide any details if needed.

kubevirt-hyperconverged project stuck at Terminating state in OCP 3.11

When trying to delete the custom resource definition named cdis.cdi.kubevirt.io, the process hangs.

Environment
Openshift 3.11

[root@shedge-master0 ~]# oc version
oc v3.11.104
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://shedge-master0.cnv-comm.10.8.120.140.nip.io:8443
openshift v3.11.104
kubernetes v1.11.0+d4cacc0

I am able to manually delete the resources, namely pods, services, roles and service accounts.
But when trying to delete the CRD named cdis.cdi.kubevirt.io, the process just hangs. I tried to delete the project with oc delete project kubevirt-hyperconverged, but it just goes to the Terminating state and I cannot re-deploy other resources again.

root@shedge-master0 ~]# oc get projects
NAME                                DISPLAY NAME   STATUS
cluster-network-addons-operator                    Active
default                                            Active
glusterfs                                          Active
kube-public                                        Active
kube-service-catalog                               Active
kube-system                                        Active
kubevirt-hyperconverged                            Terminating

[root@shedge-master0 ~]# oc get all
No resources found.
[root@shedge-master0 ~]# oc project
Using project "kubevirt-hyperconverged" on server "https://shedge-master0.cnv-comm.10.8.120.140.nip.io:8443"

[root@shedge-master0 ~]# oc get all
No resources found.
[root@shedge-master0 ~]# oc project
Using project "kubevirt-hyperconverged" on server "https://shedge-master0.cnv-comm.10.8.120.140.nip.io:8443".
[root@shedge-master0 ~]# oc logs crds/cdis.cdi.kubevirt.io
error: no kind "CustomResourceDefinition" is registered for version "apiextensions.k8s.io/v1beta1" in scheme "k8s.io/kubernetes/pkg/kubectl/scheme/scheme.go:28"
[root@shedge-master0 ~]# oc get crds
NAME                                  CREATED AT
bundlebindings.automationbroker.io    2019-04-11T20:45:07Z
bundleinstances.automationbroker.io   2019-04-11T20:45:10Z
bundles.automationbroker.io           2019-04-11T20:45:13Z
cdis.cdi.kubevirt.io                  2019-04-12T14:51:12Z
[root@shedge-master0 ~]# 

NETWORK_ADDONS_CONTAINER_REGISTRY parameter doesn't take effect

Using the following config file:

cat > hack/config << __EOF__
#!/bin/bash

WAIT_TIMEOUT="300s"

KUBEVIRT_VERSION="${KUBEVIRT_VERSION}"
CDI_VERSION="${CDI_VERSION}"
NETWORK_ADDONS_VERSION="${NETWORK_ADDONS_VERSION}"

CONTAINER_REGISTRY="brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888"
CONTAINER_PREFIX="container-native-virtualization"

CDI_CONTAINER_REGISTRY="\${CONTAINER_REGISTRY}/\${CONTAINER_PREFIX}"
KUBEVIRT_CONTAINER_REGISTRY="\${CONTAINER_REGISTRY}/\${CONTAINER_PREFIX}"
NETWORK_ADDONS_CONTAINER_REGISTRY="\${CONTAINER_REGISTRY}/\${CONTAINER_PREFIX}"

CDI_OPERATOR_NAME="virt-cdi-operator"
__EOF__

Then, running the hack/operator-test.sh script, I can see that the network operator is taken from quay.io anyway:

[root@cnv-qe-03 ~]# oc get pods --all-namespaces -o yaml | grep -i image: | sort | uniq | grep cluster-network-addons
      image: quay.io/kubevirt/cluster-network-addons-operator:v0.3.0

/cc @phoracek

kubevirt-web-ui: Failed to pull image "quay.io/kubevirt/kubevirt-web-ui:v0.1.10": rpc error: code = Unknown desc = Error reading manifest v0.1.10 in quay.io/kubevirt/kubevirt-web-ui: manifest unknown: manifest unknown

Related to #143: while retrying the installation of the latest version of HCO from master on kubevirtci, I wanted to connect to the KubeVirt console and fetched the route:

$ oc get route console -n kubevirt-web-ui
NAME      HOST/PORT                                PATH   SERVICES   PORT    TERMINATION          WILDCARD
console   kubevirt-web-ui.apps.test-1.tt.testing          console    https   reencrypt/Redirect   None

Browsing to this produced the following output:

Application is not available
The application is currently not serving requests at this endpoint. It may not have been started or is still starting.
$ curl -v --insecure --HEAD https://kubevirt-web-ui.apps.test-1.tt.testing
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to kubevirt-web-ui.apps.test-1.tt.testing (127.0.0.1) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: CN=*.apps.test-1.tt.testing
*  start date: Jun 23 12:01:30 2019 GMT
*  expire date: Jun 22 12:01:31 2021 GMT
*  issuer: CN=ingress-operator@1561291288
*  SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
> HEAD / HTTP/1.1
> Host: kubevirt-web-ui.apps.test-1.tt.testing
> User-Agent: curl/7.64.0
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 503 Service Unavailable
HTTP/1.0 503 Service Unavailable
< Pragma: no-cache
Pragma: no-cache
< Cache-Control: private, max-age=0, no-cache, no-store
Cache-Control: private, max-age=0, no-cache, no-store
< Connection: close
Connection: close
< Content-Type: text/html
Content-Type: text/html

< 
* Excess found in a non pipelined read: excess = 3131 url = / (zero-length body)
* Closing connection 0

I got the following logs for kubevirt-web-ui console:

$ kubectl get pods -n kubevirt-web-ui                         
NAME                       READY   STATUS             RESTARTS   AGE
console-64544b6686-hgvgk   0/1     ImagePullBackOff   0          18m
console-64544b6686-lcjnx   0/1     ImagePullBackOff   0          18m
$ kubectl logs -n kubevirt-web-ui console-64544b6686-hgvgk
Error from server (BadRequest): container "console" in pod "console-64544b6686-hgvgk" is waiting to start: trying and failing to pull image

Fortunately the OKD console was up, so I went to https://console-openshift-console.apps.test-1.tt.testing/k8s/ns/kubevirt-web-ui/pods/console-64544b6686-hgvgk/events and there I found which image was the problem:

Failed to pull image "quay.io/kubevirt/kubevirt-web-ui:v0.1.10": rpc error: code = Unknown desc = Error reading manifest v0.1.10 in quay.io/kubevirt/kubevirt-web-ui: manifest unknown: manifest unknown

kubevirt-ssp-operator failed to load due to authentication or permission failure

After deploying the HCO, the following error appears in the kubevirt-ssp-operator log:

Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \"/tmp\".

oc log kubevirt-ssp-operator-56566664dd-p2dmp
kubevirt.io/rhel7.0: Red Hat Enterprise Linux 7.0\\n    name.os.template.kubevirt.io/rhel7.1: Red Hat Enterprise Linux 7.1\\n    name.os.template.kubevirt.io/rhel7.2: Red Hat Enterprise Linux 7.2\\n    name.os.template.kubevirt.io/rhel7.3: Red Hat Enterprise Linux 7.3\\n    name.os.template.kubevirt.io/rhel7.4: Red Hat Enterprise Linux 7.4\\n    name.os.template.kubevirt.io/rhel7.5: Red Hat Enterprise Linux 7.5\\n    name.os.template.kubevirt.io/rhel7.6: Red Hat Enterprise Linux 7.6\\n    openshift.io/display-name: Red Hat Enterprise Linux 7.0+ VM\\n    openshift.io/documentation-url: https://github.com/kubevirt/common-templates\\n    openshift.io/provider-display-name: KubeVirt\\n    openshift.io/support-url: https://github.com/kubevirt/common-templates/issues\\n    tags: kubevirt,virtualmachine,linux,rhel\\n    template.kubevirt.io/editable: '/objects[0].spec.template.spec.domain.cpu.sockets\\n\\n      /objects[0].spec.template.spec.domain.cpu.cores\\n\\n      /objects[0].spec.template.spec.domain.cpu.threads\\n\\n      /objects[0].spec.template.spec.domain.resources.requests.memory\\n\\n      /objects[0].spec.template.spec.domain.devices.disks\\n\\n      /objects[0].spec.template.spec.volumes\\n\\n      /objects[0].spec.template.spec.networks\\n\\n      '\\n    template.kubevirt.io/version: v1alpha1\\n    template.openshift.io/bindable: 'false'\\n  labels:\\n    flavor.template.kubevirt.io/small: 'true'\\n    os.template.kubevirt.io/rhel7.0: 'true'\\n    os.template.kubevirt.io/rhel7.1: 'true'\\n    os.template.kubevirt.io/rhel7.2: 'true'\\n    os.template.kubevirt.io/rhel7.3: 'true'\\n    os.template.kubevirt.io/rhel7.4: 'true'\\n    os.template.kubevirt.io/rhel7.5: 'true'\\n    os.template.kubevirt.io/rhel7.6: 'true'\\n    template.kubevirt.io/type: base\\n    workload.template.kubevirt.io/server: 'true'\\n  name: rhel7-server-small\\nobjects:\\n- apiVersion: kubevirt.io/v1alpha3\\n  kind: VirtualMachine\\n  metadata:\\n    labels:\\n      app: ${NAME}\\n      vm.kubevirt.io/template: rhel7-server-small\\n    name: ${NAME}\\n  spec:\\n    running: false\\n    template:\\n      metadata:\\n        labels:\\n          kubevirt.io/domain: ${NAME}\\n          kubevirt.io/size: small\\n      spec:\\n        domain:\\n          cpu:\\n            cores: 1\\n            sockets: 1\\n            threads: 1\\n          devices:\\n            disks:\\n            - disk:\\n                bus: virtio\\n              name: rootdisk\\n            - disk:\\n                bus: virtio\\n              name: cloudinitdisk\\n            interfaces:\\n            - bridge: {}\\n              name: default\\n            rng: {}\\n          resources:\\n            requests:\\n              memory: 2G\\n        networks:\\n        - name: default\\n          pod: {}\\n        terminationGracePeriodSeconds: 0\\n        volumes:\\n        - name: rootdisk\\n          persistentVolumeClaim:\\n            claimName: ${PVCNAME}\\n        - cloudInitNoCloud:\\n            userData: '#cloud-config\\n\\n              password: redhat\\n\\n              chpasswd: { expire: False }'\\n          name: cloudinitdisk\\nparameters:\\n- description: VM name\\n  from: rhel7-[a-z0-9]{16}\\n  generate: expression\\n  name: NAME\\n- description: Name of the PVC with the disk image\\n  name: PVCNAME\\n  required: true\", \"msg\": \"Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. 
Consider changing the remote tmp path in ansible.cfg to a path rooted in \\\"/tmp\\\". Failed command was: ( umask 77 && mkdir -p \\\"` echo /.ansible/tmp/ansible-tmp-1563095866.35-35005751701263 `\\\" && echo ansible-tmp-1563095866.35-35005751701263=\\\"` echo /.ansible/tmp/ansible-tmp-1563095866.35-35005751701263 `\\\" ), exited with result 1\", \"unreachable\": true}, {\"_ansible_ignore_errors\": null, \"_ansible_item_label\": \"apiVersion: template.openshift.io/v1\\nkind: templates\\nmetadata:\\n  annotations:\\n    defaults.template.kubevirt.io/disk: rootdisk\\n    description: This template can be used to create a VM suitable for Red Hat Enterprise\\n      Linux 7 and newer. The template assumes that a PVC is available which is providing\\n      the necessary RHEL disk image.\\n    iconClass: icon-rhel\\n

hco pod is running, but logs are spammed by cdi errors

The HCO deployment went without a hiccup, but after 10-15 minutes of the cluster running, CDI lost most of its pods; only cdi-operator is running. So I went to check the HCO pod: it is running, but its logs are spammed by:

{"level":"error","ts":1553704183.7050228,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"hyperconverged-controller","request":"kubevirt/hyperconverged-cluster","error":"cdis.cdi.kubevirt.io "cdi-hyperconverged-cluster" already exists","stacktrace":"github.com/kubevirt/hyperconverged-cluster-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/rhallisey/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/rhallisey/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/home/rhallisey/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/home/rhallisey/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/home/rhallisey/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/home/rhallisey/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

Cannot customize the kubevirt additional parameters to integrate with prometheus

The KubeVirt created by the operator cannot be configured to point at a non-default namespace/service account when integrating with the Prometheus operator. KubeVirt allows for such customizations via "AdditionalParameters", as documented here.

For my deployment, I followed this after previously deploying the prometheus operator.

This causes a failure of the KubeVirt and prevents it from deploying:

- lastProbeTime: "2019-10-18T15:48:03Z"
      lastTransitionTime: "2019-10-18T15:48:03Z"
      message: 'An error occurred during deployment: unable to create serviceMonitor
        &{TypeMeta:{Kind:ServiceMonitor APIVersion:monitoring.coreos.com} ObjectMeta:{Name:kubevirt
        GenerateName: Namespace:openshift-monitoring SelfLink: UID: ResourceVersion:
        Generation:0 CreationTimestamp:0001-01-01 00:00:00 +0000 UTC DeletionTimestamp:<nil>
        DeletionGracePeriodSeconds:<nil> Labels:map[app.kubernetes.io/managed-by:kubevirt-operator
        k8s-app:kubevirt openshift.io/cluster-monitoring: prometheus.kubevirt.io:]
        Annotations:map[kubevirt.io/install-strategy-version:v0.20.8 kubevirt.io/install-strategy-registry:[internal-docker-mirror]/kubevirt
        kubevirt.io/install-strategy-identifier:edec3533003a9397eaa349472c1208328b7f822d]
        OwnerReferences:[{APIVersion:kubevirt.io/v1alpha3 Kind:KubeVirt Name:kubevirt-hyperconverged-cluster
        UID:463b8240-c0d6-4f5c-861f-2caa72353254 Controller:0xc00121f588 BlockOwnerDeletion:0xc00121f587}]
        Initializers:nil Finalizers:[] ClusterName:} Spec:{JobLabel: TargetLabels:[]
        PodTargetLabels:[] Endpoints:[{Port:metrics TargetPort:<nil> Path: Scheme:https
        Params:map[] Interval: ScrapeTimeout: TLSConfig:0xc00144a410 BearerTokenFile:
        HonorLabels:false BasicAuth:<nil> MetricRelabelConfigs:[] RelabelConfigs:[]
        ProxyURL:<nil>}] Selector:{MatchLabels:map[prometheus.kubevirt.io:] MatchExpressions:[]}
        NamespaceSelector:{Any:false MatchNames:[kubevirt-hyperconverged]} SampleLimit:0}}:
        namespaces "openshift-monitoring" not found'
      reason: DeploymentFailed
      status: "False"
      type: Synchronized

The problem here is clearly related to not being able to point the kubevirt at the namespace/service account that's actually being used by the prometheus operator. openshift-monitoring is the default value.

I found the documentation here on configuration not particularly helpful, so I dug into the code.

The KubeVirt object seems to be defined here, but there does not appear to be a way to pass the needed AdditionalParameters through.

An extremely gross workaround would be to move our Prometheus operator and rename our service accounts to match KubeVirt's defaults, but it would be much more friendly if this could just be configured.

Labeller pods are not created

After running the attached deploy.sh script, there are no node labeller pods:

$ oc get pods
NAME                                              READY   STATUS    RESTARTS   AGE
cdi-apiserver-799b86cd47-blq9l                    1/1     Running   0          3h55m
cdi-deployment-67855b764d-lhc47                   1/1     Running   0          3h55m
cdi-operator-86cfbc4f55-hzv98                     1/1     Running   0          3h55m
cdi-uploadproxy-7cd5bdb789-9jlfr                  1/1     Running   0          3h55m
cluster-network-addons-operator-c78d4f6fc-m86r7   1/1     Running   0          3h55m
hyperconverged-cluster-operator-d6c986564-p6lqd   1/1     Running   0          3h55m
kubevirt-ssp-operator-c4cd5f564-lp8mj             1/1     Running   0          3h55m
kubevirt-web-ui-operator-59f778956-jkvlr          1/1     Running   0          3h55m
node-maintenance-operator-6b6d5756d8-gbnw7        1/1     Running   0          3h55m
virt-operator-8fd6c4c4c-hj6x2                     1/1     Running   0          3h55m
virt-operator-8fd6c4c4c-ts58n                     1/1     Running   0          3h55m

$ virtctl version
Client Version: version.Info{GitVersion:"v0.17.4", GitCommit:"adfdb8c07830b99fc79d2fd1d004e862ef70979e", GitTreeState:"clean", BuildDate:"2019-06-27T17:03:10Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

Don't hardcode namespaces for component CRs

Kubevirt and CDI CRs are created in the 'kubevirt' and 'cdi' namespaces respectively. Instead, create them in the current namespace. Also, update the README to reflect this.

This issue depends on the unified CSV file.

Move HCO CustomResourceDefinition to be globally scoped

We're currently using a namespaced CRD while managing CRDs that are globally scoped. Kubernetes 1.14 allows namespaced CRDs to own global CRDs; however, it might be worth switching.

The advantages I see are:

  • No longer give a less privileged object ownership of a more privileged object
  • GETing the hco looks more like a singleton

I saw this from controller-runtime. The first request errors without a namespace, then re-queues with the namespace.

{"level":"error","ts":1560885589.9304435,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"hyperconverged-controller","request":"/hyperconverged-cluster-operator","error":"an empty namesp\
ace may not be set during creation","stacktrace":"github.com/kubevirt/hyperconverged-cluster-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-oper\
ator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/gi\
thub.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controll\
er-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithu\
b.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait/\
wait.go:133\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pk\
g/util/wait/wait.go:134\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachin\
ery/pkg/util/wait/wait.go:88"}
{"level":"info","ts":1560885590.9308822,"logger":"controller_hyperconverged","msg":"Reconciling HyperConverged operator","Request.Namespace":"kubevirt-hyperconverged","Request.Name":"hyperconverged-cluster-operator"}

error validating "deploy/converged/cluster_role.yaml"

Following the steps in the README, I came across this other error when doing "kubectl create -f deploy/converged"

error: error validating "deploy/converged/cluster_role.yaml": error validating data: ValidationError(ClusterRole.rules[9]): unknown field "resourceName" in io.k8s.api.rbac.v1.PolicyRule; if you choose to ignore these errors, turn validation off with --validate=false

HCO deploy fails with k8s provider

HCO deploy fails (Ready 0/1) when using k8s provider with the following error:

{"level":"error","ts":1565535648.8084443,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"hyperconverged-controller","request":"kubevirt-hyperconverged/hyperconverged-cluster","error":"namespaces "openshift" not found","stacktrace":"github.com/kubevirt/hyperconverged-cluster-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

Provide a way to get all the container sha sums/tags in a registry container

The CSV file is tucked away in a container and it's hard to understand exactly what the contents are. A nice way to get the contents would be for the HCO to add a file to the registry container with a list of key: value pairs of operators and their tags/SHA sums. That way, users could figure out exactly the content of the registry container by running something like docker run hco-registry cat /csv-content.

The CSV 0.0.1 uses "latest" floating tag

Looking at the 0.0.1 CSV, we see that the operator image tag is "latest":
https://github.com/kubevirt/hyperconverged-cluster-operator/blob/master/deploy/olm-catalog/kubevirt-hyperconverged/0.0.1/kubevirt-hyperconverged-operator.v0.0.1.clusterserviceversion.yaml#L73

This means that if we try to deploy v0.0.1 (e.g. to test the upgrade flow) we will actually get a 0.0.2 image. To run, this image will need additional CRDs, like the machine remediation operator and metrics aggregation, which are 0.0.2 features.

The suggested fix is to push a 0.0.1 image (stream) and fix the CSV accordingly.
I see a v0.0.1 image published on quay.io (https://quay.io/repository/kubevirt/hyperconverged-cluster-operator?tab=tags), but I'm not sure it is good.

Error creating network add on pod

Steps followed and environment used

Openshift

Using Openshift 3.11

[root@shedge-master0 hyperconverged-cluster-operator]# oc version
oc v3.11.104
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://shedge-master0.cnv-comm.10.8.120.140.nip.io:8443
openshift v3.11.104
kubernetes v1.11.0+d4cacc0

After deploying all the CRDs and the CR for HCO, the network add-on pod fails:

[root@shedge-master0 hyperconverged-cluster-operator]# oc get all
NAME                                                   READY     STATUS             RESTARTS   AGE
pod/cdi-apiserver-64d8d44d4-cj7wq                      1/1       Running            0          1h
pod/cdi-deployment-6b997bddf6-lhm68                    1/1       Running            0          1h
pod/cdi-operator-65767d954d-v9tlk                      1/1       Running            0          1h
pod/cdi-uploadproxy-848f576f98-s4mjz                   1/1       Running            0          1h
pod/cluster-network-addons-operator-5f4d4854bd-k2b75   0/1       CrashLoopBackOff   20         1h
pod/hyperconverged-cluster-operator-6b7ccc5df8-wzzdx   1/1       Running            0          1h
pod/kubevirt-hyperconverged-cluster-jobfkgxl-xb9l5     0/1       ImagePullBackOff   0          1h
pod/virt-operator-86b75548c-gdcht                      1/1       Running            0          1h

NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/cdi-api           ClusterIP   172.30.52.160    <none>        443/TCP   1h
service/cdi-uploadproxy   ClusterIP   172.30.188.198   <none>        443/TCP   1h

NAME                                              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cdi-apiserver                     1         1         1            1           1h
deployment.apps/cdi-deployment                    1         1         1            1           1h
deployment.apps/cdi-operator                      1         1         1            1           1h
deployment.apps/cdi-uploadproxy                   1         1         1            1           1h
deployment.apps/cluster-network-addons-operator   1         1         1            0           1h
deployment.apps/hyperconverged-cluster-operator   1         1         1            1           1h
deployment.apps/virt-operator                     1         1         1            1           1h

NAME                                                         DESIRED   CURRENT   READY     AGE
replicaset.apps/cdi-apiserver-64d8d44d4                      1         1         1         1h
replicaset.apps/cdi-deployment-6b997bddf6                    1         1         1         1h
replicaset.apps/cdi-operator-65767d954d                      1         1         1         1h
replicaset.apps/cdi-uploadproxy-848f576f98                   1         1         1         1h
replicaset.apps/cluster-network-addons-operator-5f4d4854bd   1         1         0         1h
replicaset.apps/hyperconverged-cluster-operator-6b7ccc5df8   1         1         1         1h
replicaset.apps/virt-operator-86b75548c                      1         1         1         1h

NAME                                                 DESIRED   SUCCESSFUL   AGE
job.batch/kubevirt-hyperconverged-cluster-jobfkgxl   1         0            1h

Logs for the network add-on pod are below:

[root@shedge-master0 hyperconverged-cluster-operator]# oc logs pod/cluster-network-addons-operator-5f4d4854bd-k2b75
2019/04/12 16:10:27 Go Version: go1.10.8
2019/04/12 16:10:27 Go OS/Arch: linux/amd64
2019/04/12 16:10:27 version of operator-sdk: v0.5.0+git
2019/04/12 16:10:27 registering Components
2019/04/12 16:10:27 failed setting up operator controllers: environment variable OPERATOR_NAMESPACE has to be set

There was no mention of the variable OPERATOR_NAMESPACE anywhere in the README.md.

KubeVirt Web UI Operator fails to deploy

I've tried to install HCO on OpenShift 4.2 (26c5e504fc48e8426750e3d3e4f46aa9b91b01e6). The install completed without errors, but the web UI does not show up.

$ kubectl -n kubevirt-hyperconverged logs kubevirt-web-ui-operator-59f778956-dxslz | grep '"level":"error"' | jq .
{
  "level": "error",
  "ts": 1561563230.7800148,
  "logger": "controller_kwebui",
  "caller": "kwebui/kwebui_controller.go:109",
  "msg": "Looking for the console Deployment object",
  "Request.Namespace": "",
  "Request.Name": "kubevirt-web-ui-hyperconverged-cluster",
  "error": "Deployment.extensions \"console\" not found",
  "stacktrace": "github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/kubevirt/web-ui-operator/pkg/controller/kwebui.(*ReconcileKWebUI).Reconcile\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/kwebui_controller.go:109\ngithub.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:207\ngithub.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:157\ngithub.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"
}
{
  "level": "error",
  "ts": 1561563230.7912967,
  "logger": "controller_kwebui",
  "caller": "kwebui/provision.go:280",
  "msg": "Failed to update KWebUI status. Intended to write phase: 'PROVISION_STARTED', message: Target version: latest",
  "error": "Operation cannot be fulfilled on kwebuis.kubevirt.io \"kubevirt-web-ui-hyperconverged-cluster\": the object has been modified; please apply your changes to the latest version and
 try again",
  "stacktrace": "github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/za
pr/zapr.go:128\ngithub.com/kubevirt/web-ui-operator/pkg/controller/kwebui.updateStatus\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/provision.go:280\ngith
ub.com/kubevirt/web-ui-operator/pkg/controller/kwebui.freshProvision\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/provision.go:110\ngithub.com/kubevirt/we
b-ui-operator/pkg/controller/kwebui.(*ReconcileKWebUI).Reconcile\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/kwebui_controller.go:111\ngithub.com/kubevir
t/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/sigs.k8s.i
o/controller-runtime/pkg/internal/controller/controller.go:207\ngithub.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\
t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:157\ngithub.com/kubevirt/web-ui-operator/vendor/k8s.io/ap
imachinery/pkg/util/wait.JitterUntil.func1\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/kubevirt/web-ui-operato
r/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/kubevirt/we
b-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"
}
{
  "level": "error",
  "ts": 1561563230.9976366,
  "logger": "controller_kwebui",
  "caller": "kwebui/helper.go:29",
  "msg": "stdout read error",
  "error": "read |0: file already closed",
  "stacktrace": "github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/za
pr/zapr.go:128\ngithub.com/kubevirt/web-ui-operator/pkg/controller/kwebui.pipeToLog\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/helper.go:29"
}
{
  "level": "error",
  "ts": 1561563230.9977138,
  "msg": "stdout read error",
  "error": "read |0: file already closed",
  "stacktrace": "github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/za
pr/zapr.go:128\ngithub.com/kubevirt/web-ui-operator/pkg/controller/kwebui.pipeToLog\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/helper.go:29"
}
{
  "level": "error",
  "ts": 1561563231.1799755,
  "logger": "controller_kwebui",
  "caller": "kwebui/helper.go:51",
  "msg": "Execution failed (wait): oc project kubevirt-web-ui",
  "error": "exit status 1",
  "stacktrace": "github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/za
pr/zapr.go:128\ngithub.com/kubevirt/web-ui-operator/pkg/controller/kwebui.RunCommand\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/helper.go:51\ngithub.com
/kubevirt/web-ui-operator/pkg/controller/kwebui.loginClient\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/provision.go:161\ngithub.com/kubevirt/web-ui-oper
ator/pkg/controller/kwebui.runPlaybookWithSetup\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/provision.go:85\ngithub.com/kubevirt/web-ui-operator/pkg/cont
roller/kwebui.freshProvision\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/provision.go:111\ngithub.com/kubevirt/web-ui-operator/pkg/controller/kwebui.(*Re
concileKWebUI).Reconcile\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/kwebui_controller.go:111\ngithub.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/con
troller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/contro
ller/controller.go:207\ngithub.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/home/mlibra/go/src/github.com/kubevirt
/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:157\ngithub.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.fun
c1\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wa
it.JitterUntil\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery
/pkg/util/wait.Until\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"
}
{
  "level": "error",
  "ts": 1561563231.1800418,
  "logger": "controller_kwebui",
  "caller": "kwebui/provision.go:163",
  "msg": "Failed to switch to the project. Trying to create it.",
  "Namespace": "kubevirt-web-ui",
  "error": "exit status 1",
  "stacktrace": "github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/za
pr/zapr.go:128\ngithub.com/kubevirt/web-ui-operator/pkg/controller/kwebui.loginClient\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/provision.go:163\ngithu
b.com/kubevirt/web-ui-operator/pkg/controller/kwebui.runPlaybookWithSetup\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/provision.go:85\ngithub.com/kubevir
t/web-ui-operator/pkg/controller/kwebui.freshProvision\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/provision.go:111\ngithub.com/kubevirt/web-ui-operator/
pkg/controller/kwebui.(*ReconcileKWebUI).Reconcile\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/kwebui_controller.go:111\ngithub.com/kubevirt/web-ui-opera
tor/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-r
untime/pkg/internal/controller/controller.go:207\ngithub.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/home/mlibra/
go/src/github.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:157\ngithub.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg
/util/wait.JitterUntil.func1\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/kubevirt/web-ui-operator/vendor/k8s.i
o/apimachinery/pkg/util/wait.JitterUntil\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/kubevirt/web-ui-operator/
vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"
}
{
  "level": "error",
  "ts": 1561563353.1235096,
  "logger": "controller_kwebui",
  "caller": "kwebui/helper.go:76",
  "msg": "Failed to remove file: ",
  "error": "remove : no such file or directory",
  "stacktrace": "github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/github.com/go-logr/za
pr/zapr.go:128\ngithub.com/kubevirt/web-ui-operator/pkg/controller/kwebui.RemoveFile\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/helper.go:76\ngithub.com
/kubevirt/web-ui-operator/pkg/controller/kwebui.runPlaybookWithSetup\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/provision.go:98\ngithub.com/kubevirt/web
-ui-operator/pkg/controller/kwebui.freshProvision\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/provision.go:111\ngithub.com/kubevirt/web-ui-operator/pkg/c
ontroller/kwebui.(*ReconcileKWebUI).Reconcile\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/pkg/controller/kwebui/kwebui_controller.go:111\ngithub.com/kubevirt/web-ui-operator/v
endor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtim
e/pkg/internal/controller/controller.go:207\ngithub.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/home/mlibra/go/sr
c/github.com/kubevirt/web-ui-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:157\ngithub.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util
/wait.JitterUntil.func1\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/kubevirt/web-ui-operator/vendor/k8s.io/api
machinery/pkg/util/wait.JitterUntil\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/kubevirt/web-ui-operator/vendo
r/k8s.io/apimachinery/pkg/util/wait.Until\n\t/home/mlibra/go/src/github.com/kubevirt/web-ui-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"
}
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"archive", BuildDate:"2019-05-03T09:51:06Z", GoVersion:"go1.12.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.0+85c4140", GitCommit:"85c4140", GitTreeState:"clean", BuildDate:"2019-06-26T02:49:06Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}

Configuration in the HCO

We want the HCO to handle configuration for all component operators. Provide a mechanism/library/interface that each component operator can use to expose its configuration; a minimal sketch of one possible interface follows the list below.

  • Create a standard for how component operators expose config
    • How should we default values?
    • What should be exposed to the user?
  • Create some code library/mechanism to expose config
    • Should configuration options be vendored from components?
    • Should options be centralized?
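
One possible shape for such a library, sketched in Go. The ComponentConfig interface, the Option type, and the registry below are illustrative assumptions, not an existing HCO API:

// Sketch only: a hypothetical interface each component operator could implement
// so the HCO can collect, default, and surface configuration in one place.
package config

// Option describes a single exposed configuration knob.
type Option struct {
	Name        string // e.g. "featureGates"
	Default     string // value used when the user sets nothing
	Description string // text surfaced to the user (docs, CSV, etc.)
}

// ComponentConfig is what a component operator (KubeVirt, CDI, ...) would expose.
type ComponentConfig interface {
	// Name of the component, e.g. "kubevirt".
	Name() string
	// Options returns every option the component wants surfaced to the user.
	Options() []Option
}

// registry lets the HCO aggregate every component's options into its own CR.
var registry []ComponentConfig

// Register is called either from vendored component packages or from a central
// list inside the HCO (which of the two is the open question in the bullets above).
func Register(c ComponentConfig) { registry = append(registry, c) }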

network addon pod - CrashLoopBackOff

I followed the instructions given in the README.

Pod creation in the network addon step fails.

~/minikube/HCO_net_addon/hyperconverged-cluster-operator  root@shegde 🎩 minikube version
minikube version: v0.35.0

Getting the following error

~/minikube/HCO_net_addon/hyperconverged-cluster-operator  root@shegde 🎩 kubectl get all --namespace=kubevirt-hyperconverged
NAME                                                   READY   STATUS             RESTARTS   AGE
pod/cdi-apiserver-769fcc7bdf-dgjdc                     1/1     Running            0          6m14s
pod/cdi-deployment-8b64c5585-lgxgv                     1/1     Running            0          6m13s
pod/cdi-operator-c77447cc7-qdkt4                       1/1     Running            0          7m26s
pod/cdi-uploadproxy-8dcdcbff-nws8v                     1/1     Running            0          6m14s
pod/cluster-network-addons-operator-7ff5c6b47c-xmnjd   0/1     CrashLoopBackOff   6          7m26s
pod/hyperconverged-cluster-operator-75dd9c96f9-vv6rf   1/1     Running            0          7m26s
pod/kubevirt-hyperconverged-cluster-joblpxsh-s8b85     0/1     ErrImagePull       0          6m13s
pod/virt-operator-667b6c845d-prc6n                     1/1     Running            0          7m26s

NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/cdi-api           ClusterIP   10.107.87.97     <none>        443/TCP   6m14s
service/cdi-uploadproxy   ClusterIP   10.101.120.196   <none>        443/TCP   6m15s

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cdi-apiserver                     1/1     1            1           6m14s
deployment.apps/cdi-deployment                    1/1     1            1           6m14s
deployment.apps/cdi-operator                      1/1     1            1           7m26s
deployment.apps/cdi-uploadproxy                   1/1     1            1           6m14s
deployment.apps/cluster-network-addons-operator   0/1     1            0           7m26s
deployment.apps/hyperconverged-cluster-operator   1/1     1            1           7m26s
deployment.apps/virt-operator                     1/1     1            1           7m26s

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/cdi-apiserver-769fcc7bdf                     1         1         1       6m14s
replicaset.apps/cdi-deployment-8b64c5585                     1         1         1       6m14s
replicaset.apps/cdi-operator-c77447cc7                       1         1         1       7m26s
replicaset.apps/cdi-uploadproxy-8dcdcbff                     1         1         1       6m14s
replicaset.apps/cluster-network-addons-operator-7ff5c6b47c   1         1         0       7m26s
replicaset.apps/hyperconverged-cluster-operator-75dd9c96f9   1         1         1       7m26s
replicaset.apps/virt-operator-667b6c845d                     1         1         1       7m26s

NAME                                                 COMPLETIONS   DURATION   AGE
job.batch/kubevirt-hyperconverged-cluster-joblpxsh   0/1           6m13s      6m13s

Generate the NMO objects

The NMO manifest code is not vendored into the HCO.

Call out to the SSP code from manifest-templator:

https://github.com/kubevirt/hyperconverged-cluster-operator/blob/master/tools/manifest-templator/manifest-templator.go

Replace templates:

https://github.com/kubevirt/hyperconverged-cluster-operator/blob/master/templates/olm-catalog/kubevirt-hyperconverged/VERSION/kubevirt-hyperconverged-operator.VERSION.clusterserviceversion.yaml.in
https://github.com/kubevirt/hyperconverged-cluster-operator/blob/master/templates/cluster_role.yaml.in
https://github.com/kubevirt/hyperconverged-cluster-operator/blob/master/templates/operator.yaml.in

HCO fails on the first missing CRD

The HCO creates watchers on specific KubeVirt APIs. Those APIs need to exist before the watchers can be created, so the current behavior is for the HCO to fail if the APIs don't exist. I think the HCO can be a little more dynamic (see the sketch after this list):

  • create watchers for what's available
  • keep trying to create missing watchers
  • report in the logs and to OLM that an API is missing
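
A rough sketch of the "keep trying" idea in Go, using the Kubernetes discovery API. This is an illustration under assumptions, not HCO code: the addWatch callback, the polling interval, and the package layout are made up here.

package watchretry

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/discovery"
	"k8s.io/client-go/rest"
)

// kindAvailable reports whether the API server currently serves the given kind.
func kindAvailable(dc discovery.DiscoveryInterface, gvk schema.GroupVersionKind) bool {
	resources, err := dc.ServerResourcesForGroupVersion(gvk.GroupVersion().String())
	if err != nil {
		return false
	}
	for _, r := range resources.APIResources {
		if r.Kind == gvk.Kind {
			return true
		}
	}
	return false
}

// WatchWhenAvailable retries until the CRD shows up and only then adds the
// watch, instead of failing the whole operator on the first missing API.
func WatchWhenAvailable(cfg *rest.Config, gvk schema.GroupVersionKind, addWatch func(schema.GroupVersionKind) error) error {
	dc, err := discovery.NewDiscoveryClientForConfig(cfg)
	if err != nil {
		return err
	}
	for !kindAvailable(dc, gvk) {
		// Report the missing API (logs, conditions surfaced to OLM) and retry.
		fmt.Printf("API %s is not installed yet, retrying in 30s\n", gvk)
		time.Sleep(30 * time.Second)
	}
	return addWatch(gvk)
}

Running one such loop per optional kind in its own goroutine would let the available watchers start immediately while the missing ones keep retrying.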

Prow job is failing on kubevirt v0.20.4

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/kubevirt_hyperconverged-cluster-operator/274/pull-ci-kubevirt-hyperconverged-cluster-operator-master-hco-e2e-aws/267

 hyperconverged.hco.kubevirt.io/hyperconverged-cluster created
error: timed out waiting for the condition on pods/hyperconverged-cluster-operator-567748c945-625gj
apiVersion: v1
items:
- apiVersion: hco.kubevirt.io/v1alpha1
  kind: HyperConverged
  metadata:
    creationTimestamp: "2019-09-03T20:00:57Z"
    generation: 1
    name: hyperconverged-cluster
    namespace: kubevirt-hyperconverged
    resourceVersion: "22849"
    selfLink: /apis/hco.kubevirt.io/v1alpha1/namespaces/kubevirt-hyperconverged/hyperconvergeds/hyperconverged-cluster
    uid: 8b93009a-ce85-11e9-953b-1201d4d1a57c
  spec: {}
  status:
    conditions:
    - lastHeartbeatTime: "2019-09-03T20:07:28Z"
      lastTransitionTime: "2019-09-03T20:00:59Z"
      message: Reconcile completed successfully
      reason: ReconcileCompleted
      status: "True"
      type: ReconcileComplete
    - lastHeartbeatTime: "2019-09-03T20:07:28Z"
      lastTransitionTime: "2019-09-03T20:00:57Z"
      message: 'KubeVirt is not available: Deploying version v0.20.4 with registry
        kubevirt'
      reason: KubeVirtNotAvailable
      status: "False"
      type: Available
    - lastHeartbeatTime: "2019-09-03T20:07:28Z"
      lastTransitionTime: "2019-09-03T20:00:57Z"
      message: 'KubeVirt is progressing: Deploying version v0.20.4 with registry kubevirt'
      reason: KubeVirtProgressing
      status: "True"
      type: Progressing
    - lastHeartbeatTime: "2019-09-03T20:02:19Z"
      lastTransitionTime: "2019-09-03T20:01:01Z"
      message: 'CDI is degraded: '
      reason: CDIDegraded
      status: "True"
      type: Degraded
    - lastHeartbeatTime: "2019-09-03T20:07:28Z"
      lastTransitionTime: "2019-09-03T20:00:59Z"
      message: 'KubeVirt is progressing: Deploying version v0.20.4 with registry kubevirt'
      reason: KubeVirtProgressing
      status: "False"
      type: Upgradeable
    relatedObjects:
    - apiVersion: v1
      kind: ConfigMap
      name: kubevirt-config
      namespace: kubevirt-hyperconverged
      resourceVersion: "17025"
      uid: 8ba4a4fa-ce85-11e9-91f4-0a4673c489ea
    - apiVersion: v1
      kind: ConfigMap
      name: kubevirt-config-storage-class-defaults
      namespace: kubevirt-hyperconverged
      resourceVersion: "17026"
      uid: 8baf3a54-ce85-11e9-91f4-0a4673c489ea
    - apiVersion: kubevirt.io/v1alpha3
      kind: KubeVirt
      name: kubevirt-hyperconverged-cluster
      namespace: kubevirt-hyperconverged
      resourceVersion: "18712"
      uid: 8bb0143c-ce85-11e9-91f4-0a4673c489ea
    - apiVersion: cdi.kubevirt.io/v1alpha1
      kind: CDI
      name: cdi-hyperconverged-cluster
      resourceVersion: "19809"
      uid: 8bb18ac9-ce85-11e9-91f4-0a4673c489ea
    - apiVersion: networkaddonsoperator.network.kubevirt.io/v1alpha1
      kind: NetworkAddonsConfig
      name: cluster
      resourceVersion: "19841"
      uid: 8bb2c31a-ce85-11e9-91f4-0a4673c489ea
    - apiVersion: kubevirt.io/v1
      kind: KubevirtCommonTemplatesBundle
      name: common-templates-hyperconverged-cluster
      namespace: openshift
      resourceVersion: "17037"
      uid: 8bb3ded0-ce85-11e9-91f4-0a4673c489ea
    - apiVersion: kubevirt.io/v1
      kind: KubevirtNodeLabellerBundle
      name: node-labeller-hyperconverged-cluster
      namespace: kubevirt-hyperconverged
      resourceVersion: "22806"
      uid: 8bb6841b-ce85-11e9-91f4-0a4673c489ea
    - apiVersion: kubevirt.io/v1
      kind: KubevirtTemplateValidator
      name: template-validator-hyperconverged-cluster
      namespace: kubevirt-hyperconverged
      resourceVersion: "22793"
      uid: 8bb7c1c3-ce85-11e9-91f4-0a4673c489ea
    - apiVersion: kubevirt.io/v1
      kind: KubevirtMetricsAggregation
      name: metrics-aggregation-hyperconverged-cluster
      namespace: kubevirt-hyperconverged
      resourceVersion: "22271"
      uid: 8bb931df-ce85-11e9-91f4-0a4673c489ea
    - apiVersion: machineremediation.kubevirt.io/v1alpha1
      kind: MachineRemediationOperator
      name: mro-hyperconverged-cluster
      namespace: kubevirt-hyperconverged
      resourceVersion: "19723"
      uid: 8c45b2e1-ce85-11e9-91f4-0a4673c489ea
    - apiVersion: v1
      kind: ConfigMap
      name: v2v-vmware
      namespace: kubevirt-hyperconverged
      resourceVersion: "17147"
      uid: 8c4750ae-ce85-11e9-91f4-0a4673c489ea
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
NAME                                               READY   STATUS    RESTARTS   AGE
cdi-apiserver-5dcd644c88-q8tlp                     1/1     Running   0          6m30s
cdi-deployment-5588f8dffc-sdcvf                    1/1     Running   0          6m29s
cdi-operator-8bf7c7f8-swgd9                        1/1     Running   0          7m11s
cdi-uploadproxy-57db949d44-kf2fc                   1/1     Running   0          6m30s
cluster-network-addons-operator-58b94dc774-5kv2z   1/1     Running   0          7m11s
hyperconverged-cluster-operator-567748c945-625gj   0/1     Running   0          7m11s
kubevirt-node-labeller-2nfsg                       0/1     Pending   0          6m4s
kubevirt-node-labeller-q4g4n                       0/1     Pending   0          6m4s
kubevirt-node-labeller-thsc4                       0/1     Pending   0          6m4s
kubevirt-ssp-operator-6c898684c9-wb7qd             1/1     Running   0          7m11s
machine-remediation-operator-c6f7c7649-bbsmd       1/1     Running   0          7m11s
node-maintenance-operator-6c56547459-mvt65         1/1     Running   0          7m11s
virt-operator-8448785567-svbkp                     1/1     Running   0          7m11s
virt-operator-8448785567-zzs4z                     1/1     Running   0          7m11s
virt-template-validator-d9486995f-sm5wf            1/1     Running   0          6m4s
ReconcileComplete	True	Reconcile completed successfully
Available	False	KubeVirt is not available: Deploying version v0.20.4 with registry kubevirt
Progressing	True	KubeVirt is progressing: Deploying version v0.20.4 with registry kubevirt
Degraded	True	CDI is degraded: 
Upgradeable	False	KubeVirt is progressing: Deploying version v0.20.4 with registry kubevirt
make: *** [start] Error 1
2019/09/03 20:07:34 Container test in pod hco-e2e-aws failed, exit code 2, reason Error
2019/09/03 20:11:53 Copied 39.21Mi of artifacts from hco-e2e-aws to /logs/artifacts/hco-e2e-aws
2019/09/03 20:11:56 Ran for 47m33s
2019/09/03 20:11:56 Submitted failure event to sentry (id=9cb8696bfb194f49be90b7e8c59be561)
error: could not run steps: step hco-e2e-aws failed: template pod "hco-e2e-aws" failed: the pod ci-op-vn3l9m50/hco-e2e-aws failed after 42m9s (failed containers: test): ContainerFailed one or more containers exited
Container test exited with code 2, reason Error

Failing to Create the node labeller roles

Failing to Create the node labellers with HCO 0.18.1:

Server Version: version.Info{GitVersion:"v0.18.1", GitCommit:"4913335bc20764d0d0bec55da00146887726ae15", GitTreeState:"clean", BuildDate:"2019-06-13T10:22:49Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

The error in the log seems to be around:
"The task includes an option with an undefined variable. The error was: 'version' is undefined"

oc logs kubevirt-ssp-operator-5b5b48d875-rbssr
{"level":"info","ts":1562228173.0965724,"logger":"logging_event_handler","msg":"[playbook task]","name":"node-labeller-hyperconverged-cluster","namespace":"kubevirt-hyperconverged","gvk":"kubevirt.io/v1, Kind=KubevirtNodeLabellerBundle","event_type":"playbook_on_task_start","job":"1670934380564181240","EventData.Name":"KubevirtNodeLabeller : Create the node labeller roles"}
{"level":"error","ts":1562228173.1222782,"logger":"logging_event_handler","msg":"","name":"node-labeller-hyperconverged-cluster","namespace":"kubevirt-hyperconverged","gvk":"kubevirt.io/v1, Kind=KubevirtNodeLabellerBundle","event_type":"runner_on_failed","job":"1670934380564181240","EventData.Task":"Create the node labeller roles","EventData.TaskArgs":"","EventData.FailedTaskPath":"/opt/ansible/roles/KubevirtNodeLabeller/tasks/main.yml:2","error":"[playbook task failed]","stacktrace":"github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/events.loggingEventHandler.Handle\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/events/log_events.go:84"}
{"level":"error","ts":1562228173.3051622,"logger":"runner","msg":"\u001b[0;34mansible-playbook 2.7.10\u001b[0m\r\n\u001b[0;34m  config file = /etc/ansible/ansible.cfg\u001b[0m\r\n\u001b[0;34m  configured module search path = [u'/usr/share/ansible/openshift']\u001b[0m\r\n\u001b[0;34m  ansible python module location = /usr/lib/python2.7/site-packages/ansible\u001b[0m\r\n\u001b[0;34m  executable location = /usr/bin/ansible-playbook\u001b[0m\r\n\u001b[0;34m  python version = 2.7.5 (default, Oct 30 2018, 23:45:53) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]\u001b[0m\r\n\u001b[0;34mUsing /etc/ansible/ansible.cfg as config file\u001b[0m\r\n\n\u001b[0;34m/tmp/ansible-operator/runner/kubevirt.io/v1/KubevirtNodeLabellerBundle/kubevirt-hyperconverged/node-labeller-hyperconverged-cluster/inventory/hosts did not meet host_list requirements, check plugin documentation if this is unexpected\u001b[0m\r\n\n\u001b[0;34m/tmp/ansible-operator/runner/kubevirt.io/v1/KubevirtNodeLabellerBundle/kubevirt-hyperconverged/node-labeller-hyperconverged-cluster/inventory/hosts did not meet script requirements, check plugin documentation if this is unexpected\u001b[0m\r\n\n\u001b[0;34m/tmp/ansible-operator/runner/kubevirt.io/v1/KubevirtNodeLabellerBundle/kubevirt-hyperconverged/node-labeller-hyperconverged-cluster/inventory/hosts did not meet script requirements, check plugin documentation if this is unexpected\u001b[0m\n\r\nPLAYBOOK: kubevirtnodelabeller.yaml ********************************************\n\u001b[0;34m1 plays in /opt/ansible/kubevirtnodelabeller.yaml\u001b[0m\n\r\nPLAY [localhost] ***************************************************************\n\u001b[0;34mMETA: ran handlers\u001b[0m\n\r\nTASK [KubevirtNodeLabeller : Create the node labeller roles] *******************\r\n\u001b[1;30mtask path: /opt/ansible/roles/KubevirtNodeLabeller/tasks/main.yml:2\u001b[0m\n\u001b[0;31mfatal: [localhost]: FAILED! => {\"msg\": \"The task includes an option with an undefined variable. The error was: 'version' is undefined\\n\\nThe error appears to have been in '/opt/ansible/roles/KubevirtNodeLabeller/tasks/main.yml': line 2, column 3, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n---\\n- name: Create the node labeller roles\\n  ^ here\\n\"}\u001b[0m\n\r\nPLAY RECAP *********************************************************************\r\n\u001b[0;31mlocalhost\u001b[0m                  : ok=0    changed=0    unreachable=0    \u001b[0;31mfailed=1   \u001b[0m\r\n\n","job":"1670934380564181240","name":"node-labeller-hyperconverged-cluster","namespace":"kubevirt-hyperconverged","error":"exit status 2","stacktrace":"github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/runner.(*runner).Run.func1\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/runner/runner.go:289"}

'make bundleRegistry' fails

'make bundleRegistry' fails with:

time="2019-08-23T16:21:19Z" level=fatal msg="two csvs found in one bundle"
The command '/bin/sh -c initializer --manifests /registry --output bundles.db' returned a non-zero code: 1

Running rm ./deploy/olm-catalog/kubevirt-hyperconverged/0.0.2/kubevirt-hyperconverged-operator.v0.0.2.clusterserviceversion_merger.yaml is a temporary workaround.

VM creation using yaml or web-ui fails with `service: "fake-validation-service" not found`

I installed the latest HCO on OpenShift 4.2 master. Using the vmi.yaml from the KubeVirt docs, I tried to create a virtual machine instance:

$ kubectl create -f vmi.yaml 
Error from server (InternalError): error when creating "vmi.yaml": Internal error occurred: failed calling webhook "virtualmachineinstances.kubevirt.io-tmp-validator": Post https://fake-validation-service.kubevirt-hyperconverged.svc:443/fake-path/virtualmachineinstances.kubevirt.io?timeout=30s: service "fake-validation-service" not found

Later on, I created a VM template from the web UI, tried to create a VMI using the template, and got this error from the UI:

Error "failed calling webhook "virtualmachines.kubevirt.io-tmp-validator": Post https://fake-validation-service.kubevirt-hyperconverged.svc:443/fake-path/virtualmachines.kubevirt.io?timeout=30s: service "fake-validation-service" not found" for field "undefined".

The trace collection seems to be broken

We had a failing test run on a prow job. Apparently containers did not become ready. In the logic that tries to collect information about the cluster an error occurred; the pod names below appear to be concatenated with no separator, so oc logs ends up without a pod argument:

Found pods with errors cdi-apiservercdi-deploymentcdi-uploadproxyvirt-apivirt-controllerhyperconverged-cluster-operator
------------- cdi-apiservercdi-deploymentcdi-uploadproxyvirt-apivirt-controllerhyperconverged-cluster-operator
error: expected 'logs [-f] [-p] (POD | TYPE/NAME) [-c CONTAINER]'.
POD or TYPE/NAME is a required argument for the logs command
See 'oc logs -h' for help and examples
make: *** [start] Error 1

HCO pod in CrashLoopBackOff : Error:'no matches for kind "KubevirtMetricsAggregation" in version kubevirt.io/v1'

I am facing an issue while deploying HCO:



NAME                                               READY   STATUS             RESTARTS   AGE
cdi-operator-6b5c95f948-cct6d                      1/1     Running            0          62m
cluster-network-addons-operator-5b565f55c6-8vrn9   1/1     Running            0          62m
hyperconverged-cluster-operator-6f6c5f4486-hdkwd   0/1     CrashLoopBackOff   20         62m
kubevirt-ssp-operator-846dbc9f64-t2z6r             0/1     CrashLoopBackOff   20         62m
node-maintenance-operator-6b6d5756d8-hcx4w         1/1     Running            0          62m
virt-operator-74f6679d9f-jg7vr                     1/1     Running            0          62m
virt-operator-74f6679d9f-tttz7                     1/1     Running            0          62m


 kubevirt-hyperconverged logs  hyperconverged-cluster-operator-6f6c5f4486-hdkwd

{"level":"info","ts":1565593961.9313908,"logger":"cmd","msg":"Go Version: go1.11.10"}
{"level":"info","ts":1565593961.9314265,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1565593961.9314318,"logger":"cmd","msg":"Version of operator-sdk: v0.9.0+git"}
{"level":"info","ts":1565593961.931928,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1565593962.0958188,"logger":"leader","msg":"Found existing lock with my name. I was likely restarted."}
{"level":"info","ts":1565593962.095874,"logger":"leader","msg":"Continuing as the leader."}
{"level":"info","ts":1565593962.239434,"logger":"cmd","msg":"Registering Components."}
{"level":"info","ts":1565593962.239908,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"hyperconverged-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1565593962.2400892,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"hyperconverged-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1565593962.2401779,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"hyperconverged-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1565593962.2402496,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"hyperconverged-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1565593962.2403219,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"hyperconverged-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1565593962.240397,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"hyperconverged-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1565593962.2404702,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"hyperconverged-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1565593962.2405872,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"hyperconverged-controller","source":"kind source: /, Kind="}
{"level":"error","ts":1565593962.3880389,"logger":"kubebuilder.source","msg":"if kind is a CRD, it should be installed before calling Start","kind":"KubevirtMetricsAggregation.kubevirt.io","error":"no matches for kind \"KubevirtMetricsAggregation\" in version \"kubevirt.io/v1\"","stacktrace":"github.com/kubevirt/hyperconverged-cluster-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/source/source.go:89\ngithub.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Watch\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:122\ngithub.com/kubevirt/hyperconverged-cluster-operator/pkg/controller/hyperconverged.add\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/pkg/controller/hyperconverged/hyperconverged_controller.go:93\ngithub.com/kubevirt/hyperconverged-cluster-operator/pkg/controller/hyperconverged.Add\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/pkg/controller/hyperconverged/hyperconverged_controller.go:61\ngithub.com/kubevirt/hyperconverged-cluster-operator/pkg/controller.AddToManager\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/pkg/controller/controller.go:13\nmain.main\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/cmd/hyperconverged-cluster-operator/main.go:121\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:201"}
{"level":"error","ts":1565593962.3881633,"logger":"cmd","msg":"","error":"no matches for kind \"KubevirtMetricsAggregation\" in version \"kubevirt.io/v1\"","stacktrace":"github.com/kubevirt/hyperconverged-cluster-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/vendor/github.com/go-logr/zapr/zapr.go:128\nmain.main\n\t/go/src/github.com/kubevirt/hyperconverged-cluster-operator/cmd/hyperconverged-cluster-operator/main.go:122\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:201"}

KubeVirt Web UI Operator fails to deploy

I just deployed HCO from master and the kubevirt-web-ui-operator failed; my Kubernetes cluster was trying to pull the image from (docker.io/)kubevirt/kubevirt-web-ui-operator, which doesn't seem to exist.

Modifying the deployment object by prepending quay.io/ to kubevirt/kubevirt-web-ui-operator:latest fixed the issue for me.

deploy/converged/operator.yaml fails because "node-maintenance-operator" namespace is not found

What should be creating the "node-maintenance-operator" namespace?

hyperconverged-cluster-operator rwsu$ kubectl create -f deploy/converged
clusterrole.rbac.authorization.k8s.io/hyperconverged-cluster-operator created
clusterrole.rbac.authorization.k8s.io/kubevirt-operator created
clusterrole.rbac.authorization.k8s.io/cdi-operator created
clusterrole.rbac.authorization.k8s.io/cluster-network-addons-operator created
clusterrolebinding.rbac.authorization.k8s.io/hyperconverged-cluster-operator created
clusterrolebinding.rbac.authorization.k8s.io/kubevirt-operator created
clusterrolebinding.rbac.authorization.k8s.io/cdi-operator created
clusterrolebinding.rbac.authorization.k8s.io/cdi-operator-admin created
clusterrolebinding.rbac.authorization.k8s.io/cluster-network-addons-operator created
clusterrolebinding.rbac.authorization.k8s.io/cluster-network-addons-operator-admin created
clusterrolebinding.rbac.authorization.k8s.io/kubevirt-ssp-operator created
clusterrolebinding.rbac.authorization.k8s.io/kubevirt-web-ui-operator created
clusterrolebinding.rbac.authorization.k8s.io/node-maintenance-operator created
deployment.apps/hyperconverged-cluster-operator created
deployment.apps/virt-operator created
deployment.apps/cdi-operator created
deployment.apps/cluster-network-addons-operator created
deployment.apps/kubevirt-ssp-operator created
deployment.apps/kubevirt-web-ui-operator created
serviceaccount/hyperconverged-cluster-operator created
serviceaccount/cdi-operator created
serviceaccount/kubevirt-operator created
serviceaccount/cluster-network-addons-operator created
serviceaccount/kubevirt-ssp-operator created
serviceaccount/kubevirt-web-ui-operator created
serviceaccount/node-maintenance-operator created
Error from server (NotFound): error when creating "deploy/converged/operator.yaml": namespaces "node-maintenance-operator" not found
error validating "deploy/converged/cluster_role.yaml": error validating data: ValidationError(ClusterRole.rules[9]): unknown field "resourceName" in io.k8s.api.rbac.v1.PolicyRule; if you choose to ignore these errors, turn validation off with --validate=false

Log spammed with "odd number of arguments passed as key-value pairs for logging"

The HCO log is spammed with "dpanic" entries:

{
  "level": "dpanic",
  "ts": 1554220549.913,
  "logger": "controller_hyperconverged",
  "msg": "odd number of arguments passed as key-value pairs for logging",
  "ignored key": "KubeVirt",
  "stacktrace": "github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/github.com\/go-logr\/zapr.handleFields\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/github.com\/go-logr\/zapr\/zapr.go:106\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/github.com\/go-logr\/zapr.(*infoLogger).Info\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/github.com\/go-logr\/zapr\/zapr.go:70\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/pkg\/controller\/hyperconverged.manageComponentCR\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/pkg\/controller\/hyperconverged\/hyperconverged_controller.go:212\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/pkg\/controller\/hyperconverged.(*ReconcileHyperConverged).Reconcile\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/pkg\/controller\/hyperconverged\/hyperconverged_controller.go:150\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/sigs.k8s.io\/controller-runtime\/pkg\/internal\/controller.(*Controller).processNextWorkItem\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/sigs.k8s.io\/controller-runtime\/pkg\/internal\/controller\/controller.go:215\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/sigs.k8s.io\/controller-runtime\/pkg\/internal\/controller.(*Controller).Start.func1\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/sigs.k8s.io\/controller-runtime\/pkg\/internal\/controller\/controller.go:158\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait.JitterUntil.func1\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait\/wait.go:133\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait.JitterUntil\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait\/wait.go:134\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait.Until\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait\/wait.go:88"
}
{
  "level": "dpanic",
  "ts": 1554220549.9132,
  "logger": "controller_hyperconverged",
  "msg": "odd number of arguments passed as key-value pairs for logging",
  "ignored key": "CDI",
  "stacktrace": "github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/github.com\/go-logr\/zapr.handleFields\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/github.com\/go-logr\/zapr\/zapr.go:106\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/github.com\/go-logr\/zapr.(*infoLogger).Info\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/github.com\/go-logr\/zapr\/zapr.go:70\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/pkg\/controller\/hyperconverged.manageComponentCR\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/pkg\/controller\/hyperconverged\/hyperconverged_controller.go:199\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/pkg\/controller\/hyperconverged.(*ReconcileHyperConverged).Reconcile\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/pkg\/controller\/hyperconverged\/hyperconverged_controller.go:169\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/sigs.k8s.io\/controller-runtime\/pkg\/internal\/controller.(*Controller).processNextWorkItem\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/sigs.k8s.io\/controller-runtime\/pkg\/internal\/controller\/controller.go:215\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/sigs.k8s.io\/controller-runtime\/pkg\/internal\/controller.(*Controller).Start.func1\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/sigs.k8s.io\/controller-runtime\/pkg\/internal\/controller\/controller.go:158\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait.JitterUntil.func1\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait\/wait.go:133\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait.JitterUntil\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait\/wait.go:134\ngithub.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait.Until\n\t\/home\/rhallisey\/src\/github.com\/kubevirt\/hyperconverged-cluster-operator\/vendor\/k8s.io\/apimachinery\/pkg\/util\/wait\/wait.go:88"
}
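
For context, this dpanic is what go-logr/zapr emits when Info or Error receives a bare trailing value instead of key/value pairs; the dangling argument shows up as "ignored key". A minimal, standalone illustration (not the actual HCO call site):

package main

import (
	"github.com/go-logr/zapr"
	"go.uber.org/zap"
)

func main() {
	// Production config: DPanic is logged rather than panicking.
	zapLog, _ := zap.NewProduction()
	logger := zapr.NewLogger(zapLog)

	// Buggy pattern: "KubeVirt" is a lone trailing argument, so zapr sees an
	// odd number of key/value arguments and logs the dpanic shown above.
	logger.Info("Managing component CR", "KubeVirt")

	// Fixed pattern: every value gets an explicit key.
	logger.Info("Managing component CR", "kind", "KubeVirt")
}

The fix in the HCO would be the same shape: supply a key for the value passed in manageComponentCR's log calls.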

Automate builds of U/S images

We can use Travis or Quay to do automatic builds of the operator and registry images. That would be handy, especially for master. I'd be happy to assist.

virt-operator not deploying KubeVirt

After working around #27, I found that the virt-operator doesn't seem to be doing its job. Looking at the logs, it seems related to permissions:

...
{"component":"virt-operator","level":"info","msg":"reenqueuing KubeVirt kubevirt-hyperconverged/kubevirt-hyperconverged-cluster","pos":"kubevirt.go:405","reason":"unable to create clusterrole \u0026ClusterRole{ObjectMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ObjectMeta{Name:kubevirt-controller,GenerateName:,Namespace:,SelfLink:,UID:,ResourceVersion:,Generation:0,CreationTimestamp:0001-01-01 00:00:00 +0000 UTC,DeletionTimestamp:\u003cnil\u003e,DeletionGracePeriodSeconds:nil,Labels:map[string]string{app.kubernetes.io/managed-by: kubevirt-operator,kubevirt.io: ,},Annotations:map[string]string{kubevirt.io/install-strategy-registry: kubevirt,kubevirt.io/install-strategy-version: latest,},OwnerReferences:[],Finalizers:[],ClusterName:,Initializers:nil,},Rules:[{[get list watch delete create] [policy] [poddisruptionbudgets] [] []} {[get list watch delete update create] [] [pods configmaps endpoints] [] []} {[update create patch] [] [events] [] []} {[update] [] [pods/finalizers] [] []} {[get list watch update patch] [] [nodes] [] []} {[get list watch] [] [persistentvolumeclaims] [] []} {[*] [kubevirt.io] [*] [] []} {[*] [cdi.kubevirt.io] [*] [] []} {[get list watch] [k8s.cni.cncf.io] [network-attachment-definitions] [] []}],AggregationRule:nil,}: clusterroles.rbac.authorization.k8s.io \"kubevirt-controller\" is forbidden: user \"system:serviceaccount:kubevirt-hyperconverged:kubevirt-operator\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:kubevirt-hyperconverged\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"policy\"], Resources:[\"poddisruptionbudgets\"], Verbs:[\"get\" \"list\" \"watch\" \"delete\" \"create\"]}","timestamp":"2019-04-16T07:32:53.251537Z"}
...

I can provide more logs if needed. After applying the following workaround, it seemed to work:

kubectl create clusterrolebinding hco-workaround --clusterrole=cluster-admin --serviceaccount=kubevirt-hyperconverged:kubevirt-operator

That might be a bit too much, but it allows the installation to finish.

When the operator and application are tied, use the `Recreate` strategy

All the component operators under the HCO should tie together the operator and application so that OLM has full control of the update process (until something more robust is needed). By default, deployments use the RollingUpdate strategy, which will cause new pods to be created while the old one is being deleted. This can cause confusion in the operator because there's potential for two operators managing the application for a brief period of time.

Explicitly set the strategy to Recreate to guarantee the old operator is removed before the new one starts.

@davidvossel's fix on the virt-operator kubevirt/kubevirt#2195

spec:
  replicas: 1
  selector:
    matchLabels:
      kubevirt.io: virt-operator
  strategy:
    type: Recreate

Inform OLM we are not ready for upgrades until a Reconcile has passed

OLM uses the readiness probe to determine whether it is able to push an upgrade. PR #212 makes the HCO Ready before a Reconcile has run, because we're limited on when a readiness probe can be set.

When OLM comes up with a new way to block upgrades, migrate away from the readiness probe to that method.
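
A minimal sketch of the interim approach: gate a plain HTTP readiness endpoint on the first successful Reconcile. The ready flag, handler, and port below are illustrative assumptions, not the actual HCO wiring.

package main

import (
	"net/http"
	"sync/atomic"
)

// reconciled is set to true by the controller after its first successful Reconcile.
var reconciled atomic.Value

func markReconciled() { reconciled.Store(true) }

// readyz backs the deployment's readinessProbe; while it answers 503 the pod
// stays not-ready, which (per the issue above) keeps OLM from pushing an upgrade.
func readyz(w http.ResponseWriter, _ *http.Request) {
	if ok, _ := reconciled.Load().(bool); !ok {
		http.Error(w, "waiting for first successful reconcile", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/readyz", readyz)
	// In the real operator this would run alongside the controller manager.
	_ = http.ListenAndServe(":6060", nil)
}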
