liqotech / liqo
Enable dynamic and seamless Kubernetes multi-cluster topologies
Home Page: https://liqo.io
License: Apache License 2.0
Is your feature request related to a problem? Please describe.
Improve documentation for Liqo Users:
Is your feature request related to a problem? Please describe.
We need a release pipeline to build Docker images and agent artifacts in order to release the first Liqo version.
When a new foreign cluster is discovered, a new broadcaster deployment for that cluster should be started.
Describe the bug
The virtual node ownerReference no longer exists: it points to the liqo-<clusterID> deployment, which is now named virtual-kubelet-<clusterID>.
Describe the bug
In the context of DNS discovery, when a unidirectional peering has already been established, the creation of a SearchDomain for the opposite direction does not trigger the autojoin process (even though autojoin=true). Instead, it is necessary to manually set the join property in the corresponding foreigncluster resource (which is otherwise set to false).
To Reproduce
Steps to reproduce the behavior:
1. Create a SearchDomain resource in A pointing to B (with autojoin=true) and observe that the virtual kubelet is correctly created
2. Create a SearchDomain resource in B pointing to A (with autojoin=true) and observe that the virtual kubelet is NOT created
3. Inspect the foreigncluster resource in B and observe that the join property is set to false
Expected behavior
Both clusters should correctly perform the autojoin.
Describe the bug
When the two peering hosts have at least one overlapping IP, mDNS packets are received but ForeignClusters are not created
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A new ForeignCluster should appear
Describe the bug
When the advertisement of a cluster expires, its reference and the reference to the virtualKubelet identity are not correctly purged from the foreign cluster (FG). The FG still continues to reference them.
apiVersion: discovery.liqo.io/v1alpha1
kind: ForeignCluster
metadata:
creationTimestamp: "2020-09-19T09:52:09Z"
finalizers:
- foreigncluster.discovery.liqo.io/peered
generation: 7
labels:
cluster-id: 9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
discovery-type: WAN
managedFields:
- apiVersion: discovery.liqo.io/v1alpha1
fieldsType: FieldsV1
fieldsV1:
f:status:
f:outgoing:
f:advertisement:
.: {}
f:apiVersion: {}
f:kind: {}
f:name: {}
f:uid: {}
manager: advertisement-operator
operation: Update
time: "2020-09-19T09:52:15Z"
- apiVersion: discovery.liqo.io/v1alpha1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.: {}
v:"foreigncluster.discovery.liqo.io/peered": {}
f:labels:
.: {}
f:cluster-id: {}
f:discovery-type: {}
f:ownerReferences:
.: {}
k:{"uid":"61f7e7ce-f1b2-44ef-bdf9-32f43b08f068"}:
.: {}
f:apiVersion: {}
f:kind: {}
f:name: {}
f:uid: {}
f:spec:
.: {}
f:allowUntrustedCA: {}
f:apiUrl: {}
f:clusterID: {}
f:discoveryType: {}
f:join: {}
f:namespace: {}
f:status:
.: {}
f:incoming:
.: {}
f:joined: {}
f:outgoing:
.: {}
f:advertisementStatus: {}
f:identityRef:
.: {}
f:apiVersion: {}
f:kind: {}
f:name: {}
f:namespace: {}
f:joined: {}
f:remote-peering-request-name: {}
manager: discovery
operation: Update
time: "2020-09-21T08:46:36Z"
name: 9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
ownerReferences:
- apiVersion: discovery.liqo.io/v1alpha1
kind: SearchDomain
name: x.y.z
uid: 61f7e7ce-f1b2-44ef-bdf9-32f43b08f068
resourceVersion: "307307896"
selfLink: /apis/discovery.liqo.io/v1alpha1/foreignclusters/9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
uid: 58eb5e76-d33f-42f3-853e-4cb4dd78485e
spec:
allowUntrustedCA: false
apiUrl: https://apiserver.crownlabs.polito.it.:443
clusterID: 9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
discoveryType: WAN
join: true
namespace: liqo
status:
incoming:
joined: false
outgoing:
advertisement:
apiVersion: sharing.liqo.io/v1alpha1
kind: Advertisement
name: advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
uid: a1f20c37-c97f-4043-acf3-fac893d85b8a
advertisementStatus: Accepted
identityRef:
apiVersion: v1
kind: Secret
name: vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
namespace: liqo
joined: true
remote-peering-request-name: 09f709f0-96bd-48ad-9e1a-8efb22bed89e
kubectl get secret -n liqo
NAME TYPE DATA AGE
9a596a4b-591c-4ac6-8fd6-80258b4b3bf9-token-mzgkn kubernetes.io/service-account-token 3 46h
advertisement-operator-token-kk28r kubernetes.io/service-account-token 3 47h
broadcaster-token-qj7c7 kubernetes.io/service-account-token 3 47h
ca-data Opaque 1 47h
crdreplicator-operator-service-account-token-6q4x7 kubernetes.io/service-account-token 3 47h
dashboard-cert kubernetes.io/tls 2 47h
default-token-flgmz kubernetes.io/service-account-token 3 47h
discovery-sa-token-t4vd4 kubernetes.io/service-account-token 3 47h
liqodash-admin-sa-token-zxtjn kubernetes.io/service-account-token 3 47h
peering-request-operator-token-7882f kubernetes.io/service-account-token 3 47h
peering-request-webhook-certs Opaque 2 47h
pod-mutator-secret Opaque 2 47h
podmutatoraccount-token-zrr8c kubernetes.io/service-account-token 3 47h
route-operator-service-account-token-rsk7s kubernetes.io/service-account-token 3 47h
sh.helm.release.v1.liqo.v1 helm.sh/release.v1 1 47h
sn-operator-token-262z6 kubernetes.io/service-account-token 3 47h
tunnel-operator-service-account-token-wj9fg kubernetes.io/service-account-token 3 47h
tunnelendpointcreator-operator-service-account-token-kslrp kubernetes.io/service-account-token 3 47h
kubectl get advertisements.sharing.liqo.io
No resources found in default namespace.
Is your feature request related to a problem? Please describe.
Advertisement deletion is handled via the setting of the field AdvertisementStatus = Deleting and does not benefit from the Kubernetes garbage collection mechanisms.
Describe the solution you'd like
The solution will rely on finalizers, adopted in the context of the Advertisement resource.
Describe the bug
Sometimes, when running Liqo in kind, the discovery component advertises the wrong IP on the LAN. For example, if the correct IP is 172.18.0.x, it advertises 192.168.200.x.
To Reproduce
Run the Liqo installer in a kind cluster
Describe the bug
When deploying a pod/service in default namespace, the virtual kubelet keeps failing when trying to reflect the kubernetes service.
To Reproduce
Steps to reproduce the behavior:
liqo.io/enabled=true
Expected behavior
Virtual Kubelet should not crash
Additional context
Running on two peered K3s.
Is your feature request related to a problem? Please describe.
When executing the popeye scanner against the namespace where the Liqo resources are installed, different warnings are raised. It may be worth addressing (at least some of) them.
Additional context
Excerpt of the report
PODS (14 SCANNED)                                                     💥 1 😱 13 🔊 0 ✅ 0   0%
· liqo/advertisement-operator-66f7c948c6-g5sq8...................................................😱
🔊 [POP-206] No PodDisruptionBudget defined.
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 advertisement-operator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/crdreplicator-operator-5d577fc976-jpsgm...................................................😱
🔊 [POP-206] No PodDisruptionBudget defined.
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 crdreplicator-operator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/discovery-6c99c89fbc-2bgcr................................................................😱
🔊 [POP-206] No PodDisruptionBudget defined.
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 discovery
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/liqo-dashboard-7977f68bc4-ml4sg...........................................................💥
🔊 [POP-206] No PodDisruptionBudget defined.
😱 [POP-300] Using "default" ServiceAccount.
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 liqo-dashboard
😱 [POP-101] Image tagged "latest" in use.
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
🐳 proxy-cert
💥 [POP-100] Untagged docker image in use.
😱 [POP-106] No resources requests/limits defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/peering-request-operator-587f86fdd4-96mzv.................................................😱
🔊 [POP-206] No PodDisruptionBudget defined.
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 peering-request-deployment
😱 [POP-106] No resources requests/limits defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
🐳 peering-request-operator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
🔊 [POP-108] Unnamed port 8443.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
🐳 secret-creation
😱 [POP-106] No resources requests/limits defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/podmutator-7986cd56dc-4znsd...............................................................😱
🔊 [POP-206] No PodDisruptionBudget defined.
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 pod-mutator-deployment
😱 [POP-106] No resources requests/limits defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
🐳 podmutator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
🐳 secret-creation
😱 [POP-106] No resources requests/limits defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/route-operator-66dkv......................................................................😱
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 route-operator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/route-operator-7mdnf......................................................................😱
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 route-operator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/route-operator-88k29......................................................................😱
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 route-operator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/route-operator-jjwmw......................................................................😱
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 route-operator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/route-operator-ldt9p......................................................................😱
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 route-operator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/schedulingnode-operator-7cf6db4b78-ppvbx..................................................😱
🔊 [POP-206] No PodDisruptionBudget defined.
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 schedulingnode-operator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/tunnel-operator-5795c49f79-z4v56..........................................................😱
🔊 [POP-206] No PodDisruptionBudget defined.
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 tunnel-operator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
· liqo/tunnelendpointcreator-operator-698cf97957-cc5gj...........................................😱
🔊 [POP-206] No PodDisruptionBudget defined.
😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
🐳 tunnelendpointcreator-operator
😱 [POP-106] No resources requests/limits defined.
😱 [POP-102] No probes defined.
😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
SERVICES (3 SCANNED)                                                  💥 1 😱 0 🔊 2 ✅ 0   66%
· liqo/liqo-dashboard............................................................................💥
💥 [POP-1106] No target ports match service port TCP:https:443.
🔊 [POP-1104] Do you mean it? Type NodePort detected.
· liqo/mutatepodtoleration.......................................................................🔊
🔊 [POP-1101] Skip ports check. No explicit ports detected on pod
liqo/podmutator-7986cd56dc-4znsd.
· liqo/peering-request-operator..................................................................🔊
🔊 [POP-1102] Use of target port%!(EXTRA string=8443, string=TCP::8443).
Describe the bug
So far, the version of the node created by the virtual kubelet is not aligned with the rest of the cluster. This causes issues when upgrading the cluster with kubeadm, since some integrity checks fail.
Expected Behavior
Two possibilities:
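One of the two possibilities could be to make the virtual node report the same Kubernetes version as the home cluster. Below is a minimal sketch, assuming a client-go client and a hypothetical node name, of how the home API server version could be propagated to the virtual node; it is an illustration, not the actual Liqo code.

```go
// Sketch: align the virtual node's reported kubelet version with the home
// cluster's API server version, so that kubeadm's version checks pass.
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Ask the home API server for its version (e.g. "v1.19.2").
	serverVersion, err := client.Discovery().ServerVersion()
	if err != nil {
		panic(err)
	}

	// Hypothetical virtual node name used only for this illustration.
	node := &corev1.Node{ObjectMeta: metav1.ObjectMeta{Name: "liqo-example-cluster-id"}}
	created, err := client.CoreV1().Nodes().Create(context.TODO(), node, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}

	// Report the home cluster version instead of a fixed, unrelated one.
	created.Status.NodeInfo.KubeletVersion = serverVersion.GitVersion
	if _, err := client.CoreV1().Nodes().UpdateStatus(context.TODO(), created, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
	fmt.Println("virtual node reports kubelet version", serverVersion.GitVersion)
}
```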
Describe the bug
It is impossible to set the Allow Untrusted CA parameter of a ForeignCluster to true when DNS discovery is leveraged (i.e. the ForeignCluster is generated by a SearchDomain). Whenever the value is manually set to true by editing the resource, it is reset to false by the operator a few seconds later.
To Reproduce
Steps to reproduce the behavior:
1. Create a SearchDomain resource to trigger the DNS discovery process
2. Edit the resulting ForeignCluster and set Allow Untrusted CA to true
Expected behavior
The user input should not be discarded.
Additional context
This is the log of the discovery operator, from the manual edit until the value is reset to the original one:
I0911 14:48:14.632029 1 foreign-cluster-controller.go:61] Reconciling ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e
I0911 14:48:14.642819 1 foreign-cluster-controller.go:206] Get CA Data
I0911 14:48:14.695077 1 foreign-cluster-controller.go:223] CA Data successfully loaded for ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e
I0911 14:48:14.695138 1 foreign-cluster-controller.go:61] Reconciling ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e
I0911 14:48:14.719931 1 foreign-cluster-controller.go:308] ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e successfully reconciled
I0911 14:48:23.178066 1 search-domain-controller.go:26] Reconciling SearchDomain ***.***.it
I0911 14:48:23.209959 1 foreign-cluster-controller.go:61] Reconciling ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e
I0911 14:48:23.210210 1 foreign.go:53] ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e updated
I0911 14:48:23.219507 1 search-domain-controller.go:113] SearchDomain ***.***.it successfully reconciled
I0911 14:48:23.241776 1 foreign-cluster-controller.go:308] ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e successfully reconciled
I0911 14:48:37.716436 1 foreign-cluster-controller.go:61] Reconciling ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e
I0911 14:48:37.744811 1 foreign-cluster-controller.go:308] ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e successfully reconciled
Is your feature request related to a problem? Please describe.
So far, Liqo connectivity does not have E2E tests. We should implement those tests to avoid regressions in new versions.
Describe the bug
In case of a network failure, when the advertisement expires, the foreign identity is not set and verified by the broadcaster. The broadcaster just creates the new advertisement, but does not ensure that the virtual-kubelet identity is present.
From the log of advertisement operator:
I0921 08:42:15.683278 1 controller.go:330] Adv advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 expired. TimeToLive was 2020-09-21 08:33:38 +0000 UTC
I0921 08:42:15.752570 1 controller.go:83] Advertisement advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 deleted
I0921 08:43:15.752861 1 controller.go:83] Advertisement advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 deleted
I0921 08:44:15.753263 1 controller.go:83] Advertisement advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 deleted
I0921 08:45:15.753614 1 controller.go:83] Advertisement advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 deleted
I0921 08:45:48.685724 1 controller.go:311] Advertisement advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 accepted
E0921 08:45:48.874187 1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:49.884634 1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:50.894234 1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:51.910722 1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:52.920457 1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:53.937163 1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:54.946737 1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
Describe the bug
When the broadcaster tries to recreate the Secret for the VirtualKubelet, an error occurs because the ResourceVersion field is set.
E0925 10:11:22.708028 1 broadcaster.go:344] Unable to create secret vk-kubeconfig-secret-433f86df-2734-49e4-9dd7-9f6fd364b88f on remote cluster 23ca2126-3b7c-4a35-8186-78c4c76e5f36; error: resourceVersion should not be set on objects to be created
E0925 10:11:22.708051 1 broadcaster.go:176] resourceVersion should not be set on objects to be created Error while sending Secret for virtual-kubelet to cluster 23ca2126-3b7c-4a35-8186-78c4c76e5f36
To Reproduce
1. Delete the Advertisement on cluster1 (or directly delete the secret vk-kubeconfig-secret-<clusterID>)
2. Check the broadcaster logs: kubectl logs -n liqo broadcaster-<clusterID>
Liqo enables resource sharing across Kubernetes clusters. To do so, it encapsulates (1) a logic to discover/advertise resources in a neighborhood (e.g. LAN) and (2) a protocol to negotiate resource exchange. In this document, we describe how the cluster peering logic works.
Sharing resources with Liqo relies on three different phases:
This issue describes how two clusters discover each other and start sharing resources.
The discovery service exploits the DNS Service Discovery protocol, which works in both LAN and WAN scenarios: in the first case with mDNS, in the second with standard DNS.
Resource sharing is based on periodic Advertisement exchanges, where each cluster exposes its capabilities, allowing others to use them to offload their jobs.
The discovery service allows two clusters to know each other, ask for resources and begin exchanging Advertisements.
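For the WAN case, the following sketch shows how SRV and TXT records could be resolved with the Go standard library; the _liqo._tcp service name and the domain are illustrative assumptions, not necessarily the records Liqo publishes.

```go
// Minimal sketch of DNS-based service discovery for the WAN case.
// The service name and domain below are illustrative only.
package main

import (
	"fmt"
	"net"
)

func main() {
	domain := "example.com" // domain announced in a SearchDomain resource (illustrative)

	// Look up SRV records for a hypothetical _liqo._tcp service.
	_, srvs, err := net.LookupSRV("liqo", "tcp", domain)
	if err != nil {
		panic(err)
	}
	for _, srv := range srvs {
		fmt.Printf("candidate cluster endpoint: %s:%d\n", srv.Target, srv.Port)
	}

	// TXT records could carry additional metadata (e.g. the cluster ID).
	if txts, err := net.LookupTXT(domain); err == nil {
		for _, txt := range txts {
			fmt.Println("metadata:", txt)
		}
	}
}
```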
The protocol is described by the following steps:
1. Each cluster advertises itself and exposes a kubeconfig with the permission to create FederationRequest resources
2. The discovering cluster stores the remote clusters in a ForeignCluster CR along with their clusterID
3. When the Federate flag in the ForeignCluster CR becomes true (either automatically or manually), an operator is triggered and uses the stored kubeconfig to create a new FederationRequest CR on the foreign cluster. The FederationRequest creation process includes the creation of a new kubeconfig with management permission on Advertisement CRs
4. The foreign cluster validates the incoming FederationRequests
5. The accepted FederationRequest is used to start the sharing of resources

The Advertisement operator can be split in two main components.
The broadcaster is in charge of sending the Advertisement CR to the other clusters, containing the resources made available for sharing and (optionally) their prices. It reads the foreign cluster kubeconfig from a ConfigMap, which allows it to manage the Advertisement.
After creating it, a remote watcher is started, which is a goroutine that watches the Advertisement Status
on the remote cluster. This way, the home cluster can know if its CR has been accepted by the foreign cluster and if the podCIDR
has been remapped by the network module.
The controller is the module that receives Advertisement CRs and creates the virtual nodes with the announced resources. Doing so, the remote clusters (emulated by the virtual nodes) are taken into account by the scheduler, which can offload the jobs it receives on them.
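As a rough illustration of the controller's core step (the advertisement structure below is hypothetical, not the real sharing.liqo.io type), the announced resources essentially become the capacity of a new virtual Node object:

```go
// Illustrative sketch: turn advertised resources into a virtual node's capacity.
// The advertisement struct below is hypothetical and not the real Liqo type.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// hypotheticalAdvertisement stands in for the Advertisement CR contents.
type hypotheticalAdvertisement struct {
	ClusterID string
	Resources corev1.ResourceList
}

func virtualNodeFor(adv hypotheticalAdvertisement) *corev1.Node {
	return &corev1.Node{
		ObjectMeta: metav1.ObjectMeta{
			Name:   "liqo-" + adv.ClusterID,
			Labels: map[string]string{"type": "virtual-node"},
		},
		Spec: corev1.NodeSpec{
			// The taint keeps ordinary pods away unless the scheduler tolerates it.
			Taints: []corev1.Taint{{
				Key:    "virtual-node.liqo.io/not-allowed",
				Value:  "true",
				Effect: corev1.TaintEffectNoExecute,
			}},
		},
		Status: corev1.NodeStatus{
			Capacity:    adv.Resources,
			Allocatable: adv.Resources,
		},
	}
}

func main() {
	adv := hypotheticalAdvertisement{
		ClusterID: "9a596a4b",
		Resources: corev1.ResourceList{
			corev1.ResourceCPU:    resource.MustParse("4"),
			corev1.ResourceMemory: resource.MustParse("8Gi"),
		},
	}
	node := virtualNodeFor(adv)
	fmt.Println("would create node", node.Name, "with capacity", node.Status.Capacity)
}
```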
Is your feature request related to a problem? Please describe.
Liqo Dashboard should be included among the installed packages.
Describe the solution you'd like
The Liqo dashboard should be installed by default, with an opt-out flag
Describe alternatives you've considered
Having a separate installer could increase complexity.
Items
Dashboard:
Describe the bug
After creating two k3s clusters from scratch and installing Liqo in both, the "liqo-<...>" pod in the second cluster (in order of Liqo installation) is stuck in the "Init:0/1" status.
To Reproduce
Steps to reproduce the behavior:
kubectl describe pod -n liqo liqo-<...> in the second cluster:
Name: liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-68f74654dc-gt52d
Namespace: liqo
Priority: 0
Node: rar-k3s-01/10.0.2.4
Start Time: Tue, 01 Sep 2020 17:40:22 +0200
Labels: app=virtual-kubelet
cluster=8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
pod-template-hash=68f74654dc
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-68f74654dc
Init Containers:
crt-generator:
Container ID:
Image: liqo/init-vkubelet:latest
Image ID:
Port: <none>
Host Port: <none>
Command:
/usr/bin/local/kubelet-setup.sh
Args:
/etc/virtual-kubelet/certs
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
POD_IP: (v1:status.podIP)
POD_NAME: liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-68f74654dc-gt52d (v1:metadata.name)
Mounts:
/etc/virtual-kubelet/certs from virtual-kubelet-crt (rw)
/var/run/secrets/kubernetes.io/serviceaccount from liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq (ro)
Containers:
virtual-kubelet:
Container ID:
Image: liqo/virtual-kubelet:latest
Image ID:
Port: <none>
Host Port: <none>
Command:
/usr/bin/virtual-kubelet
Args:
--cluster-id
8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
--provider
kubernetes
--nodename
liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
--kubelet-namespace
liqo
--provider-config
/app/kubeconfig/remote
--home-cluster-id
d9df783b-cd9b-4d25-ae37-231e21dc9739
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
APISERVER_CERT_LOCATION: /etc/virtual-kubelet/certs/server.crt
APISERVER_KEY_LOCATION: /etc/virtual-kubelet/certs/server-key.pem
VKUBELET_POD_IP: (v1:status.podIP)
VKUBELET_TAINT_KEY: virtual-node.liqo.io/not-allowed
VKUBELET_TAINT_VALUE: true
VKUBELET_TAINT_EFFECT: NoExecute
Mounts:
/app/kubeconfig/remote from remote-kubeconfig (rw,path="kubeconfig")
/etc/virtual-kubelet/certs from virtual-kubelet-crt (rw)
/var/run/secrets/kubernetes.io/serviceaccount from liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
remote-kubeconfig:
Type: Secret (a volume populated by a Secret)
SecretName: vk-kubeconfig-secret-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
Optional: false
virtual-kubelet-crt:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq:
Type: Secret (a volume populated by a Secret)
SecretName: liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned liqo/liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-68f74654dc-gt52d to rar-k3s-01
Warning FailedMount 37m kubelet, rar-k3s-01 Unable to attach or mount volumes: unmounted volumes=[remote-kubeconfig], unattached volumes=[liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq remote-kubeconfig virtual-kubelet-crt]: timed out waiting for the condition
Warning FailedMount 35m (x10 over 39m) kubelet, rar-k3s-01 MountVolume.SetUp failed for volume "remote-kubeconfig" : secret "vk-kubeconfig-secret-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef" not found
Warning FailedMount 35m kubelet, rar-k3s-01 Unable to attach or mount volumes: unmounted volumes=[remote-kubeconfig], unattached volumes=[virtual-kubelet-crt liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq remote-kubeconfig]: timed out waiting for the condition
Warning FailedMount 34m (x2 over 34m) kubelet, rar-k3s-01 MountVolume.SetUp failed for volume "remote-kubeconfig" : failed to sync secret cache: timed out waiting for the condition
Warning FailedMount 34m (x2 over 34m) kubelet, rar-k3s-01 MountVolume.SetUp failed for volume "liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq" : failed to sync secret cache: timed out waiting for the condition
Warning FailedMount 34m (x4 over 34m) kubelet, rar-k3s-01 MountVolume.SetUp failed for volume "remote-kubeconfig" : secret "vk-kubeconfig-secret-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef" not found
Warning FailedMount 33m (x2 over 33m) kubelet, rar-k3s-01 MountVolume.SetUp failed for volume "remote-kubeconfig" : failed to sync secret cache: timed out waiting for the condition
Warning FailedMount 33m (x2 over 33m) kubelet, rar-k3s-01 MountVolume.SetUp failed for volume "liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq" : failed to sync secret cache: timed out waiting for the condition
Warning FailedMount 18m (x6 over 31m) kubelet, rar-k3s-01 Unable to attach or mount volumes: unmounted volumes=[remote-kubeconfig], unattached volumes=[virtual-kubelet-crt liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq remote-kubeconfig]: timed out waiting for the condition
Warning FailedMount 13m (x3 over 25m) kubelet, rar-k3s-01 Unable to attach or mount volumes: unmounted volumes=[remote-kubeconfig], unattached volumes=[remote-kubeconfig virtual-kubelet-crt liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq]: timed out waiting for the condition
Warning FailedMount 3m14s (x21 over 33m) kubelet, rar-k3s-01 MountVolume.SetUp failed for volume "remote-kubeconfig" : secret "vk-kubeconfig-secret-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef" not found
Also, there is no liqo-<...> node in that cluster.
Expected behavior
The pod should be in "Running" status and there should be a liqo-<...> node in that cluster.
Additional context
kubectl describe foreignclusters.discovery.liqo.io
Name: 8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
Namespace:
Labels: cluster-id=8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
discovery-type=LAN
Annotations: <none>
API Version: discovery.liqo.io/v1alpha1
Kind: ForeignCluster
Metadata:
Creation Timestamp: 2020-09-01T15:40:06Z
Generation: 8
Managed Fields:
API Version: discovery.liqo.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:status:
f:outgoing:
f:advertisement:
.:
f:apiVersion:
f:kind:
f:name:
f:uid:
Manager: advertisement-operator
Operation: Update
Time: 2020-09-01T15:40:20Z
API Version: discovery.liqo.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:labels:
.:
f:cluster-id:
f:discovery-type:
f:spec:
.:
f:allowUntrustedCA:
f:apiUrl:
f:clusterID:
f:discoveryType:
f:join:
f:namespace:
f:status:
.:
f:incoming:
.:
f:availableIdentity:
f:identityRef:
.:
f:apiVersion:
f:kind:
f:name:
f:namespace:
f:uid:
f:joined:
f:outgoing:
.:
f:advertisementStatus:
f:caDataRef:
.:
f:apiVersion:
f:kind:
f:name:
f:namespace:
f:uid:
f:joined:
f:remote-peering-request-name:
f:ttl:
Manager: discovery
Operation: Update
Time: 2020-09-01T15:40:49Z
API Version: discovery.liqo.io/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:status:
f:incoming:
f:peeringRequest:
.:
f:name:
f:uid:
Manager: peering-request-operator
Operation: Update
Time: 2020-09-01T15:40:49Z
Resource Version: 1092
Self Link: /apis/discovery.liqo.io/v1alpha1/foreignclusters/8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
UID: e4ccbde0-f96a-4d80-bac8-776c87e49e02
Spec:
Allow Untrusted CA: true
API URL: https://10.0.2.5:6443
Cluster ID: 8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
Discovery Type: LAN
Join: true
Namespace: liqo
Status:
Incoming:
Available Identity: true
Identity Ref:
API Version: v1
Kind: Secret
Name: pr-d9df783b-cd9b-4d25-ae37-231e21dc9739
Namespace: liqo
UID: 9318644e-7ed4-49d9-a2bb-c1ccd50fe4c4
Joined: true
Peering Request:
Name: 8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
UID: dd249ebe-78cf-46b8-8c6f-8eb3810eea6f
Outgoing:
Advertisement:
API Version: sharing.liqo.io/v1alpha1
Kind: Advertisement
Name: advertisement-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
UID: 7b5e6737-7b9a-4966-9052-970bdb8c996b
Advertisement Status: Deleting
Ca Data Ref:
API Version: v1
Kind: Secret
Name: 8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-ca-data
Namespace: liqo
UID: 08efcb08-9384-476c-9bb2-e5dd2dc1bdfd
Joined: true
Remote - Peering - Request - Name: d9df783b-cd9b-4d25-ae37-231e21dc9739
Ttl: 3
Events: <none>
kubectl get secrets -n liqo
NAME TYPE DATA AGE
default-token-r5chc kubernetes.io/service-account-token 3 52m
tunnelendpointcreator-operator-service-account-token-rszjk kubernetes.io/service-account-token 3 52m
broadcaster-token-rkkx6 kubernetes.io/service-account-token 3 52m
peering-request-operator-token-qx4md kubernetes.io/service-account-token 3 52m
route-operator-service-account-token-t44br kubernetes.io/service-account-token 3 52m
sn-operator-token-4qknf kubernetes.io/service-account-token 3 52m
tunnel-operator-service-account-token-ksr46 kubernetes.io/service-account-token 3 52m
liqodash-admin-sa-token-lc7pv kubernetes.io/service-account-token 3 52m
discovery-sa-token-76r9c kubernetes.io/service-account-token 3 52m
crdreplicator-operator-service-account-token-zbkkm kubernetes.io/service-account-token 3 52m
podmutatoraccount-token-9l6z9 kubernetes.io/service-account-token 3 52m
advertisement-operator-token-xk6fn kubernetes.io/service-account-token 3 52m
sh.helm.release.v1.liqo.v1 helm.sh/release.v1 1 52m
ca-data Opaque 1 51m
pr-d9df783b-cd9b-4d25-ae37-231e21dc9739 Opaque 1 51m
8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-ca-data Opaque 1 51m
8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-8lbfs kubernetes.io/service-account-token 3 51m
peering-request-webhook-certs Opaque 2 51m
pod-mutator-secret Opaque 2 51m
liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq kubernetes.io/service-account-token 3 51m
Is your feature request related to a problem? Please describe.
Some comments, suggestions and issues which came to my mind when reading the Liqo user documentation:
Describe the bug
Two clusters have been peered (cluster1-cluster2) and the network connection between them has been established. Network anomalies are observed on node cluster1-node1 when trying to communicate with services running in the cluster2 cluster. ICMP packets are perfectly routed, but TCP/UDP traffic is not.
To Reproduce
kubectl get pods -n liqo-demo -o wide
kubectl exec -it -n liqo-demo podRunningOnSpring -- bash
Expected behavior
The services running on the cluster2 cluster should be reachable from the cluster1-node1 node.
Debug
1. Run kubectl get nodes --show-labels and log in to the node with label liqo.io/gateway=true (fall)
2. Run sudo watch -n1 -d "iptables -vnxL -t nat | grep -v -e pkts -e Chain | sort -nk1 | tac | column -t" and keep this terminal running
3. Observe the SNAT rule: SNAT all -- * gretun_ 0.0.0.0/0 0.0.0.0/0 to:10.244.0.0
Additional context
As observed, sometimes the network operators miss CRD updates. This results in a missing network configuration across clusters.
Is your feature request related to a problem? Please describe.
In order to improve the testing of Liqo, we would like to add testing on real Liqo Deployments
The virtual kubelet deployment is created by the advertisement operator: the deployment object is hard-coded in the operator source code. This approach makes it hard to customize the virtual kubelet flag parameters and worsens the maintainability of the virtual kubelet creation process. For these reasons, the virtual kubelet deployment declaration should be decoupled from the advertisement operator code.
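One possible direction, sketched here under the assumption that a deployment template is shipped as an external YAML file (the path and the {{CLUSTER_ID}} placeholder are hypothetical), is to load and render the declaration at runtime instead of building it in code:

```go
// Sketch: load the virtual kubelet Deployment from an external YAML template
// instead of hard-coding it in the operator. The template path is hypothetical.
package main

import (
	"fmt"
	"os"
	"strings"

	appsv1 "k8s.io/api/apps/v1"
	"sigs.k8s.io/yaml"
)

func loadVirtualKubeletDeployment(path, clusterID string) (*appsv1.Deployment, error) {
	raw, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	// Trivial substitution mechanism; a real implementation could use a
	// proper templating engine or a Helm chart.
	rendered := strings.ReplaceAll(string(raw), "{{CLUSTER_ID}}", clusterID)

	var deployment appsv1.Deployment
	if err := yaml.Unmarshal([]byte(rendered), &deployment); err != nil {
		return nil, err
	}
	return &deployment, nil
}

func main() {
	d, err := loadVirtualKubeletDeployment("/etc/liqo/virtual-kubelet-deployment.yaml", "9a596a4b")
	if err != nil {
		panic(err)
	}
	fmt.Println("loaded deployment template:", d.Name)
}
```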
Description
When we delete an Advertisement, the deployment of the linked virtual-kubelet is deleted as well because of the OwnerReference, and this deletion triggers the deletion of the virtual node.
In this way, the resources that have been created on the foreign cluster are not cleaned up; therefore, the behaviour we would like to have is the opposite:
Proposed solution
Set a finalizer on the Advertisement, which triggers the virtual-kubelet: it deletes the entry linked to the Advertisement in the NamespaceNattingTable. This triggers the deletion of all resources created on the foreign cluster by the reflector. After that, we can proceed with the deletion of the Advertisement (and its genealogy).
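A minimal sketch of this flow is shown below; the finalizer name and the cleanup step are illustrative, not the actual Liqo implementation.

```go
// Sketch of the proposed finalizer flow: before the Advertisement disappears,
// the related NamespaceNattingTable entry is removed, which in turn triggers
// the cleanup of the reflected resources on the foreign cluster.
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

const cleanupFinalizer = "advertisement.sharing.liqo.io/cleanup" // hypothetical name

func hasFinalizer(meta *metav1.ObjectMeta, name string) bool {
	for _, f := range meta.Finalizers {
		if f == name {
			return true
		}
	}
	return false
}

func removeFinalizer(meta *metav1.ObjectMeta, name string) {
	kept := make([]string, 0, len(meta.Finalizers))
	for _, f := range meta.Finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	meta.Finalizers = kept
}

// reconcileDeletion runs when the Advertisement has a deletion timestamp.
func reconcileDeletion(meta *metav1.ObjectMeta, clusterID string) {
	if !hasFinalizer(meta, cleanupFinalizer) {
		return // nothing left to clean up, deletion can proceed
	}
	// 1. Delete the natting-table entry for this cluster; the reflector reacts
	//    by removing the resources it created on the foreign cluster.
	fmt.Println("deleting NamespaceNattingTable entry for cluster", clusterID)
	// 2. Only then allow Kubernetes to garbage-collect the Advertisement
	//    (and its genealogy) by dropping the finalizer.
	removeFinalizer(meta, cleanupFinalizer)
}

func main() {
	meta := &metav1.ObjectMeta{Finalizers: []string{cleanupFinalizer}}
	reconcileDeletion(meta, "9a596a4b")
	fmt.Println("remaining finalizers:", meta.Finalizers)
}
```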
To do:
Sometimes the watchers in the VirtualKubelet fail to cast the triggering event into a TunnelEndpoint or Advertisement object.
When a casting error occurs, stop the watcher and recreate it; doing so, the watcher should start working properly again.
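A sketch of this recreate-on-cast-error pattern, using a placeholder newWatcher function and unstructured objects instead of the real TunnelEndpoint/Advertisement types:

```go
// Sketch of the proposed fix: if the object delivered by the watch cannot be
// cast to the expected type, stop the watcher and build a new one.
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/watch"
)

// newWatcher is a placeholder for the code that (re)creates the watch on
// TunnelEndpoint or Advertisement resources with the real client.
func newWatcher() (watch.Interface, error) {
	return nil, fmt.Errorf("placeholder: create the watch with the real client here")
}

func runWatcher(w watch.Interface) {
	for event := range w.ResultChan() {
		obj, ok := event.Object.(*unstructured.Unstructured)
		if !ok {
			// Casting error: stop this watcher and start a fresh one
			// instead of silently dropping events.
			w.Stop()
			replacement, err := newWatcher()
			if err != nil {
				fmt.Println("unable to recreate watcher:", err)
				return
			}
			go runWatcher(replacement)
			return
		}
		fmt.Println("handling event", event.Type, "for", obj.GetName())
	}
}

func main() {
	w, err := newWatcher()
	if err != nil {
		fmt.Println(err)
		return
	}
	runWatcher(w)
}
```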
Is your feature request related to a problem? Please describe.
So far, incoming advertisements are automatically accepted. It would be more effective to have an advertisement acceptance configuration, letting the user accept or refuse incoming advertisements.
Policies:
Describe the bug
The remote watcher in the broadcaster module fails to update the PeeringRequest with the status of the Advertisement (Accepted/Refused).
To Reproduce
Steps to reproduce the behavior:
kubectl logs -n liqo broadcaster-<clusterID>
Expected behavior
The PeeringRequest should be updated with the status of the Advertisement.
Describe the bug
When Liqo is uninstalled using the provided script (with --deleteCrd enabled), the procedure hangs forever. In particular, the CRDs networkconfigs.net.liqo.io and tunnelendpoints.net.liqo.io fail to be deleted, since the corresponding resources are associated with finalizers that no longer exist (all pods have already been deleted in the previous step).
To Reproduce
Steps to reproduce the behavior:
Uninstall Liqo using the provided script (with --deleteCrd)
Expected behavior
The uninstallation should be completed correctly
Describe the bug
After joining a node, the tunnel-operator experiences a crash while the TunnelEndpoint is not ready.
kubectl get po -n liqo -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
advertisement-operator-65cc9bb44f-7qhsm 1/1 Running 0 7m51s 10.200.1.5 liqo2-worker <none> <none>
crdreplicator-operator-6877454c8c-dltvt 1/1 Running 0 7m51s 10.200.1.6 liqo2-worker <none> <none>
discovery-7c747664c6-7b8fl 1/1 Running 0 7m51s 172.18.0.5 liqo2-worker <none> <none>
liqo-dashboard-7c955d968f-jfpwz 1/1 Running 0 7m51s 10.200.1.7 liqo2-worker <none> <none>
peering-request-operator-5f59d778c7-74mng 0/1 PodInitializing 0 7m50s 10.200.1.8 liqo2-worker <none> <none>
podmutator-64b9588fb-g98bf 0/1 PodInitializing 0 7m51s 10.200.1.2 liqo2-worker <none> <none>
route-operator-fvxft 1/1 Running 0 7m51s 172.18.0.4 liqo2-control-plane <none> <none>
route-operator-h6fzr 1/1 Running 0 7m51s 172.18.0.5 liqo2-worker <none> <none>
schedulingnode-operator-75474f96cc-cwhhj 1/1 Running 0 7m51s 10.200.1.3 liqo2-worker <none> <none>
tunnel-operator-76466f5bd9-j79nd 0/1 Error 0 7m51s 172.18.0.5 liqo2-worker <none> <none>
tunnelendpointcreator-operator-68dc8f78d-nczsv 1/1 Running 0 7m51s 10.200.1.4 liqo2-worker <none> <none>
virtual-kubelet-6575d0b9-6fba-4f7d-b890-a6417009cb64-6db56xhwmv 0/1 Init:0/1 0 8s <none> liqo2-worker <none> <none>
kubectl logs -n liqo tunnel-operator-76466f5bd9-j79nd
2020-09-19T09:12:31.402Z INFO controller-runtime.metrics metrics server is starting to listen {"addr": ":0"}
2020-09-19T09:12:31.408Z INFO setup Starting manager as Tunnel-Operator
2020-09-19T09:12:31.409Z INFO controller-runtime.manager starting metrics server {"path": "/metrics"}
2020-09-19T09:12:31.409Z INFO controller Starting EventSource {"reconcilerGroup": "net.liqo.io", "reconcilerKind": "TunnelEndpoint", "controller": "tunnelendpoint", "source": "kind source: /, Kind="}
2020-09-19T09:12:31.510Z INFO controller Starting Controller {"reconcilerGroup": "net.liqo.io", "reconcilerKind": "TunnelEndpoint", "controller": "tunnelendpoint"}
2020-09-19T09:12:31.510Z INFO controller Starting workers {"reconcilerGroup": "net.liqo.io", "reconcilerKind": "TunnelEndpoint", "controller": "tunnelendpoint", "worker count": 1}
2020-09-19T09:15:34.735Z DPANIC liqonetOperators.TunnelEndpoint odd number of arguments passed as key-value pairs for logging {"endpoint": "/tun-endpoint-6575d0b9-6fba-4f7d-b890-a6417009cb64", "ignored key": "is not ready"}
github.com/go-logr/zapr.handleFields
/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:106
github.com/go-logr/zapr.(*infoLogger).Info
/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:70
github.com/liqotech/liqo/internal/liqonet.(*TunnelController).Reconcile
/go/src/github.com/liqotech/liqo/internal/liqonet/tunnel-operator.go:59
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:209
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:188
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.Until
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90
E0919 09:15:34.736059 1 runtime.go:76] Observed a panic: odd number of arguments passed as key-value pairs for logging
goroutine 233 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x15d99e0, 0xc0005b4220)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x82
panic(0x15d99e0, 0xc0005b4220)
/usr/local/go/src/runtime/panic.go:969 +0x166
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc0002c8420, 0xc000048f40, 0x1, 0x1)
/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:230 +0x545
go.uber.org/zap.(*Logger).DPanic(0xc000652660, 0x188062f, 0x3d, 0xc000048f40, 0x1, 0x1)
/go/pkg/mod/go.uber.org/[email protected]/logger.go:215 +0x7f
github.com/go-logr/zapr.handleFields(0xc000652660, 0xc0002cc090, 0x3, 0x3, 0x0, 0x0, 0x0, 0x30, 0x15e97e0, 0x1)
/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:106 +0x5ce
github.com/go-logr/zapr.(*infoLogger).Info(0xc000117668, 0x183877b, 0xc, 0xc0002cc090, 0x3, 0x3)
/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:70 +0xb1
github.com/liqotech/liqo/internal/liqonet.(*TunnelController).Reconcile(0xc000367040, 0x0, 0x0, 0xc00034e680, 0x31, 0xc0001175c0, 0xc0005ee2d0, 0xc0005ee248, 0xc0005ee240)
/go/src/github.com/liqotech/liqo/internal/liqonet/tunnel-operator.go:59 +0x27f
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0000ca5a0, 0x16ae100, 0xc000117580, 0x0)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 +0x284
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0000ca5a0, 0x203000)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:209 +0xae
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0000ca5a0)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:188 +0x2b
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00064e930)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00064e930, 0x1a60cc0, 0xc00052b890, 0x1, 0xc0004e6c00)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156 +0xa3
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00064e930, 0x3b9aca00, 0x0, 0x1, 0xc0004e6c00)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc00064e930, 0x3b9aca00, 0xc0004e6c00)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:170 +0x411
panic: odd number of arguments passed as key-value pairs for logging [recovered]
panic: odd number of arguments passed as key-value pairs for logging
goroutine 233 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0x105
panic(0x15d99e0, 0xc0005b4220)
/usr/local/go/src/runtime/panic.go:969 +0x166
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc0002c8420, 0xc000048f40, 0x1, 0x1)
/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:230 +0x545
go.uber.org/zap.(*Logger).DPanic(0xc000652660, 0x188062f, 0x3d, 0xc000048f40, 0x1, 0x1)
/go/pkg/mod/go.uber.org/[email protected]/logger.go:215 +0x7f
github.com/go-logr/zapr.handleFields(0xc000652660, 0xc0002cc090, 0x3, 0x3, 0x0, 0x0, 0x0, 0x30, 0x15e97e0, 0x1)
/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:106 +0x5ce
github.com/go-logr/zapr.(*infoLogger).Info(0xc000117668, 0x183877b, 0xc, 0xc0002cc090, 0x3, 0x3)
/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:70 +0xb1
github.com/liqotech/liqo/internal/liqonet.(*TunnelController).Reconcile(0xc000367040, 0x0, 0x0, 0xc00034e680, 0x31, 0xc0001175c0, 0xc0005ee2d0, 0xc0005ee248, 0xc0005ee240)
/go/src/github.com/liqotech/liqo/internal/liqonet/tunnel-operator.go:59 +0x27f
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0000ca5a0, 0x16ae100, 0xc000117580, 0x0)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 +0x284
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0000ca5a0, 0x203000)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:209 +0xae
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0000ca5a0)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:188 +0x2b
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00064e930)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00064e930, 0x1a60cc0, 0xc00052b890, 0x1, 0xc0004e6c00)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156 +0xa3
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00064e930, 0x3b9aca00, 0x0, 0x1, 0xc0004e6c00)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc00064e930, 0x3b9aca00, 0xc0004e6c00)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:170 +0x411
Describe the bug
After creating a SearchDomain CR to trigger the discovery process, the peering process seems to be completed correctly. The virtual-kubelet pod is created, but it remains stuck in the Init status, waiting for the approval of a CSR. Once the CSR is manually approved, the pod starts running and the virtual node is created.
$ kubectl logs -n liqo virtual-kubelet-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9-658dfxfcpj -c crt-generator
/etc/virtual-kubelet/certs
2020/09/10 09:34:15 [INFO] generate received request
2020/09/10 09:34:15 [INFO] received CSR
2020/09/10 09:34:15 [INFO] generating key: ecdsa-256
2020/09/10 09:34:15 [INFO] encoded CSR
certificatesigningrequest.certificates.k8s.io/virtual-kubelet-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9-658dfxfcpj created
Wait for CSR to be signed
Waiting for approval of CSR: virtual-kubelet-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9-658dfxfcpj
To Reproduce
Steps to reproduce the behavior:
Create a SearchDomain resource to trigger the discovery process
Expected behavior
No manual intervention should be required.
Additional context
apiVersion: discovery.liqo.io/v1alpha1
kind: SearchDomain
metadata:
  name: ***.polito.it
spec:
  domain: ***.polito.it
  autojoin: true
After Liqo installation, Kindnet crashes stating it is unable to reconcile routes. This is the function that seems to crash dealing with Liqonet: https://github.com/kubernetes-sigs/kind/blob/d7f948dd8c00084d6ee30eb953471ce3ce375455/images/kindnetd/cmd/kindnetd/main.go#L137
KindNet Logs:
I1004 11:25:59.390304 1 main.go:65] hostIP = 172.18.0.6
podIP = 172.18.0.6
I1004 11:25:59.390670 1 main.go:74] setting mtu 1500 for CNI
I1004 11:25:59.778503 1 main.go:168] Handling node with IP: 10.200.1.14
I1004 11:25:59.778531 1 main.go:169] Node liqo-3dd6f45d-b53a-4f67-8416-505c4d5ddba5 has CIDR 10.200.3.0/24
I1004 11:25:59.778830 1 main.go:124] Failed to reconcile routes, retrying after error: network is unreachable
I1004 11:25:59.778843 1 main.go:168] Handling node with IP: 10.200.1.14
I1004 11:25:59.778849 1 main.go:169] Node liqo-3dd6f45d-b53a-4f67-8416-505c4d5ddba5 has CIDR 10.200.3.0/24
I1004 11:25:59.778936 1 main.go:124] Failed to reconcile routes, retrying after error: network is unreachable
I1004 11:26:00.779140 1 main.go:168] Handling node with IP: 10.200.1.14
I1004 11:26:00.779363 1 main.go:169] Node liqo-3dd6f45d-b53a-4f67-8416-505c4d5ddba5 has CIDR 10.200.3.0/24
I1004 11:26:00.779707 1 main.go:124] Failed to reconcile routes, retrying after error: network is unreachable
I1004 11:26:02.779986 1 main.go:168] Handling node with IP: 10.200.1.14
I1004 11:26:02.780022 1 main.go:169] Node liqo-3dd6f45d-b53a-4f67-8416-505c4d5ddba5 has CIDR 10.200.3.0/24
I1004 11:26:02.780227 1 main.go:124] Failed to reconcile routes, retrying after error: network is unreachable
I1004 11:26:05.780436 1 main.go:168] Handling node with IP: 10.200.1.14
I1004 11:26:05.780464 1 main.go:169] Node liqo-3dd6f45d-b53a-4f67-8416-505c4d5ddba5 has CIDR 10.200.3.0/24
I1004 11:26:05.780638 1 main.go:124] Failed to reconcile routes, retrying after error: network is unreachable
panic: Maximum retries reconciling node routes: network is unreachable
goroutine 1 [running]:
main.main()
/go/src/cmd/kindnetd/main.go:128 +0x893
Ip Route:
default via 172.18.0.1 dev eth0
10.75.0.0/16 via 192.168.200.7 dev liqonet
10.141.0.0/16 via 192.168.200.7 dev liqonet
10.200.0.2 dev vethc8f0b0ee scope host
10.200.0.3 dev veth7ad31e05 scope host
10.200.0.4 dev veth6292c0c6 scope host
10.200.1.0/24 via 172.18.0.7 dev eth0
172.18.0.0/16 dev eth0 proto kernel scope link src 172.18.0.6
192.168.200.0/24 dev liqonet proto kernel scope link src 192.168.200.6
Iptables:
root@liqo-cluster1-control-plane:/# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
LIQO-INPUT udp -- anywhere anywhere udp
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
KUBE-EXTERNAL-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes externally-visible service portals */
KUBE-FIREWALL all -- anywhere anywhere
Chain FORWARD (policy ACCEPT)
target prot opt source destination
LIQO-FORWARD all -- anywhere anywhere
KUBE-FORWARD all -- anywhere anywhere /* kubernetes forwarding rules */
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-SERVICES all -- anywhere anywhere ctstate NEW /* kubernetes service portals */
KUBE-FIREWALL all -- anywhere anywhere
Chain KUBE-EXTERNAL-SERVICES (1 references)
target prot opt source destination
Chain KUBE-FIREWALL (2 references)
target prot opt source destination
DROP all -- anywhere anywhere /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
DROP all -- !127.0.0.0/8 127.0.0.0/8 /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT
Chain KUBE-FORWARD (1 references)
target prot opt source destination
DROP all -- anywhere anywhere ctstate INVALID
ACCEPT all -- anywhere anywhere /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT all -- anywhere anywhere /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED
Chain KUBE-KUBELET-CANARY (0 references)
target prot opt source destination
Chain KUBE-PROXY-CANARY (0 references)
target prot opt source destination
Chain KUBE-SERVICES (3 references)
target prot opt source destination
Chain LIQO-FORWARD (1 references)
target prot opt source destination
LIQO-FRWD-CLS-f38be2c9 all -- anywhere 10.141.0.0/16
LIQO-FRWD-CLS-3dd6f45d all -- anywhere 10.75.0.0/16
Chain LIQO-FRWD-CLS-3dd6f45d (1 references)
target prot opt source destination
ACCEPT all -- anywhere 10.75.0.0/16
Chain LIQO-FRWD-CLS-f38be2c9 (1 references)
target prot opt source destination
ACCEPT all -- anywhere 10.141.0.0/16
Chain LIQO-INPT-CLS-3dd6f45d (1 references)
target prot opt source destination
ACCEPT all -- 10.200.0.0/16 10.75.0.0/16
Chain LIQO-INPT-CLS-f38be2c9 (1 references)
target prot opt source destination
ACCEPT all -- 10.200.0.0/16 10.141.0.0/16
Chain LIQO-INPUT (1 references)
target prot opt source destination
ACCEPT udp -- anywhere anywhere udp dpt:4789
LIQO-INPT-CLS-f38be2c9 all -- anywhere 10.141.0.0/16
LIQO-INPT-CLS-3dd6f45d all -- anywhere 10.75.0.0/16
Is your feature request related to a problem? Please describe.
Add a "need help" section in getting started and a link at the end of each user documentation page.
Describe the bug
Whatever LIQO_VERSION is selected, the dashboard is always installed using the latest tag. This is inconsistent with the rest of the components and raises the usual concerns regarding the usage of the latest tag (e.g. a restart may cause a new version to be used).
To Reproduce
curl -sL https://raw.githubusercontent.com/liqotech/liqo/master/install.sh | LIQO_VERSION=... bash
kubectl get po liqo-dashboard-7977f68bc4-ml4sg -o yaml | grep ' image:'
Expected behavior
Since the dashboard is located in a different repository, the commit SHA will never be the same as that of the other components. However, I believe that the tagged versions should be kept aligned between the two repos to avoid confusion, while for installations from master the latest commit tag should be retrieved and used, as for the other components.
Is your feature request related to a problem? Please describe.
So far, the Liqo implementation of virtual-kubelet still misses support for several features:
Virtual Kubelet Interface:
Reflector Issues:
Liqo Exploration:
Is your feature request related to a problem? Please describe.
So far, it is very complex to correctly uninstall all the liqo resources.
Describe the solution you'd like
Similarly to the install.sh script, we should have an uninstall.sh script to safely remove all Liqo objects.
Describe alternatives you've considered
Liqo uninstallation is currently cumbersome: it requires having helm installed and the Liqo repository downloaded.
Steps:
Describe the bug
Due to the change introduced by EndpointSlices in Kubernetes 1.19, endpoint reflection is not working anymore. This is mainly due to kube-proxy, which now uses EndpointSlices to configure services instead of Endpoints.
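For reference, a sketch of how endpoints could be read through the EndpointSlice API (discovery.k8s.io/v1beta1, the version served by Kubernetes 1.19); the service name is illustrative and this is not the Liqo reflector code.

```go
// Sketch: list the EndpointSlices of a service instead of its Endpoints
// (discovery.k8s.io/v1beta1, as available in Kubernetes 1.19).
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// EndpointSlices carry the owning service name in this well-known label.
	// "my-service" is an illustrative placeholder.
	slices, err := client.DiscoveryV1beta1().EndpointSlices("default").List(context.TODO(),
		metav1.ListOptions{LabelSelector: "kubernetes.io/service-name=my-service"})
	if err != nil {
		panic(err)
	}
	for _, slice := range slices.Items {
		for _, ep := range slice.Endpoints {
			fmt.Println("endpoint addresses:", ep.Addresses)
		}
	}
}
```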
Is your feature request related to a problem? Please describe.
The advertisement is the base resource to establish peering between clusters. It represents the basic resource used to create the virtual nodes. So far, we have a pretty simple approach to generate the Advertisement.
Describe the solution you'd like
We should make it possible to:
So far, many Liqo components rely on a "cluster admin" ClusterRole, which is not always necessary. We should move to better-tailored ClusterRoles for each component.
Components Affected:
Some reflection improvements that could handle some corner cases or complete the event management:
Is your feature request related to a problem? Please describe.
The name of the ConfigMap that contains the clusterID is hardcoded. Throughout the code, wherever the ConfigMap is needed, its name is hardcoded as well.
Describe the solution you'd like
The possibility to set the name of the ConfigMap in the clusterConfig CRD, so that all the components that need the clusterID can find the name of the ConfigMap in the configuration.
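A small sketch of the idea, with a hypothetical ClusterConfig field and data key, is shown below.

```go
// Sketch: read the name of the ConfigMap holding the clusterID from the
// cluster configuration instead of hard-coding it. The ClusterConfig field
// and the data key used here are hypothetical.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// hypotheticalClusterConfigSpec mimics the relevant part of the clusterConfig CRD.
type hypotheticalClusterConfigSpec struct {
	ClusterIDConfigMapName string // e.g. "cluster-id"
}

func clusterIDFrom(client kubernetes.Interface, namespace string, cfg hypotheticalClusterConfigSpec) (string, error) {
	cm, err := client.CoreV1().ConfigMaps(namespace).Get(context.TODO(), cfg.ClusterIDConfigMapName, metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	return cm.Data["cluster-id"], nil // hypothetical data key
}

func main() {
	restCfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(restCfg)

	id, err := clusterIDFrom(client, "liqo", hypotheticalClusterConfigSpec{ClusterIDConfigMapName: "cluster-id"})
	if err != nil {
		panic(err)
	}
	fmt.Println("clusterID:", id)
}
```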
The network configuration is exchanged between two peering clusters using the advertisement.protocol.liqo.io CRD. This was fine before the discovery protocol was implemented. In some cases, the advertisement.protocol.liqo.io CRD is not symmetrically exchanged between two peering clusters, leading to a state where one of the two clusters is missing the network configuration of the other cluster. The following pictures depict the three different states:
A possible solution could be to separate the network parameters from the advertisement protocol and exchange them using a different CRD, obtaining in this way a symmetric sharing of the network configuration between the two peering clusters.
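A rough sketch of what such a dedicated CRD could look like as Go types; the field names are hypothetical and do not reflect the actual NetworkConfig schema referenced below (#154).

```go
// Hypothetical Go types for a dedicated network-configuration CRD.
// Field names are illustrative, not the actual liqo NetworkConfig schema.
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// NetworkConfigSpec is filled by the local cluster and reflected to the peer.
type NetworkConfigSpec struct {
	ClusterID      string `json:"clusterID"`      // sender cluster
	PodCIDR        string `json:"podCIDR"`        // pod network of the sender
	TunnelPublicIP string `json:"tunnelPublicIP"` // endpoint used to build the tunnel
}

// NetworkConfigStatus is filled by the receiving cluster and reflected back.
type NetworkConfigStatus struct {
	Processed    bool   `json:"processed"`
	NATRemapping string `json:"natRemapping,omitempty"` // remapped podCIDR, if any
}

// NetworkConfig carries the network parameters of one cluster towards a peer.
type NetworkConfig struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   NetworkConfigSpec   `json:"spec,omitempty"`
	Status NetworkConfigStatus `json:"status,omitempty"`
}
```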
At this point the same steps are performed by the peering cluster, and in Cluster 1 we have two CRDs: TEP1->2 and TEP2->1. The first one contains the information of the local cluster sent to the peering cluster in the spec section, while in the status we have the NATing information given by the remote cluster (Cluster 2). The second one has the network parameters of Cluster 2 in its spec. Combining the status of TEP1->2 and the spec of TEP2->1 we have all the information needed to establish a connection with the remote Cluster 2.
The Dispatchers are in charge of reflecting the changes of the TEP CRD between the two clusters. Having a bidirectional communication channel with a remote cluster makes it possible to reflect only the spec section of a local CRD toward the remote cluster. The other channel is used to reflect the status of the copy of the local CRD, which lives in the remote cluster, into the local cluster. The blue arrows indicate the connection that handles only the spec fields, and the red arrows the connection that handles the status fields.
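A minimal sketch of this "spec flows outwards, status flows back" rule on unstructured objects (an illustration of the idea, not the CRDReplicator code):

```go
// Sketch of the replication rule: the local cluster owns the spec, the remote
// cluster owns the status. Illustration only, not the actual CRDReplicator.
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// reflectSpec copies the spec of the local copy into the remote copy.
func reflectSpec(local, remote *unstructured.Unstructured) error {
	spec, found, err := unstructured.NestedMap(local.Object, "spec")
	if err != nil || !found {
		return err
	}
	return unstructured.SetNestedMap(remote.Object, spec, "spec")
}

// reflectStatus copies the status written by the remote cluster back locally.
func reflectStatus(local, remote *unstructured.Unstructured) error {
	status, found, err := unstructured.NestedMap(remote.Object, "status")
	if err != nil || !found {
		return err
	}
	return unstructured.SetNestedMap(local.Object, status, "status")
}

func main() {
	local := &unstructured.Unstructured{Object: map[string]interface{}{
		"spec": map[string]interface{}{"podCIDR": "10.200.0.0/16"},
	}}
	remote := &unstructured.Unstructured{Object: map[string]interface{}{
		"status": map[string]interface{}{"natRemapping": "10.75.0.0/16"},
	}}

	_ = reflectSpec(local, remote)   // blue arrow: spec flows outwards
	_ = reflectStatus(local, remote) // red arrow: status flows back

	fmt.Println("remote spec:", remote.Object["spec"])
	fmt.Println("local status:", local.Object["status"])
}
```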
NetworkConfig CRD(#154)
Init PR of CRDReplicator(#162)
CRDReplicator enhancement 1(#195)
CRDReplicator enhancement 2(#223)
Update tunnelEndpointCreator(#218)
Is your feature request related to a problem? Please describe.
As discussed in the last meetings, most APIs are still in version v1, which is kind of premature in this phase. They should move to v1alpha1.
Families:
Describe the bug
Upon the reboot of a node in the cluster, routes for external traffic are not re-created in Liqo.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Routes should be re-created after a node becomes available again following a reboot.
This issue describes the virtual kubelet lifecycle and the pod lifecycle. The last section ends up with a bullet list of the known problems related to the remote pod status reconciliation that can be updated and should be solved in the next PRs.
At boot time, the virtual kubelet fetches from etcd a CR of kind namespacenattingtable (or creates it if it doesn't exist) that contains the natting table of the namespaces for the given virtual node, i.e., the translation between local namespaces and remote namespaces. Every time a new entry is added to this natting table, a new reflection routine for that namespace is triggered; this routine implies the reflection of:
service
endpoints
configmap
secret
The reflection of the resource implies that each local resource is translated (if needed) and reflected remotely, such that a pod in the remote namespace has a complete view of the local namespace resources as if it was local.
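For illustration only (the naming scheme is hypothetical, not necessarily the one Liqo uses), a natting entry can be thought of as a deterministic mapping from a local namespace to a remote one:

```go
// Sketch of a namespace "natting" entry: a local namespace is mapped to a
// remote namespace name. The naming scheme below is hypothetical.
package main

import "fmt"

// NamespaceNattingEntry mirrors one row of the natting table.
type NamespaceNattingEntry struct {
	LocalNamespace  string
	RemoteNamespace string
}

// remoteNamespaceFor derives a remote namespace from the local one and the
// home cluster ID, so that names from different clusters cannot collide.
func remoteNamespaceFor(localNamespace, homeClusterID string) string {
	return fmt.Sprintf("%s-%s", localNamespace, homeClusterID)
}

func main() {
	entry := NamespaceNattingEntry{
		LocalNamespace:  "liqo-demo",
		RemoteNamespace: remoteNamespaceFor("liqo-demo", "d9df783b"),
	}
	fmt.Printf("reflect service/endpoints/configmap/secret objects from %q to %q\n",
		entry.LocalNamespace, entry.RemoteNamespace)
}
```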
The remote pod-watcher is a routine that listens for all the events related to a remotely offloaded pod in a given translated namespace; this is needed to reconcile the remote status with the local one, such that the local cluster always knows in which state each offloaded pod is. There are some remote status transitions that trigger the providerFailed status in the local pod instance: providerFailed means that the local status cannot be correctly updated because of an unrecognized remote status transition. We need to investigate more deeply to understand when and why this status is triggered, and to avoid it as much as possible.
The currently known reasons that trigger this status are:
RouteOperator
TunnelOperator
All modules