ovirt / csi-driver
oVirt's CSI driver for Openshift and Kubernetes
License: Other
Hey everyone,
So I've had a couple of occasions where I restarted a pod and thought I was going crazy when it came back up with an empty volume. I had previously chalked it up to buggy containers or human error. I've just been having a play with deploying something and caught the csi-node pod formatting (or at least attempting to format) the PVC multiple times, with mkfs failing. The net result for this specific PV is that the data is wiped.
Logs: ovirt-csi-node-qh4gg-ovirt-csi-driver.log
In order to repeat the behaviour:
0) Have an OKD 4.4-beta4 cluster running on oVirt 4.3.9 (I don't think the oVirt version matters here)
I hope I am just being an idiot and doing something wrong, but I can't for the life of me spot it. I have nothing on the cluster that differs from the defaults for the SC or PVC, and I have seen this behavior on multiple clusters.
Cheers
Craig
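For context on the bug above: node plugins typically guard NodeStageVolume against reformatting by probing for an existing filesystem signature before running mkfs. Below is a minimal sketch of such a check, assuming blkid is available on the node; the function name and wiring are illustrative, not the driver's actual code.

package main

import (
    "bytes"
    "fmt"
    "os/exec"
)

// hasFilesystem reports whether the block device already carries a
// filesystem signature, using blkid. blkid exits non-zero with empty
// output when no signature is found, which we treat as "safe to format".
func hasFilesystem(device string) (bool, error) {
    out, err := exec.Command("blkid", "-o", "value", "-s", "TYPE", device).CombinedOutput()
    trimmed := bytes.TrimSpace(out)
    if err != nil {
        if len(trimmed) == 0 {
            return false, nil
        }
        return false, fmt.Errorf("blkid %s: %v: %s", device, err, trimmed)
    }
    return len(trimmed) > 0, nil
}

func main() {
    formatted, err := hasFilesystem("/dev/sdb") // hypothetical device path
    fmt.Println(formatted, err)
}

If a signature is present, staging should skip mkfs entirely and just mount, which would prevent the data wipe described above.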
It'd be great to document how to request a PVC using this CSI driver.
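In the meantime, a minimal claim against this driver might look like the following sketch. The name example-pvc is arbitrary, and the storageClassName ovirt-csi-sc is taken from the error logs further down this page; adjust both to your deployment.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  storageClassName: ovirt-csi-sc   # assumed; use your cluster's oVirt storage class
  accessModes:
    - ReadWriteOnce                # RWX is reportedly unsupported, see below
  resources:
    requests:
      storage: 1Gi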
Hi all,
Sorry if this isn't the right place for this bug ticket. I am using @rgolangh's quay.io ovirt-csi-driver image, which was updated today. Part of the update seems to have broken the ovirt-csi-driver container from running in the pod, with the error:
Failed to pull image "quay.io/rgolangh/ovirt-csi-driver:latest": rpc error: code = Unknown desc = Image operating system mismatch: image uses "", expecting "linux"
Many thanks
Craig
Hello, is this repository up to date, or is it being replaced by https://github.com/openshift/ovirt-csi-driver?
In internal/ovirt/ovirt.go the connection is done this way:
func newOvirtConnection() (*ovirtsdk.Connection, error) {
    ovirtConfig, err := GetOvirtConfig()
    if err != nil {
        return nil, err
    }
    connection, err := ovirtsdk.NewConnectionBuilder().
        URL(ovirtConfig.URL).
        Username(ovirtConfig.Username).
        Password(ovirtConfig.Password).
        CAFile(ovirtConfig.CAFile).
        Insecure(ovirtConfig.Insecure).
        Build()
    if err != nil {
        return nil, err
    }
    return connection, nil
}
A valid managed ovirt-secret has the following components:
$ oc get -o yaml secret ovirt-credentials
apiVersion: v1
data:
  ovirt_ca_bundle: LS0tLS1...
  ovirt_cafile: ""
  ovirt_insecure: Zm...
  ovirt_password: cm....
  ovirt_url: aHR....
  ovirt_username: b2N...
The connection fails because the code passes ovirt_cafile (which is empty) to CAFile and ignores the embedded CA bundle in ovirt_ca_bundle.
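One possible workaround, sketched under the assumption that the config struct can be extended to carry the decoded ovirt_ca_bundle: write the bundle to a temporary file and hand that path to CAFile, since the SDK option only accepts a file path. The helper below is illustrative, not existing driver code.

package main

import (
    "fmt"
    "io/ioutil"
)

// caBundleToFile writes a decoded PEM bundle (e.g. the ovirt_ca_bundle
// key from the secret) to a temporary file, so its path can be passed
// to the SDK's CAFile option.
func caBundleToFile(bundle []byte) (string, error) {
    f, err := ioutil.TempFile("", "ovirt-ca-*.pem")
    if err != nil {
        return "", err
    }
    defer f.Close()
    if _, err := f.Write(bundle); err != nil {
        return "", err
    }
    return f.Name(), nil
}

func main() {
    path, err := caBundleToFile([]byte("-----BEGIN CERTIFICATE-----\n...")) // placeholder PEM
    fmt.Println(path, err)
}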
When trying to deploy WordPress on OKD 4.4 on oVirt 4.3 with the oVirt CSI driver, I get the following error after the PVC binds successfully:
MountVolume.MountDevice failed for volume "pvc-620d2db6-d845-4b4c-9220-d7e372d30397" : rpc error: code = Unknown desc = exit status 32lsblk failed with lsblk: .: not a block device
When scaling down a cluster by reducing the machineset replica count, e.g. from 6 to 3, any disks attached to the host at the time are deleted as part of removing the VM within oVirt. I suspect this is caused by a delay between the workload being de-scheduled and the VM removal process. I would expect disks to be detached before the machine is removed.
Okay, so the title is a bit misleading. I am seeing behavior where the oVirt hosted-engine hangs on a regular basis because garbage collection is not quick enough to handle the number of new sessions. This is being tracked as okd-project/okd#110.
That issue brings out a behaviour where, if any requests to attach/detach/create/remove disks are made via the oVirt CSI plugin while the engine is down, the hosted-engine is treated internally as unavailable until the ovirt-csi-driver and ovirt-csi-node pods are all restarted.
Volume creations time out with an API error (I will reproduce and paste the error in shortly).
pvc event log:
PersistentVolumeClaim: mysqltest (Namespace: aaa-test)
Mar 26, 4:43 pm, generated from persistentvolume-controller (3 times in the last few seconds):
waiting for a volume to be created, either by external provisioner "csi.ovirt.org" or manually created by system administrator

PersistentVolumeClaim: mysqltest (Namespace: aaa-test)
Mar 26, 4:43 pm, generated from csi.ovirt.org_ovirt-csi-plugin-0_5414b70e-3116-45db-978a-b0db3dff57ed (5 times in the last few seconds):
External provisioner is provisioning volume for claim "aaa-test/mysqltest"

PersistentVolumeClaim: mysqltest (Namespace: aaa-test)
Mar 26, 4:43 pm, generated from csi.ovirt.org_ovirt-csi-plugin-0_5414b70e-3116-45db-978a-b0db3dff57ed (5 times in the last few seconds):
failed to provision volume with StorageClass "ovirt-csi-sc": rpc error: code = Unknown desc = Tag not matched: expect <fault> but got <html>
Volume attach reports MountVolume.MountDevice failed for volume "pvc-9c80df7d-8f33-4cfc-8810-ea12b4006772" : rpc error: code = Unknown desc = exit status 32lsblk failed with lsblk: .: not a block device
It's worth noting that this behaviour starts when the hosted-engine is unavailable and continues until all the pods in ovirt-csi-driver are restarted.
csi-external-attacher log:
I0326 16:38:09.492850 1 csi_handler.go:111] "csi-61ca10562a339825f41f6dfe5550ff89848b3f42b0f704e9b51b9abcba3aa112" is already attached
I0326 16:38:09.493331 1 csi_handler.go:105] CSIHandler: finished processing "csi-61ca10562a339825f41f6dfe5550ff89848b3f42b0f704e9b51b9abcba3aa112"
I0326 16:38:09.492524 1 csi_handler.go:105] CSIHandler: finished processing "csi-04b75219a12949c66395fa4b85c742e25e028d7c40d9552cc7cec228f0312017"
I0326 16:38:09.524874 1 csi_handler.go:428] Saving detach error to "csi-7b64936be16eb0077d5933a80e0cd4b129f32c62a6d3d7ca8cd95753618260b5"
I0326 16:38:09.524917 1 csi_handler.go:428] Saving detach error to "csi-a1c3862511e8044fa46773f3499d70ba1c3f6f4a35ba1f1958b0cdbd0570d0df"
I0326 16:38:09.524874 1 csi_handler.go:428] Saving detach error to "csi-a833360681cc09e5eb167799f0f92176f7914aaf52f6bfe1241af4cc00ef9aa7"
I0326 16:38:09.609236 1 csi_handler.go:439] Saved detach error to "csi-7b64936be16eb0077d5933a80e0cd4b129f32c62a6d3d7ca8cd95753618260b5"
I0326 16:38:09.609336 1 csi_handler.go:99] Error processing "csi-7b64936be16eb0077d5933a80e0cd4b129f32c62a6d3d7ca8cd95753618260b5": failed to detach: rpc error: code = Unknown desc = Tag not matched: expect <fault> but got <html>
I0326 16:38:09.609925 1 controller.go:141] Ignoring VolumeAttachment "csi-a1c3862511e8044fa46773f3499d70ba1c3f6f4a35ba1f1958b0cdbd0570d0df" change
I0326 16:38:09.610107 1 controller.go:141] Ignoring VolumeAttachment "csi-7b64936be16eb0077d5933a80e0cd4b129f32c62a6d3d7ca8cd95753618260b5" change
I0326 16:38:09.610248 1 controller.go:141] Ignoring VolumeAttachment "csi-a833360681cc09e5eb167799f0f92176f7914aaf52f6bfe1241af4cc00ef9aa7" change
I0326 16:38:09.620577 1 csi_handler.go:439] Saved detach error to "csi-a833360681cc09e5eb167799f0f92176f7914aaf52f6bfe1241af4cc00ef9aa7"
I0326 16:38:09.620671 1 csi_handler.go:99] Error processing "csi-a833360681cc09e5eb167799f0f92176f7914aaf52f6bfe1241af4cc00ef9aa7": failed to detach: rpc error: code = Unknown desc = Tag not matched: expect <fault> but got <html>
I0326 16:38:09.621063 1 csi_handler.go:439] Saved detach error to "csi-a1c3862511e8044fa46773f3499d70ba1c3f6f4a35ba1f1958b0cdbd0570d0df"
I0326 16:38:09.621110 1 csi_handler.go:99] Error processing "csi-a1c3862511e8044fa46773f3499d70ba1c3f6f4a35ba1f1958b0cdbd0570d0df": failed to detach: rpc error: code = Unknown desc = Tag not matched: expect <fault> but got <html>
I0326 16:38:57.504856 1 reflector.go:370] k8s.io/client-go/informers/factory.go:133: Watch close - *v1beta1.VolumeAttachment total 6 items received
csi-plugin log:
E0326 16:36:42.106809 1 server.go:125] /csi.v1.Controller/ControllerUnpublishVolume returned with error: failed to find attachment by disk a772eca5-8fac-44c4-bbb3-b2fdd2f1ada7 for VM 3e2def01-8fd5-4b8f-8ddb-b208245360bf
E0326 16:36:42.108682 1 server.go:125] /csi.v1.Controller/ControllerUnpublishVolume returned with error: failed to find attachment by disk 7323fa9a-bd56-428a-af5c-fd6219363910 for VM 3e2def01-8fd5-4b8f-8ddb-b208245360bf
E0326 16:36:42.113144 1 server.go:125] /csi.v1.Controller/ControllerUnpublishVolume returned with error: failed to find attachment by disk 7323fa9a-bd56-428a-af5c-fd6219363910 for VM cd2f5c14-359c-4351-97ce-b52070603e61
I0326 16:38:09.497430 1 controller.go:136] Detaching Disk 7323fa9a-bd56-428a-af5c-fd6219363910 from VM 3e2def01-8fd5-4b8f-8ddb-b208245360bf
I0326 16:38:09.497431 1 controller.go:136] Detaching Disk 7323fa9a-bd56-428a-af5c-fd6219363910 from VM cd2f5c14-359c-4351-97ce-b52070603e61
I0326 16:38:09.497430 1 controller.go:136] Detaching Disk a772eca5-8fac-44c4-bbb3-b2fdd2f1ada7 from VM 3e2def01-8fd5-4b8f-8ddb-b208245360bf
E0326 16:38:09.523549 1 server.go:125] /csi.v1.Controller/ControllerUnpublishVolume returned with error: Tag not matched: expect <fault> but got <html>
E0326 16:38:09.524180 1 server.go:125] /csi.v1.Controller/ControllerUnpublishVolume returned with error: Tag not matched: expect <fault> but got <html>
E0326 16:38:09.524287 1 server.go:125] /csi.v1.Controller/ControllerUnpublishVolume returned with error: Tag not matched: expect <fault> but got <html>
ovirt-node log:
I0326 16:38:15.959322 1 node.go:89] Unmounting /var/lib/kubelet/pods/6c09c82b-5383-42d2-9950-eb4904efd5fc/volumes/kubernetes.io~csi/pvc-9c80df7d-8f33-4cfc-8810-ea12b4006772/mount
I0326 16:38:16.165727 1 node.go:40] Staging volume e19b402e-5783-4179-9f65-4bc289f44f62 with volume_id:"e19b402e-5783-4179-9f65-4bc289f44f62" staging_target_path:"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9c80df7d-8f33-4cfc-8810-ea12b4006772/globalmount" volume_capability:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:"storage.kubernetes.io/csiProvisionerIdentity" value:"1584894930586-8081-csi.ovirt.org" >
E0326 16:38:16.191257 1 server.go:125] /csi.v1.Node/NodeStageVolume returned with error: exit status 32lsblk failed with lsblk: .: not a block device
I0326 16:38:16.760689 1 node.go:40] Staging volume e19b402e-5783-4179-9f65-4bc289f44f62 with volume_id:"e19b402e-5783-4179-9f65-4bc289f44f62" staging_target_path:"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9c80df7d-8f33-4cfc-8810-ea12b4006772/globalmount" volume_capability:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:"storage.kubernetes.io/csiProvisionerIdentity" value:"1584894930586-8081-csi.ovirt.org" >
E0326 16:38:16.781994 1 server.go:125] /csi.v1.Node/NodeStageVolume returned with error: exit status 32lsblk failed with lsblk: .: not a block device
I0326 16:38:17.874876 1 node.go:40] Staging volume e19b402e-5783-4179-9f65-4bc289f44f62 with volume_id:"e19b402e-5783-4179-9f65-4bc289f44f62" staging_target_path:"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9c80df7d-8f33-4cfc-8810-ea12b4006772/globalmount" volume_capability:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:"storage.kubernetes.io/csiProvisionerIdentity" value:"1584894930586-8081-csi.ovirt.org" >
E0326 16:38:17.897662 1 server.go:125] /csi.v1.Node/NodeStageVolume returned with error: exit status 32lsblk failed with lsblk: .: not a block device
I0326 16:38:20.012277 1 node.go:40] Staging volume e19b402e-5783-4179-9f65-4bc289f44f62 with volume_id:"e19b402e-5783-4179-9f65-4bc289f44f62" staging_target_path:"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9c80df7d-8f33-4cfc-8810-ea12b4006772/globalmount" volume_capability:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:"storage.kubernetes.io/csiProvisionerIdentity" value:"1584894930586-8081-csi.ovirt.org" >
E0326 16:38:20.035193 1 server.go:125] /csi.v1.Node/NodeStageVolume returned with error: exit status 32lsblk failed with lsblk: .: not a block device
I0326 16:38:24.117533 1 node.go:40] Staging volume e19b402e-5783-4179-9f65-4bc289f44f62 with volume_id:"e19b402e-5783-4179-9f65-4bc289f44f62" staging_target_path:"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9c80df7d-8f33-4cfc-8810-ea12b4006772/globalmount" volume_capability:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:"storage.kubernetes.io/csiProvisionerIdentity" value:"1584894930586-8081-csi.ovirt.org" >
E0326 16:38:24.140299 1 server.go:125] /csi.v1.Node/NodeStageVolume returned with error: exit status 32lsblk failed with lsblk: .: not a block device
I0326 16:38:32.225132 1 node.go:40] Staging volume e19b402e-5783-4179-9f65-4bc289f44f62 with volume_id:"e19b402e-5783-4179-9f65-4bc289f44f62" staging_target_path:"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9c80df7d-8f33-4cfc-8810-ea12b4006772/globalmount" volume_capability:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:"storage.kubernetes.io/csiProvisionerIdentity" value:"1584894930586-8081-csi.ovirt.org" >
E0326 16:38:32.250849 1 server.go:125] /csi.v1.Node/NodeStageVolume returned with error: exit status 32lsblk failed with lsblk: .: not a block device
I would have expected the service to keep retrying the hosted-engine until it becomes available again, and then continue to function.
Hope this helps
Cheers
Craig
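For what it's worth, the retry behaviour expected above could be as simple as wrapping engine calls in exponential backoff so that a transient outage surfaces as a delayed success rather than a wedged driver. A minimal, dependency-free sketch follows; the call inside main is a hypothetical stand-in for an engine API request, not the plugin's actual code.

package main

import (
    "fmt"
    "time"
)

// withRetry retries op with exponential backoff, up to attempts tries,
// returning the last error if the operation never succeeds.
func withRetry(attempts int, initial time.Duration, op func() error) error {
    delay := initial
    var err error
    for i := 0; i < attempts; i++ {
        if err = op(); err == nil {
            return nil
        }
        time.Sleep(delay)
        delay *= 2
    }
    return fmt.Errorf("giving up after %d attempts: %v", attempts, err)
}

func main() {
    err := withRetry(5, time.Second, func() error {
        return fmt.Errorf("engine unavailable") // hypothetical engine API call
    })
    fmt.Println(err)
}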
I am running the csi-driver without OpenShift but have placed the secret in the namespace with the driver.
When attempting to provision the example, I am seeing the following issue.
The disk is provisioned in oVirt and attached to the worker the pod is scheduled on, but the ovirt-csi-driver container on that worker shows the following.
I0508 05:53:22.835356 1 node.go:134] Extracting pvc volume name aefb7ca6-ee05-485e-b186-9d93dd22f33a
I0508 05:53:22.868735 1 node.go:141] Extracted pvc volume name aefb7ca6-ee05-485e-b
E0508 05:53:22.869221 1 server.go:125] /csi.v1.Node/NodeStageVolume returned with error: lstat /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_aefb7ca6-ee05-485e-b: no such file or directory
I0508 05:55:24.908890 1 node.go:40] Staging volume aefb7ca6-ee05-485e-b186-9d93dd22f33a with volume_id:"aefb7ca6-ee05-485e-b186-9d93dd22f33a" staging_target_path:"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4c5f76b8-7b72-460c-b61a-5e6df5247298/globalmount" volume_capability:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:"storage.kubernetes.io/csiProvisionerIdentity" value:"1588915773341-8081-csi.ovirt.org" >
A shell in the ovirt-csi-driver container on that worker shows the following:
[root@ovirt-csi-node-4qcxg /]# ls /dev/disk/by-id/ | grep aefb7ca6
scsi-0QEMU_QEMU_HARDDISK_aefb7ca6-ee05-485e-b186-9d93dd22f33a
It looks like the driver is hitting an issue when extracting the name: it truncates the end of the ID and then fails to find the device.
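One defensive fix, offered as a sketch rather than the driver's actual code: scan /dev/disk/by-id and match entries against the full volume ID, instead of reconstructing a device name that may have been truncated along the way.

package main

import (
    "fmt"
    "path/filepath"
    "strings"
)

// findDeviceByVolumeID scans /dev/disk/by-id and matches entries against
// the full volume ID, avoiding assumptions about how many characters of
// the ID survive in the reconstructed device name.
func findDeviceByVolumeID(volumeID string) (string, error) {
    entries, err := filepath.Glob("/dev/disk/by-id/*")
    if err != nil {
        return "", err
    }
    for _, entry := range entries {
        if strings.Contains(filepath.Base(entry), volumeID) {
            return entry, nil
        }
    }
    return "", fmt.Errorf("no /dev/disk/by-id entry for volume %s", volumeID)
}

func main() {
    dev, err := findDeviceByVolumeID("aefb7ca6-ee05-485e-b186-9d93dd22f33a")
    fmt.Println(dev, err)
}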
The base image needs to include e2fsprogs, to allow running mkfs.ext4 and the tools for other filesystems.
It looks like the current version doesn't support RWX. The default storage class should be defined so that RWX requests are rejected with an error.
On a newly installed OKD 4.4 cluster on RHV 4.3.9, oc create -f deploy/csi-driver yields:
csidriver.storage.k8s.io/csi.ovirt.org created
namespace/openshift-ovirt-csi-operator created
clusterrolebinding.rbac.authorization.k8s.io/ovirt-csi-controller-provisioner-binding created
clusterrolebinding.rbac.authorization.k8s.io/ovirt-csi-controller-attacher-binding created
clusterrole.rbac.authorization.k8s.io/ovirt-csi-controller-cr created
clusterrole.rbac.authorization.k8s.io/ovirt-csi-node-cr created
clusterrole.rbac.authorization.k8s.io/openshift:csi-driver-controller-leader-election created
clusterrolebinding.rbac.authorization.k8s.io/ovirt-csi-controller-binding created
clusterrolebinding.rbac.authorization.k8s.io/ovirt-csi-leader-binding created
clusterrolebinding.rbac.authorization.k8s.io/ovirt-csi-node-binding created
clusterrolebinding.rbac.authorization.k8s.io/ovirt-csi-node-leader-binding created
credentialsrequest.cloudcredential.openshift.io/ovirt-csi-driver created
Error from server (NotFound): error when creating "deploy/csi-driver/020-autorization.yaml": namespaces "ovirt-csi-driver" not found
Error from server (NotFound): error when creating "deploy/csi-driver/020-autorization.yaml": namespaces "ovirt-csi-driver" not found
Error from server (NotFound): error when creating "deploy/csi-driver/030-node.yaml": namespaces "ovirt-csi-driver" not found
Error from server (NotFound): error when creating "deploy/csi-driver/040-controller.yaml": namespaces "ovirt-csi-driver" not found
Error from server (AlreadyExists): error when creating "deploy/csi-driver/060-credential-request.yaml": credentialsrequests.cloudcredential.openshift.io "ovirt-csi-driver" already exists
Should the manifests be adapted to create the ovirt-csi-driver namespace? Or should instances of that namespace be changed to openshift-ovirt-csi-operator? Also, I don't see an operator installed or created as a result of this command. Is that the expected behavior?
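If the first option is the intended one, the missing object is a plain namespace manifest, e.g. (a sketch, assuming the remaining manifests keep referencing ovirt-csi-driver):

apiVersion: v1
kind: Namespace
metadata:
  name: ovirt-csi-driver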