embercsi / ember-csi
Multi-vendor CSI plugin supporting over 80 storage drivers
License: Other
Make sure that the service can shut down gracefully.
Currently we have an abrupt shutdown on KeyboardInterrupt that gives no grace period for ongoing operations to complete.
We need a configurable grace period and the ability to shut down gracefully when the CO requests it.
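A minimal sketch of the idea with the grpc Python library, assuming a plain grpc.server() instance (the X_CSI_GRACEFUL_SHUTDOWN_TIMEOUT variable name is invented for illustration):

```python
import os
import signal
import threading
from concurrent import futures

import grpc

# Configurable grace period; the env var name is hypothetical.
GRACE = float(os.environ.get('X_CSI_GRACEFUL_SHUTDOWN_TIMEOUT', '30'))

server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
server.add_insecure_port('unix:///csi-data/csi.sock')
server.start()

stopped = threading.Event()

def _graceful_stop(signum, frame):
    # stop(grace) rejects new RPCs but lets in-flight ones finish for up
    # to GRACE seconds instead of killing them on KeyboardInterrupt.
    server.stop(GRACE).wait()
    stopped.set()

signal.signal(signal.SIGTERM, _graceful_stop)
signal.signal(signal.SIGINT, _graceful_stop)
stopped.wait()
```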
We should have a Kubernetes-specific metadata persistence plugin using CRDs.
This plugin could live inside the cinderlib-CSI driver or be an external Python library.
All concurrent requests share the same request ID: when we receive a new request it overwrites the request ID of the previous thread, and they all start reporting the same ID.
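One way to avoid the overwriting would be to keep the ID in thread-local storage, along the lines of this sketch (names are illustrative, not the actual Ember-CSI code):

```python
import itertools
import threading

_local = threading.local()   # one slot per thread
_counter = itertools.count()

def set_request_id():
    # Each request-handling thread gets its own ID instead of sharing
    # a single attribute that every new request overwrites.
    _local.request_id = 'req-%d' % next(_counter)

def get_request_id():
    return getattr(_local, 'request_id', 'req-unset')
```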
There is no CI, and we should add at least one CI job on Travis.
This job could use the LVM driver to run the tests.
When we do a ControllerPublishVolume we get the following warning:
/home/geguileo/code/reuse-cinder-drivers/cinderlib-csi/.venv/lib/python2.7/site-packages/oslo_versionedobjects/fields.py:368: FutureWarning: u'think.localdomain' is an invalid UUID. Using UUIDFields with invalid UUIDs is no longer supported, and will be removed in a future release. Please update your code to input valid UUIDs or accept ValueErrors for invalid UUIDs. See https://docs.openstack.org/oslo.versionedobjects/latest/reference/fields.html#oslo_versionedobjects.fields.UUIDField for further details
FutureWarning)
We shouldn't be getting any warnings unrelated to the CSI plugin.
The plugin doesn't have a logging mechanism, only a couple of print statements.
It should have a proper logging mechanism and more data should be logged.
The logging mechanism should be tied to the logging mechanism provided by cinderlib (coming from oslo.logging) if possible. That way even internal driver information will be logged.
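As a sketch of the direction (not the actual Ember-CSI setup), oslo.log can be configured once at startup so the plugin and the cinderlib/driver internals share the same handlers:

```python
from oslo_config import cfg
from oslo_log import log as logging

CONF = cfg.CONF
logging.register_options(CONF)    # register the standard oslo.log options
logging.setup(CONF, 'ember-csi')  # configure handlers for all oslo loggers

LOG = logging.getLogger(__name__)
LOG.info('Ember-CSI logging initialized')
```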
CSI v1.1 is out and includes the new Expand Volume feature. We should add this feature to Ember-CSI and also check whether any other changes to the spec require additional work.
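For reference, a hypothetical sketch of the new RPC; the `types` protobuf module and the `_get_vol` helper only mirror the style of the existing code and are not the actual implementation:

```python
GB = 1024 ** 3

def ControllerExpandVolume(self, request, context):
    vol = self._get_vol(volume_id=request.volume_id)  # hypothetical helper
    # Round the requested bytes up to whole GiB, as Cinder sizes are in GiB.
    new_size_gb = (request.capacity_range.required_bytes + GB - 1) // GB
    vol.extend(new_size_gb)  # cinderlib volumes support extend()
    return types.ControllerExpandVolumeResponse(
        capacity_bytes=new_size_gb * GB,
        node_expansion_required=True)
```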
Hi,
Using cinderlib/CSI with either XtremIO or RBD, I see the following issue: attach seems to work but mount fails with the following error:
MountVolume.MountDevice failed for volume "pvc-9bb7c9a98c3e11e8" : rpc error: code = Unavailable desc = grpc: the connection is unavailable
[root@kt-c7kb9 contrib]# oc get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
kclaim Bound pvc-9bb7c9a98c3e11e8 1Gi RWO cinderlib 3m
[root@kt-c7kb9 contrib]# oc get all
NAME READY STATUS RESTARTS AGE
pod/csi-cinderlib-controller-0 3/3 Running 0 4m
pod/csi-cinderlib-ds-hqm4w 2/2 Running 0 4m
pod/sleep 0/1 ContainerCreating 0 4m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/csi-cinderlib-ds 1 1 1 1 1 4m
NAME DESIRED CURRENT AGE
statefulset.apps/csi-cinderlib-controller 1 1 4m
[root@kt-c7kb9 contrib]# oc describe pod sleep
Name: sleep
Namespace: csi
Node: localhost/10.19.139.83
Start Time: Fri, 20 Jul 2018 17:02:26 +0000
Labels:
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"sleep","namespace":"csi"},"spec":{"containers":[{"args":["sleep","1000000"],"image...
openshift.io/scc=anyuid
Status: Pending
IP:
Containers:
busybox:
Container ID:
Image: busybox
Image ID:
Port:
Host Port:
Args:
sleep
1000000
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
Mounts:
/tmp/mysql from myclaim-mount (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-5krt9 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
myclaim-mount:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: kclaim
ReadOnly: false
default-token-5krt9:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-5krt9
Optional: false
QoS Class: BestEffort
Node-Selectors:
Tolerations:
Events:
Type Reason Age From Message
Normal Scheduled 4m default-scheduler Successfully assigned csi/sleep to localhost
Normal SuccessfulAttachVolume 4m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-9bb7c9a98c3e11e8"
Warning FailedMount 6s (x10 over 4m) kubelet, localhost MountVolume.MountDevice failed for volume "pvc-9bb7c9a98c3e11e8" : rpc error: code = Unavailable desc = grpc: the connection is unavailable
Warning FailedMount 6s (x2 over 2m) kubelet, localhost Unable to mount volumes for pod "sleep_csi(adda6e0b-8c3e-11e8-99ed-0017a4772454)": timeout expired waiting for volumes to attach or mount for pod "csi"/"sleep". list of unmounted volumes=[myclaim-mount]. list of unattached volumes=[myclaim-mount default-token-5krt9]
I also see the following in kubelet logs:
E0720 17:06:44.675459 7093 kubelet.go:1610] Unable to mount volumes for pod "sleep_csi(adda6e0b-8c3e-11e8-99ed-0017a4772454)": timeout expired waiting for volumes to attach or mount for pod "csi"/"sleep". list of unmounted volumes=[myclaim-mount]. list of unattached volumes=[myclaim-mount default-token-5krt9]; skipping pod
E0720 17:06:44.675494 7093 pod_workers.go:186] Error syncing pod adda6e0b-8c3e-11e8-99ed-0017a4772454 ("sleep_csi(adda6e0b-8c3e-11e8-99ed-0017a4772454)"), skipping: timeout expired waiting for volumes to attach or mount for pod "csi"/"sleep". list of unmounted volumes=[myclaim-mount]. list of unattached volumes=[myclaim-mount default-token-5krt9]
I0720 17:06:44.917838 7093 operation_generator.go:489] MountVolume.WaitForAttach entering for volume "pvc-9bb7c9a98c3e11e8" (UniqueName: "kubernetes.io/csi/com.redhat.cinderlib-csi^d8510041-dca7-4f10-8dd2-bf8612f1007d") pod "sleep" (UID: "adda6e0b-8c3e-11e8-99ed-0017a4772454") DevicePath "csi-bef4cd4d9c8fb774e7572baa224f885b2e3294a9f46fcdcf218c285114b8e859"
I0720 17:06:44.919864 7093 operation_generator.go:498] MountVolume.WaitForAttach succeeded for volume "pvc-9bb7c9a98c3e11e8" (UniqueName: "kubernetes.io/csi/com.redhat.cinderlib-csi^d8510041-dca7-4f10-8dd2-bf8612f1007d") pod "sleep" (UID: "adda6e0b-8c3e-11e8-99ed-0017a4772454") DevicePath "csi-bef4cd4d9c8fb774e7572baa224f885b2e3294a9f46fcdcf218c285114b8e859"
E0720 17:06:44.920299 7093 csi_attacher.go:282] kubernetes.io/csi: attacher.MountDevice failed to check STAGE_UNSTAGE_VOLUME: rpc error: code = Unavailable desc = grpc: the connection is unavailable
E0720 17:06:44.920372 7093 nestedpendingoperations.go:267] Operation for ""kubernetes.io/csi/com.redhat.cinderlib-csi^d8510041-dca7-4f10-8dd2-bf8612f1007d"" failed. No retries permitted until 2018-07-20 17:08:46.920347416 +0000 UTC m=+11877.628616092 (durationBeforeRetry 2m2s). Error: "MountVolume.MountDevice failed for volume "pvc-9bb7c9a98c3e11e8" (UniqueName: "kubernetes.io/csi/com.redhat.cinderlib-csi^d8510041-dca7-4f10-8dd2-bf8612f1007d") pod "sleep" (UID: "adda6e0b-8c3e-11e8-99ed-0017a4772454") : rpc error: code = Unavailable desc = grpc: the connection is unavailable"
Any idea what might be causing this?
Thanks
Kiran
All our testing relies on functional tests. We should also add unit tests to increase coverage.
The CI is already running the single existing unit test that we have, so all that's left to do is write the tests; the gate will run them automatically.
Block volume attach and detach are not working since we are making some assumptions that are not correct and were clarified in the CSI spec v1.0.
Our incorrect assumptions were:
In reality it is like this:
Per the CSI spec the name of a CSI driver must conform to the following requirements:
The name MUST follow domain name notation format (https://tools.ietf.org/html/rfc1035#section-2.3.1).
It SHOULD include the plugin's host company name and the plugin name, to minimize the possibility of collisions.
It MUST be 63 characters or less, beginning and ending with an alphanumeric character ([a-z0-9A-Z]) with dashes (-), dots (.), and alphanumerics between.
This field is REQUIRED.
CSI 0.x required reverse domain notation. But CSI 1.0 dropped the "reverse" part of the requirement. The expectation is that the name will be normal (forward) domain notation.
So for CSI 1.0, instead of io.ember-csi.[x], the driver name should be [x].ember-csi.io.
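A small validator for the quoted rules, as a standalone sketch:

```python
import re

# 63 chars max, alphanumeric at both ends, dashes/dots/alphanumerics between.
NAME_RE = re.compile(r'^[a-zA-Z0-9]([a-zA-Z0-9.-]{0,61}[a-zA-Z0-9])?$')

def valid_csi_name(name):
    return len(name) <= 63 and bool(NAME_RE.match(name))

assert valid_csi_name('ember-csi.io')
assert not valid_csi_name('.ember-csi.io')  # must start alphanumeric
```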
Hi @kirankt ,
I tried to test ember-csi following the document: https://github.com/kirankt/ember-csi/tree/master/examples/openshift. Once I created the PVC, the running csi-controller crashed, and the csi-driver and external-attacher can't run, so the PVC stays Pending.
Here is the ceph.conf and keyring: http://pastebin.test.redhat.com/643939
[root@qe-shlao-ceph-openstack-master-etcd-1 openshift]# oc describe pvc
Name: ember-csi-pvc
Namespace: ember-csi
StorageClass: ember-csi-sc
Status: Pending
Volume:
Labels: <none>
Annotations: volume.beta.kubernetes.io/storage-provisioner=io.ember-csi
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ExternalProvisioning 1m (x26 over 7m) persistentvolume-controller waiting for a volume to be created, either by external provisioner "io.ember-csi" or manually created by system administrator
[root@qe-shlao-ceph-openstack-master-etcd-1 openshift]# oc get all
NAME READY STATUS RESTARTS AGE
pod/csi-controller-0 2/3 CrashLoopBackOff 2 1m
pod/csi-node-gx8np 2/2 Running 0 1h
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/csi-node 3 1 1 1 1 <none> 1h
NAME DESIRED CURRENT AGE
statefulset.apps/csi-controller 1 1 1h
[root@qe-shlao-ceph-openstack-master-etcd-1 openshift]# oc describe pod/csi-controller-0
Name: csi-controller-0
Namespace: ember-csi
Priority: 0
PriorityClassName: <none>
Node: qe-shlao-ceph-openstack-node-1/172.16.122.87
Start Time: Mon, 10 Sep 2018 12:44:43 -0400
Labels: app=csi-controller
controller-revision-hash=csi-controller-6844c8bc
statefulset.kubernetes.io/pod-name=csi-controller-0
Annotations: openshift.io/scc=ember-csi-scc
Status: Running
IP: 172.16.122.87
Controlled By: StatefulSet/csi-controller
Containers:
external-provisioner:
Container ID: docker://e07dd8bfdabeee6e2020ee33c19cfd7736f4ba40379fc96aae56d6fec0d07b3f
Image: quay.io/k8scsi/csi-provisioner:v0.3.0
Image ID: docker-pullable://quay.io/k8scsi/csi-provisioner@sha256:d45e03c39c1308067fd46d69d8e01475cc0c9944c897f6eded4df07e75e5d3fb
Port: <none>
Host Port: <none>
Args:
--v=5
--provisioner=io.ember-csi
--csi-address=/csi-data/csi.sock
State: Running
Started: Mon, 10 Sep 2018 12:44:44 -0400
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/csi-data from socket-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from ember-csi-controller-sa-token-hb2zz (ro)
external-attacher:
Container ID: docker://e265819115d3012ec959ef8e89796b415e35d1b1e2b55b3d75d93db0bf870910
Image: quay.io/k8scsi/csi-attacher:v0.3.0
Image ID: docker-pullable://quay.io/k8scsi/csi-attacher@sha256:44b7d518e00d437fed9bdd6e37d3a9dc5c88ca7fc096ed2ab3af9d3600e4c790
Port: <none>
Host Port: <none>
Args:
--v=5
--csi-address=/csi-data/csi.sock
State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 10 Sep 2018 12:46:57 -0400
Finished: Mon, 10 Sep 2018 12:47:57 -0400
Ready: False
Restart Count: 2
Environment: <none>
Mounts:
/csi-data from socket-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from ember-csi-controller-sa-token-hb2zz (ro)
csi-driver:
Container ID: docker://962b98ee798f6b16d90bb78192dd1819d22115935e157a03791939beb9d6587d
Image: akrog/ember-csi:master
Image ID: docker-pullable://docker.io/akrog/ember-csi@sha256:331881bdb41d8004ca7960e730c495c6151dabbca2153d3f40266f62f9943c1d
Port: <none>
Host Port: <none>
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 10 Sep 2018 12:47:12 -0400
Finished: Mon, 10 Sep 2018 12:47:44 -0400
Ready: False
Restart Count: 3
Environment:
PYTHONUNBUFFERED: 0
CSI_ENDPOINT: unix:///csi-data/csi.sock
KUBE_NODE_NAME: (v1:spec.nodeName)
CSI_MODE: controller
X_CSI_PERSISTENCE_CONFIG: {"storage":"crd"}
X_CSI_BACKEND_CONFIG: {"volume_backend_name": "rbd", "volume_driver": "cinder.volume.drivers.rbd.RBDDriver", "rbd_user": "cinder", "rbd_pool": "cinder_volumes", "rbd_ceph_conf": "/etc/ceph/ceph.conf", "rbd_keyring_conf": "/etc/ceph/keyring"}
Mounts:
/csi-data from socket-dir (rw)
/dev from dev-dir (rw)
/etc/ceph from ceph-secrets (rw)
/etc/iscsi from iscsi-dir (rw)
/etc/localtime from localtime (rw)
/etc/lvm from lvm-dir (rw)
/etc/multipath from multipath-dir (rw)
/etc/multipath.conf from multipath-conf (rw)
/lib/modules from modules-dir (rw)
/run/udev from run-dir (rw)
/var/lock/lvm from lvm-lock (rw)
/var/run/secrets/kubernetes.io/serviceaccount from ember-csi-controller-sa-token-hb2zz (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
socket-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
iscsi-dir:
Type: HostPath (bare host directory volume)
Path: /etc/iscsi
HostPathType: Directory
run-dir:
Type: HostPath (bare host directory volume)
Path: /run/udev
HostPathType:
dev-dir:
Type: HostPath (bare host directory volume)
Path: /dev
HostPathType:
lvm-dir:
Type: HostPath (bare host directory volume)
Path: /etc/lvm
HostPathType: Directory
lvm-lock:
Type: HostPath (bare host directory volume)
Path: /var/lock/lvm
HostPathType:
multipath-dir:
Type: HostPath (bare host directory volume)
Path: /etc/multipath
HostPathType:
multipath-conf:
Type: HostPath (bare host directory volume)
Path: /etc/multipath.conf
HostPathType:
modules-dir:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
localtime:
Type: HostPath (bare host directory volume)
Path: /etc/localtime
HostPathType:
ceph-secrets:
Type: Secret (a volume populated by a Secret)
SecretName: ceph-secrets
Optional: false
ember-csi-controller-sa-token-hb2zz:
Type: Secret (a volume populated by a Secret)
SecretName: ember-csi-controller-sa-token-hb2zz
Optional: false
QoS Class: BestEffort
Node-Selectors: node-role.kubernetes.io/compute=true
Tolerations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m default-scheduler Successfully assigned ember-csi/csi-controller-0 to qe-shlao-ceph-openstack-node-1
Normal Created 3m kubelet, qe-shlao-ceph-openstack-node-1 Created container
Normal Started 3m kubelet, qe-shlao-ceph-openstack-node-1 Started container
Normal Pulled 3m kubelet, qe-shlao-ceph-openstack-node-1 Container image "quay.io/k8scsi/csi-provisioner:v0.3.0" already present on machine
Normal Pulled 2m (x2 over 3m) kubelet, qe-shlao-ceph-openstack-node-1 Container image "quay.io/k8scsi/csi-attacher:v0.3.0" already present on machine
Normal Created 2m (x2 over 3m) kubelet, qe-shlao-ceph-openstack-node-1 Created container
Normal Started 2m (x2 over 3m) kubelet, qe-shlao-ceph-openstack-node-1 Started container
Normal Pulling 1m (x3 over 3m) kubelet, qe-shlao-ceph-openstack-node-1 pulling image "akrog/ember-csi:master"
Normal Pulled 1m (x3 over 3m) kubelet, qe-shlao-ceph-openstack-node-1 Successfully pulled image "akrog/ember-csi:master"
Normal Created 1m (x3 over 3m) kubelet, qe-shlao-ceph-openstack-node-1 Created container
Normal Started 1m (x3 over 3m) kubelet, qe-shlao-ceph-openstack-node-1 Started container
Warning BackOff 1m (x3 over 2m) kubelet, qe-shlao-ceph-openstack-node-1 Back-off restarting failed container
Warning BackOff 1m kubelet, qe-shlao-ceph-openstack-node-1 Back-off restarting failed container
[root@qe-shlao-ceph-openstack-master-etcd-1 openshift]# oc logs pod/csi-controller-0 -c external-attacher
I0910 16:46:57.265147 1 main.go:74] Version: v0.3.0-1-g76ebff7
I0910 16:46:57.266518 1 connection.go:88] Connecting to /csi-data/csi.sock
I0910 16:46:57.266743 1 connection.go:115] Still trying, connection is CONNECTING
I0910 16:46:57.266920 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
I0910 16:46:58.267125 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
I0910 16:46:59.309251 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
I0910 16:47:00.485537 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
I0910 16:47:01.551533 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
I0910 16:47:02.526749 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
I0910 16:47:03.496711 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
I0910 16:47:04.571547 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
I0910 16:47:05.397918 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
I0910 16:47:06.260679 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
I0910 16:47:07.099535 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
I0910 16:47:08.020025 1 connection.go:115] Still trying, connection is TRANSIENT_FAILURE
[root@qe-shlao-ceph-openstack-master-etcd-1 openshift]# oc logs pod/csi-controller-0 -c csi-driver
X_CSI_SYSTEM_FILES not specified.
Starting Ember CSI v0.0.2 in controller only mode (cinderlib: v0.2.2, cinder: v13.0.0.0rc2.dev74, CSI spec: v0.2.0)
There are no automated tests of any kind, so everything has to be tested manually.
This is cumbersome, considering the number of different drivers supported by this plugin.
While launching cinderlib-csi in "node" mode, the following error shows up:
[kthyagar@kt-c7kb7 cinderlib-csi{master}]$ python cinderlib_csi/cinderlib_csi.py
Starting cinderlib CSI v0.0.2 in node only mode (cinderlib: v0.2.1, cinder: v11.1.1, CSI spec: v0.2.0)
Traceback (most recent call last):
File "cinderlib_csi/cinderlib_csi.py", line 1005, in
main()
File "cinderlib_csi/cinderlib_csi.py", line 982, in main
node_id=node_id)
TypeError: __init__() got multiple values for keyword argument 'node_id'
[kthyagar@kt-c7kb7 cinderlib-csi{master}]$ env | grep CSI
X_CSI_BACKEND_CONFIG={"volume_backend_name": "rbd", "volume_driver": "cinder.volume.drivers.rbd.RBDDriver", "rbd_user": "cinder", "rbd_pool": "cinder_volumes", "rbd_ceph_conf": "/etc/ceph/ceph.conf", "rbd_keyring_conf": "/etc/ceph/ceph.client.cinder.keyring"}
CSI_MODE=node
In my local environment, I have gotten past this error, only to encounter many other issues deep within cinderlib itself.
Please let me know if you need any further info to help debug this.
We should make the best use of CSI volume parameters.
We can use them to specify the pool in a multi-pool driver, set extra specs, and even set QoS settings.
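A sketch of how the CreateVolume parameters could be split up; the `pool` key and the `extra-`/`qos-` prefixes are invented conventions for illustration:

```python
def split_parameters(parameters):
    """Split StorageClass-style parameters into pool, extra specs and QoS."""
    pool = parameters.get('pool')
    extra_specs = {k[len('extra-'):]: v for k, v in parameters.items()
                   if k.startswith('extra-')}
    qos = {k[len('qos-'):]: v for k, v in parameters.items()
           if k.startswith('qos-')}
    return pool, extra_specs, qos

# e.g. {'pool': 'ssd', 'qos-total_iops_sec': '1000', 'extra-thin': 'true'}
```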
The backend storage credentials are currently listed in the CSI driver manifest; we should use ConfigMaps to store them securely and separate that data from the manifest itself.
Ember-CSI could support encryption using LUKS. Most of the plumbing is already there in OS-Brick, so we just need to add some code to Ember-CSI to expose this through the CSI spec.
iSCSI attach/detach speed is not optimal due to a big critical section in the os-brick code.
This section can be reduced using a combination of exclusive and shared locking on the same file lock.
The work should be done in upstream os-brick and then pulled in here once it is released, but if it takes a while to get merged we could consider carrying the patch downstream in the Ember-CSI containers until it does.
We are reporting the Ember-CSI version in the GetPluginInfo response under the vendor_version field, which is not good enough to know when an individual component has changed.
We could have the same Ember-CSI version but an updated cinder, os-brick, or cinderlib version, and the plugin would not be reporting this change to the CO.
For upgrades we'll need finer-grained version reporting from the plugin, as we'll need multiple plugin versions running simultaneously while we update/upgrade. According to the spec this is acceptable as long as the vendor_version is different:
All instances of the same version (see vendor_version of GetPluginInfoResponse) of the Plugin SHALL return the same set of capabilities, regardless of both: (a) where instances are deployed on the cluster as well as; (b) which RPCs an instance is serving.
One way to do this would be to concatenate the Ember-CSI version with the OpenStack release and the date.
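For example, something along these lines (EMBER_VERSION and OPENSTACK_RELEASE are stand-ins for wherever the real values come from):

```python
import datetime

EMBER_VERSION = '0.0.2'      # stand-in
OPENSTACK_RELEASE = 'rocky'  # stand-in

def vendor_version():
    # Different Ember/OpenStack/build combinations produce different strings.
    return '%s-%s-%s' % (EMBER_VERSION, OPENSTACK_RELEASE,
                         datetime.date.today().strftime('%Y%m%d'))
```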
We currently have stage and unstage calls for convenience, but we could remove them for a small performance gain, as there would be fewer gRPC calls.
Now that cinderlib supports rbd-nbd to attach RBD volumes we should include it in our containers, as it'd be the preferred connection tool for RBD volumes to avoid feature support mismatch between the kernel module and the Ceph cluster.
Right now we only support a single controller, deployed as a StatefulSet. It would be good to support multiple controller plugins to increase the reliability, and hopefully the performance, of the plugin.
We were doing soft-deletes when deleting a volume that has snapshots, but when we later deleted the snapshots the soft-deleted volume was not being deleted.
The reason was that we were not correctly setting the status of the volume to 'deleted': we were creating a new status attribute on the cinderlib Volume object instead of changing the OVO's status attribute. Since the OVO's attributes are the only ones that get serialized, we ended up saving the same status we had before instead of the 'deleted' one we wanted.
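The gist of the fix, as a sketch (the `_ovo` attribute reflects cinderlib's internals as described above, not a public API):

```python
def mark_deleted(vol):
    # Wrong: vol.status = 'deleted' creates a new attribute on the wrapper
    # object, so the serialized OVO keeps its old status.
    vol._ovo.status = 'deleted'  # right: change the field that is persisted
    vol.save()                   # persist through the persistence plugin
```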
Google's gRPC Python library doesn't work with eventlet out of the box.
Ember-CSI has a workaround for it for non-stream connections.
It would be best to create a proper fix for the grpc Python library to add eventlet support and remove the workaround from our code, as this would ensure that new versions of the library don't break us.
The plugin has code meant to handle the three different modes (Controller, Node, Controller & Node) via the CSI_MODE environment variable.
Only the joined Controller & Node functionality has been tested, so it's very likely that the other modes are not working as expected.
Some of the configuration parameter names within a backend configuration are unnecessarily long, for example volume_backend_name, volume_driver, and use_multipath_for_image_xfer.
It would be convenient if Ember could abstract the real cinderlib names and simplify them for users.
Also, the volume driver names are way too long, since they need to specify the whole driver namespace; it would be great if we could have aliases. A sketch of the idea follows.
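Every short name and alias below is hypothetical:

```python
ALIASES = {
    'name': 'volume_backend_name',
    'driver': 'volume_driver',
    'multipath': 'use_multipath_for_image_xfer',
}
DRIVER_ALIASES = {
    'rbd': 'cinder.volume.drivers.rbd.RBDDriver',
}

def expand_config(config):
    """Translate user-friendly keys/values into the real cinderlib names."""
    expanded = {ALIASES.get(k, k): v for k, v in config.items()}
    if 'volume_driver' in expanded:
        drv = expanded['volume_driver']
        expanded['volume_driver'] = DRIVER_ALIASES.get(drv, drv)
    return expanded
```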
Listing volumes when running with CSI v0.3 will raise an exception:
2019-03-05 13:02:50 ERROR root [req-140029023167384] Exception calling application: 'All' object has no attribute 'TOPOLOGIES': AttributeError: 'All' object has no attribute 'TOPOLOGIES'
2019-03-05 13:02:50.626 27782 ERROR root Traceback (most recent call last):
2019-03-05 13:02:50.626 27782 ERROR root File "/home/geguileo/code/reuse-cinder-drivers/ember/.tox/py27/lib/python2.7/site-packages/grpc/_server.py", line 385, in _call_behavior
2019-03-05 13:02:50.626 27782 ERROR root return behavior(argument, context), True
2019-03-05 13:02:50.626 27782 ERROR root File "/home/geguileo/code/reuse-cinder-drivers/ember/ember_csi/common.py", line 124, in dolog
2019-03-05 13:02:50.626 27782 ERROR root result = f(self, request, context)
2019-03-05 13:02:50.626 27782 ERROR root File "/home/geguileo/code/reuse-cinder-drivers/ember/ember_csi/common.py", line 219, in checker
2019-03-05 13:02:50.626 27782 ERROR root return f(self, request, context)
2019-03-05 13:02:50.626 27782 ERROR root File "/home/geguileo/code/reuse-cinder-drivers/ember/ember_csi/common.py", line 76, in wrapper
2019-03-05 13:02:50.626 27782 ERROR root return func(self, request, context)
2019-03-05 13:02:50.626 27782 ERROR root File "/home/geguileo/code/reuse-cinder-drivers/ember/ember_csi/base.py", line 352, in CreateVolume
2019-03-05 13:02:50.626 27782 ERROR root volume = self._convert_volume_type(vol)
2019-03-05 13:02:50.626 27782 ERROR root File "/home/geguileo/code/reuse-cinder-drivers/ember/ember_csi/v0_3_0/csi.py", line 83, in _convert_volume_type
2019-03-05 13:02:50.626 27782 ERROR root if self.TOPOLOGIES:
2019-03-05 13:02:50.626 27782 ERROR root AttributeError: 'All' object has no attribute 'TOPOLOGIES'
The latest container image (ea711db5e3da) is broken. E.g.:
$ docker run -it embercsi/ember-csi:latest /usr/bin/ember-csi
Traceback (most recent call last):
File "/usr/bin/ember-csi", line 6, in
from pkg_resources import load_entry_point
File "/usr/lib/python2.7/site-packages/pkg_resources/init.py", line 3126, in
@_call_aside
File "/usr/lib/python2.7/site-packages/pkg_resources/init.py", line 3110, in _call_aside
f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/pkg_resources/init.py", line 3139, in _initialize_master_working_set
working_set = WorkingSet._build_master()
File "/usr/lib/python2.7/site-packages/pkg_resources/init.py", line 583, in _build_master
return cls._build_from_requirements(requires)
File "/usr/lib/python2.7/site-packages/pkg_resources/init.py", line 596, in _build_from_requirements
dists = ws.resolve(reqs, Environment())
File "/usr/lib/python2.7/site-packages/pkg_resources/init.py", line 789, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (urllib3 1.24.1 (/usr/lib/python2.7/site-packages), Requirement.parse('urllib3<1.24,>=1.21.1'), set(['requests']))
The number of gRPC workers used by the server is hardcoded to 10; we should be able to change it.
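A sketch of the change; the X_CSI_GRPC_WORKERS variable name is an assumption following the existing X_CSI_* convention:

```python
import os
from concurrent import futures

import grpc

workers = int(os.environ.get('X_CSI_GRPC_WORKERS', '10'))  # default as today
server = grpc.server(futures.ThreadPoolExecutor(max_workers=workers))
```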
[root@kt-c7kb7 kirankt]# oc create -f pvc.yml
persistentvolumeclaim/ember-csi-pvc created
[root@kt-c7kb7 kirankt]# oc get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ember-csi-pvc Bound pvc-077174ec-ab36-11e8-a917-0017a4772406 1Gi RWO ember-csi-sc 6s
[root@kt-c7kb7 kirankt]# oc logs pod/ember-csi-node-z9qtr -c ember-csi-driver | less
X_CSI_SYSTEM_FILES not specified.
Starting Ember CSI v0.0.2 in node only mode (cinderlib: v0.2.2, cinder: v13.0.0.0rc2.dev46, CSI spec: v0.2.0)
Supported filesystems are: cramfs, minix, xfs, btrfs, ext2, ext3, ext4
Running as node
Debugging feature is ENABLED with ember_csi.rpdb and OFF. Toggle it with SIGUSR1.
Now serving on unix:///csi-data/csi.sock...
=> 2018-08-29 02:48:12.582640 GRPC [139733851126792]: GetPluginInfo without params
<= 2018-08-29 02:48:12.582699 GRPC in 0s [139733851126792]: GetPluginInfo returns
name: "io.ember-csi"
vendor_version: "0.0.2"
manifest {
key: "cinder-version"
value: "13.0.0.0rc2.dev46"
}
manifest {
key: "cinderlib-version"
value: "0.2.2"
}
manifest {
key: "mode"
value: "node"
}
manifest {
key: "persistence"
value: "CRDPersistence"
}
=> 2018-08-29 02:48:12.593167 GRPC [139733851127632]: NodeGetId without params
<= 2018-08-29 02:48:12.593220 GRPC in 0s [139733851127632]: NodeGetId returns
node_id: "10.19.139.83"
[root@kt-c7kb7 kirankt]# oc create -f app.yml
pod/my-csi-app created
[root@kt-c7kb7 kirankt]# oc logs pod/ember-csi-node-z9qtr -c ember-csi-driver-f
Error from server (BadRequest): container ember-csi-driver-f is not valid for pod ember-csi-node-z9qtr
[root@kt-c7kb7 kirankt]# oc logs pod/ember-csi-node-z9qtr -c ember-csi-driver -f
X_CSI_SYSTEM_FILES not specified.
Starting Ember CSI v0.0.2 in node only mode (cinderlib: v0.2.2, cinder: v13.0.0.0rc2.dev46, CSI spec: v0.2.0)
Supported filesystems are: cramfs, minix, xfs, btrfs, ext2, ext3, ext4
Running as node
Debugging feature is ENABLED with ember_csi.rpdb and OFF. Toggle it with SIGUSR1.
Now serving on unix:///csi-data/csi.sock...
=> 2018-08-29 02:48:12.582640 GRPC [139733851126792]: GetPluginInfo without params
<= 2018-08-29 02:48:12.582699 GRPC in 0s [139733851126792]: GetPluginInfo returns
name: "io.ember-csi"
vendor_version: "0.0.2"
manifest {
key: "cinder-version"
value: "13.0.0.0rc2.dev46"
}
manifest {
key: "cinderlib-version"
value: "0.2.2"
}
manifest {
key: "mode"
value: "node"
}
manifest {
key: "persistence"
value: "CRDPersistence"
}
=> 2018-08-29 02:48:12.593167 GRPC [139733851127632]: NodeGetId without params
<= 2018-08-29 02:48:12.593220 GRPC in 0s [139733851127632]: NodeGetId returns
node_id: "10.19.139.83"
=> 2018-08-29 02:49:04.294972 GRPC [139733851127632]: NodeGetCapabilities without params
<= 2018-08-29 02:49:04.295237 GRPC in 0s [139733851127632]: NodeGetCapabilities returns
capabilities {
rpc {
type: STAGE_UNSTAGE_VOLUME
}
}
=> 2018-08-29 02:49:04.299448 GRPC [139733814980688]: NodeStageVolume with params
volume_id: "93d59676-93bc-4b7a-95ee-93e4ccba0e67"
publish_info {
key: "connection_info"
value: "{\"connector\": {\"initiator\": \"iqn.1994-05.com.redhat:b2a7cfc30af\", \"ip\": \"10.19.139.83\", \"platform\": \"x86_64\", \"host\": \"kt-c7kb9.cloud.lab.eng.bos.redhat.com\", \"do_local_attach\": false, \"os_type\": \"linux2\", \"multipath\": false}, \"conn\": {\"driver_volume_type\": \"rbd\", \"data\": {\"secret_uuid\": null, \"volume_id\": \"93d59676-93bc-4b7a-95ee-93e4ccba0e67\", \"auth_username\": \"cinder\", \"secret_type\": \"ceph\", \"name\": \"cinder_volumes/volume-93d59676-93bc-4b7a-95ee-93e4ccba0e67\", \"discard\": true, \"keyring\": \"[client.cinder]\\n\\tkey = AQD2o5RalmhvMhAApYsRfGUfL1A1m0aXgQsaLw==\\n\", \"cluster_name\": \"ceph\", \"hosts\": [\"172.31.142.11\", \"172.31.142.12\", \"172.31.142.13\"], \"auth_enabled\": true, \"ports\": [\"6789\", \"6789\", \"6789\"]}}}"
}
staging_target_path: "/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/csi/pv/pvc-077174ec-ab36-11e8-a917-0017a4772406/globalmount"
volume_capability {
mount {
fs_type: "ext4"
}
access_mode {
mode: SINGLE_NODE_WRITER
}
}
volume_attributes {
key: "storage.kubernetes.io/csiProvisionerIdentity"
value: "1535510864943-8081-io.ember-csi"
}
!! 2018-08-29 02:49:04.831390 GRPC in 1s [139733814980688]: Unexpected exception on NodeStageVolume ()
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ember_csi/ember_csi.py", line 128, in dolog
result = f(self, request, context)
File "/usr/lib/python2.7/site-packages/ember_csi/ember_csi.py", line 169, in checker
return f(self, request, context)
File "/usr/lib/python2.7/site-packages/ember_csi/ember_csi.py", line 211, in wrapper
return func(self, request, context)
File "/usr/lib/python2.7/site-packages/ember_csi/ember_csi.py", line 858, in NodeStageVolume
conn.attach()
File "/usr/lib/python2.7/site-packages/cinderlib/objects.py", line 715, in attach
device = self.connector.connect_volume(self.conn_info['data'])
File "/usr/lib/python2.7/site-packages/nos_brick/__init__.py", line 32, in connect_volume
self._execute('which', 'rbd')
File "/usr/lib/python2.7/site-packages/os_brick/executor.py", line 52, in _execute
result = self.__execute(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/os_brick/privileged/rootwrap.py", line 143, in custom_execute
on_completion=on_completion, *cmd, **kwargs)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 391, in execute
env=env_variables)
File "/usr/lib/python2.7/site-packages/eventlet/green/subprocess.py", line 58, in __init__
subprocess_orig.Popen.__init__(self, args, 0, *argss, **kwds)
File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
=> 2018-08-29 02:49:05.402198 GRPC [139733813464912]: NodeGetCapabilities without params
<= 2018-08-29 02:49:05.402368 GRPC in 0s [139733813464912]: NodeGetCapabilities returns
capabilities {
rpc {
type: STAGE_UNSTAGE_VOLUME
}
}
=> 2018-08-29 02:49:05.408270 GRPC [139733813538896]: NodeStageVolume with params
volume_id: "93d59676-93bc-4b7a-95ee-93e4ccba0e67"
publish_info {
key: "connection_info"
value: "{\"connector\": {\"initiator\": \"iqn.1994-05.com.redhat:b2a7cfc30af\", \"ip\": \"10.19.139.83\", \"platform\": \"x86_64\", \"host\": \"kt-c7kb9.cloud.lab.eng.bos.redhat.com\", \"do_local_attach\": false, \"os_type\": \"linux2\", \"multipath\": false}, \"conn\": {\"driver_volume_type\": \"rbd\", \"data\": {\"secret_uuid\": null, \"volume_id\": \"93d59676-93bc-4b7a-95ee-93e4ccba0e67\", \"auth_username\": \"cinder\", \"secret_type\": \"ceph\", \"name\": \"cinder_volumes/volume-93d59676-93bc-4b7a-95ee-93e4ccba0e67\", \"discard\": true, \"keyring\": \"[client.cinder]\\n\\tkey = AQD2o5RalmhvMhAApYsRfGUfL1A1m0aXgQsaLw==\\n\", \"cluster_name\": \"ceph\", \"hosts\": [\"172.31.142.11\", \"172.31.142.12\", \"172.31.142.13\"], \"auth_enabled\": true, \"ports\": [\"6789\", \"6789\", \"6789\"]}}}"
}
staging_target_path: "/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/csi/pv/pvc-077174ec-ab36-11e8-a917-0017a4772406/globalmount"
volume_capability {
mount {
fs_type: "ext4"
}
access_mode {
mode: SINGLE_NODE_WRITER
}
}
volume_attributes {
key: "storage.kubernetes.io/csiProvisionerIdentity"
value: "1535510864943-8081-io.ember-csi"
}
!! 2018-08-29 02:49:05.939252 GRPC in 1s [139733813538896]: Unexpected exception on NodeStageVolume ()
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ember_csi/ember_csi.py", line 128, in dolog
result = f(self, request, context)
File "/usr/lib/python2.7/site-packages/ember_csi/ember_csi.py", line 169, in checker
return f(self, request, context)
File "/usr/lib/python2.7/site-packages/ember_csi/ember_csi.py", line 211, in wrapper
return func(self, request, context)
File "/usr/lib/python2.7/site-packages/ember_csi/ember_csi.py", line 858, in NodeStageVolume
conn.attach()
File "/usr/lib/python2.7/site-packages/cinderlib/objects.py", line 715, in attach
device = self.connector.connect_volume(self.conn_info['data'])
File "/usr/lib/python2.7/site-packages/nos_brick/__init__.py", line 32, in connect_volume
self._execute('which', 'rbd')
File "/usr/lib/python2.7/site-packages/os_brick/executor.py", line 52, in _execute
result = self.__execute(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/os_brick/privileged/rootwrap.py", line 143, in custom_execute
on_completion=on_completion, *cmd, **kwargs)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 391, in execute
env=env_variables)
File "/usr/lib/python2.7/site-packages/eventlet/green/subprocess.py", line 58, in __init__
subprocess_orig.Popen.__init__(self, args, 0, *argss, **kwds)
File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
=> 2018-08-29 02:49:07.009324 GRPC [139733814981168]: NodeGetCapabilities without params
<= 2018-08-29 02:49:07.009471 GRPC in 0s [139733814981168]: NodeGetCapabilities returns
capabilities {
rpc {
type: STAGE_UNSTAGE_VOLUME
}
}
=> 2018-08-29 02:49:07.012987 GRPC [139733814981288]: NodeStageVolume with params
The README file, being the only documentation of the project, is really bloated.
It should be simplified, and there should be proper readthedocs documentation.
We should use the Kubernetes end-to-end storage tests to verify that the plugin works, and make them run in our CI.
Hi,
I am trying to deploy the Ember CSI driver as an all-in-one deployment largely based on the examples/kubevirt/csi.yml template. The driver seems to start up fine but the PVCs never get mounted. Peeking into the driver logs, I do not see any stage/unstage activity. Here is what I see:
[root@kt-c7kb7 openshift]# oc logs pod/ember-csi-aio-pod -c ember-driver -f
X_CSI_SYSTEM_FILES not specified.
Starting Ember CSI v0.0.2 (cinderlib: v0.2.2, cinder: v13.0.0.0rc2.dev59, CSI spec: v0.2.0)
Supported filesystems are: cramfs, minix, xfs, btrfs, ext2, ext3, ext4
Running as all with backend RBDDriver v1.2.0
Debugging feature is ENABLED with ember_csi.rpdb and OFF. Toggle it with SIGUSR1.
Now serving on unix:///csi-data/csi.sock...
=> 2018-08-31 02:02:10.670840 GRPC [140641126232144]: GetPluginInfo without params
<= 2018-08-31 02:02:10.670919 GRPC in 0s [140641126232144]: GetPluginInfo returns
name: "io.ember-csi"
vendor_version: "0.0.2"
manifest {
key: "cinder-driver"
value: "RBDDriver"
}
manifest {
key: "cinder-driver-supported"
value: "True"
}
manifest {
key: "cinder-driver-version"
value: "1.2.0"
}
manifest {
key: "cinder-version"
value: "13.0.0.0rc2.dev59"
}
manifest {
key: "cinderlib-version"
value: "0.2.2"
}
manifest {
key: "mode"
value: "all"
}
manifest {
key: "persistence"
value: "CRDPersistence"
}
=> 2018-08-31 02:02:10.680554 GRPC [140641126244552]: NodeGetId without params
<= 2018-08-31 02:02:10.680600 GRPC in 0s [140641126244552]: NodeGetId returns
node_id: "kt-c7kb9.cloud.lab.eng.bos.redhat.com"
=> 2018-08-31 02:02:10.879080 GRPC [140641126244792]: GetPluginInfo without params
<= 2018-08-31 02:02:10.879124 GRPC in 0s [140641126244792]: GetPluginInfo returns
name: "io.ember-csi"
vendor_version: "0.0.2"
manifest {
key: "cinder-driver"
value: "RBDDriver"
}
manifest {
key: "cinder-driver-supported"
value: "True"
}
manifest {
key: "cinder-driver-version"
value: "1.2.0"
}
manifest {
key: "cinder-version"
value: "13.0.0.0rc2.dev59"
}
manifest {
key: "cinderlib-version"
value: "0.2.2"
}
manifest {
key: "mode"
value: "all"
}
manifest {
key: "persistence"
value: "CRDPersistence"
}
=> 2018-08-31 02:02:10.881839 GRPC [140641126244912]: Probe without params
<= 2018-08-31 02:02:10.881926 GRPC in 0s [140641126244912]: Probe returns nothing
=> 2018-08-31 02:02:10.884483 GRPC [140641126245032]: GetPluginCapabilities without params
<= 2018-08-31 02:02:10.884548 GRPC in 0s [140641126245032]: GetPluginCapabilities returns
capabilities {
service {
type: CONTROLLER_SERVICE
}
}
=> 2018-08-31 02:02:10.886930 GRPC [140641126245152]: ControllerGetCapabilities without params
<= 2018-08-31 02:02:10.887150 GRPC in 0s [140641126245152]: ControllerGetCapabilities returns
capabilities {
rpc {
type: CREATE_DELETE_VOLUME
}
}
capabilities {
rpc {
type: PUBLISH_UNPUBLISH_VOLUME
}
}
capabilities {
rpc {
type: LIST_VOLUMES
}
}
capabilities {
rpc {
type: GET_CAPACITY
}
}
=> 2018-08-31 02:02:16.133739 GRPC [140641126245272]: GetPluginCapabilities without params
<= 2018-08-31 02:02:16.133782 GRPC in 0s [140641126245272]: GetPluginCapabilities returns
capabilities {
service {
type: CONTROLLER_SERVICE
}
}
=> 2018-08-31 02:02:16.138038 GRPC [140641126246352]: ControllerGetCapabilities without params
<= 2018-08-31 02:02:16.138178 GRPC in 0s [140641126246352]: ControllerGetCapabilities returns
capabilities {
rpc {
type: CREATE_DELETE_VOLUME
}
}
capabilities {
rpc {
type: PUBLISH_UNPUBLISH_VOLUME
}
}
capabilities {
rpc {
type: LIST_VOLUMES
}
}
capabilities {
rpc {
type: GET_CAPACITY
}
}
=> 2018-08-31 02:02:16.142287 GRPC [140641126116360]: GetPluginInfo without params
<= 2018-08-31 02:02:16.142336 GRPC in 0s [140641126116360]: GetPluginInfo returns
name: "io.ember-csi"
vendor_version: "0.0.2"
manifest {
key: "cinder-driver"
value: "RBDDriver"
}
manifest {
key: "cinder-driver-supported"
value: "True"
}
manifest {
key: "cinder-driver-version"
value: "1.2.0"
}
manifest {
key: "cinder-version"
value: "13.0.0.0rc2.dev59"
}
manifest {
key: "cinderlib-version"
value: "0.2.2"
}
manifest {
key: "mode"
value: "all"
}
manifest {
key: "persistence"
value: "CRDPersistence"
}
=> 2018-08-31 02:02:16.147523 GRPC [140641126116960]: CreateVolume with params
name: "pvc-e2535f29acc111e8"
capacity_range {
required_bytes: 1073741824
}
volume_capabilities {
mount {
}
access_mode {
mode: SINGLE_NODE_WRITER
}
}
creating volume
<= 2018-08-31 02:02:16.255768 GRPC in 0s [140641126116960]: CreateVolume returns
volume {
capacity_bytes: 1073741824
id: "d228f642-aec3-4af0-a0a5-b392fd91a9e2"
}
=> 2018-08-31 02:02:30.073385 GRPC [140641126116120]: ControllerPublishVolume with params
volume_id: "d228f642-aec3-4af0-a0a5-b392fd91a9e2"
node_id: "kt-c7kb9.cloud.lab.eng.bos.redhat.com"
volume_capability {
mount {
fs_type: "ext4"
}
access_mode {
mode: SINGLE_NODE_WRITER
}
}
volume_attributes {
key: "storage.kubernetes.io/csiProvisionerIdentity"
value: "1535680902823-8081-io.ember-csi.aio"
}
<= 2018-08-31 02:02:30.914675 GRPC in 1s [140641126116120]: ControllerPublishVolume returns
publish_info {
key: "connection_info"
value: "{\"connector\": {\"initiator\": \"iqn.1994-05.com.redhat:b2a7cfc30af\", \"ip\": \"10.19.139.83\", \"platform\": \"x86_64\", \"host\": \"kt-c7kb9.cloud.lab.eng.bos.redhat.com\", \"do_local_attach\": false, \"os_type\": \"linux2\", \"multipath\": false}, \"conn\": {\"driver_volume_type\": \"rbd\", \"data\": {\"secret_uuid\": null, \"volume_id\": \"d228f642-aec3-4af0-a0a5-b392fd91a9e2\", \"auth_username\": \"cinder\", \"secret_type\": \"ceph\", \"name\": \"cinder_volumes/volume-d228f642-aec3-4af0-a0a5-b392fd91a9e2\", \"discard\": true, \"keyring\": \"[client.cinder]\\n\\tkey = AQD2o5RalmhvMhAApYsRfGUfL1A1m0aXgQsaLw==\\n\", \"cluster_name\": \"ceph\", \"hosts\": [\"172.31.142.11\", \"172.31.142.12\", \"172.31.142.13\"], \"auth_enabled\": true, \"ports\": [\"6789\", \"6789\", \"6789\"]}}}"
}
^C
[root@kt-c7kb7 openshift]# oc get all
NAME READY STATUS RESTARTS AGE
pod/ember-csi-aio-pod 4/4 Running 0 3m
pod/my-csi-app 0/1 ContainerCreating 0 2m
We can now publish a volume for RW and then publish it again for RO, which should not be allowed, as they are incompatible.
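A sketch of the missing check (the caller would look up the volume's existing publication; only the comparison is shown):

```python
import grpc

def check_publish_compatible(request, context, existing_readonly):
    """Abort if the volume is already published with a conflicting mode.

    existing_readonly is None when the volume isn't published yet, else
    the readonly flag of the current publication.
    """
    if existing_readonly is not None and existing_readonly != request.readonly:
        context.abort(grpc.StatusCode.ALREADY_EXISTS,
                      'Volume already published with an incompatible mode')
```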
In commit 5021ca5 we added more sensible directory defaults for Ember-CSI, as it'd be used in containers, but the way it was done doesn't support changing them just by changing the state_path parameter.
There is no document on how to contribute to the project, not even guidelines on the preferred commit message format.
Ember-CSI volume listing doesn't work because we are not using the right gRPC types when returning the data.
When running list-volumes using CSC we see:
$ kubectl exec -c csc csi-controller-0 csc controller list-volumes
Exception calling application: Parameter to MergeFrom() must be instance of same class: expected csi.v1.ListVolumesResponse.Entry got csi.v1.CreateVolumeResponse.
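Based on that error, the response has to be built with the right nested type; a sketch, where `types` stands for the generated CSI v1 protobuf module:

```python
def list_volumes_response(volumes, next_token=''):
    # Each entry must be a ListVolumesResponse.Entry wrapping a Volume,
    # not a CreateVolumeResponse.
    entries = [types.ListVolumesResponse.Entry(volume=v) for v in volumes]
    return types.ListVolumesResponse(entries=entries, next_token=next_token)
```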
The Kubernetes example in the repository deploys CSI nodes in such a way that a second attachment of the PV formats the volume again, losing the data.
This happens because the lsblk command doesn't return the filesystem of the block device on the second attach, so the system thinks it's empty and formats it. lsblk uses udev data to report the filesystem type, so we need to mount /dev/udev into the container for it to be able to report it.
We should support multi-attach for block volumes even if we cannot support it for mount volumes.
[ This is not a duplicate of #100 ]
There is not enough randomization in the request ID generation, so we may end up seeing the same request ID in different requests within a short period of time.
This happens even if we send requests serially.
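A sketch of a collision-resistant alternative, deriving the IDs from uuid4:

```python
import uuid

def new_request_id():
    # 122 bits of randomness; collisions in a short window are negligible.
    return 'req-%s' % uuid.uuid4()
```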
When we enable the probe by passing "enable_probe": true in X_CSI_EMBER_CONFIG, the gRPC Probe call failure ratio skyrockets.
This seems to be caused by a collision on the name of the key-value CRD instance used to check Kubernetes access, as all nodes for the same backend use the same name.
If the CSI plugin is deployed on a clean system it works fine, but if there is another app that creates CRDs (e.g. KubeVirt), deploying the CSI driver after the fact causes crashes:
The log in the CSI driver container is:
Starting Ember CSI v0.0.2 in node only mode (cinderlib: v0.2.2.dev0, cinder: v11.1.1, CSI spec: v0.2.0)
Traceback (most recent call last):
File "/usr/bin/ember-csi", line 11, in <module>
load_entry_point('ember-csi', 'console_scripts', 'ember-csi')()
File "/csi/ember_csi/ember_csi.py", line 1073, in main
node_id=node_id)
File "/csi/ember_csi/ember_csi.py", line 736, in __init__
**cinderlib_config)
File "/cinderlib/cinderlib/cinderlib.py", line 155, in global_setup
cls.set_persistence(persistence_config)
File "/cinderlib/cinderlib/cinderlib.py", line 133, in set_persistence
cls.persistence = persistence.setup(persistence_config)
File "/cinderlib/cinderlib/persistence/__init__.py", line 83, in setup
invoke_kwds=config,
File "/usr/lib/python2.7/site-packages/stevedore/driver.py", line 61, in __init__
warn_on_missing_entrypoint=warn_on_missing_entrypoint
File "/usr/lib/python2.7/site-packages/stevedore/named.py", line 81, in __init__
verify_requirements)
File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 194, in _load_plugins
self._on_load_failure_callback(self, ep, err)
File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 186, in _load_plugins
verify_requirements,
File "/usr/lib/python2.7/site-packages/stevedore/named.py", line 158, in _load_one_plugin
verify_requirements,
File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 218, in _load_one_plugin
obj = plugin(*invoke_args, **invoke_kwds)
File "/csi/ember_csi/cl_crd.py", line 365, in __init__
CRD.ensure_crds_exist()
File "/csi/ember_csi/cl_crd.py", line 85, in ensure_crds_exist
crds = K8S.ext_api.list_custom_resource_definition().to_dict()['items']
File "/usr/lib/python2.7/site-packages/kubernetes/client/apis/apiextensions_v1beta1_api.py", line 496, in list_custom_resource_definition
(data) = self.list_custom_resource_definition_with_http_info(**kwargs)
File "/usr/lib/python2.7/site-packages/kubernetes/client/apis/apiextensions_v1beta1_api.py", line 593, in list_custom_resource_definition_with_http_info
collection_formats=collection_formats)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 321, in call_api
_return_http_data_only, collection_formats, _preload_content, _request_timeout)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 163, in __call_api
return_data = self.deserialize(response_data, response_type)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 236, in deserialize
return self.__deserialize(data, response_type)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 276, in __deserialize
return self.__deserialize_model(data, klass)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 620, in __deserialize_model
kwargs[attr] = self.__deserialize(value, attr_type)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 254, in __deserialize
for sub_data in data]
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 276, in __deserialize
return self.__deserialize_model(data, klass)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 620, in __deserialize_model
kwargs[attr] = self.__deserialize(value, attr_type)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 276, in __deserialize
return self.__deserialize_model(data, klass)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 620, in __deserialize_model
kwargs[attr] = self.__deserialize(value, attr_type)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 276, in __deserialize
return self.__deserialize_model(data, klass)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 620, in __deserialize_model
kwargs[attr] = self.__deserialize(value, attr_type)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 276, in __deserialize
return self.__deserialize_model(data, klass)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 620, in __deserialize_model
kwargs[attr] = self.__deserialize(value, attr_type)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 259, in __deserialize
for k, v in iteritems(data)}
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 259, in <dictcomp>
for k, v in iteritems(data)}
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 276, in __deserialize
return self.__deserialize_model(data, klass)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 620, in __deserialize_model
kwargs[attr] = self.__deserialize(value, attr_type)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 259, in __deserialize
for k, v in iteritems(data)}
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 259, in <dictcomp>
for k, v in iteritems(data)}
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 276, in __deserialize
return self.__deserialize_model(data, klass)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 620, in __deserialize_model
kwargs[attr] = self.__deserialize(value, attr_type)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 259, in __deserialize
for k, v in iteritems(data)}
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 259, in <dictcomp>
for k, v in iteritems(data)}
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 276, in __deserialize
return self.__deserialize_model(data, klass)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 620, in __deserialize_model
kwargs[attr] = self.__deserialize(value, attr_type)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 259, in __deserialize
for k, v in iteritems(data)}
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 259, in <dictcomp>
for k, v in iteritems(data)}
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 276, in __deserialize
return self.__deserialize_model(data, klass)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 620, in __deserialize_model
kwargs[attr] = self.__deserialize(value, attr_type)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 276, in __deserialize
return self.__deserialize_model(data, klass)
File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 622, in __deserialize_model
instance = klass(**kwargs)
File "/usr/lib/python2.7/site-packages/kubernetes/client/models/v1beta1_json_schema_props_or_array.py", line 52, in __init__
self.json_schemas = json_schemas
File "/usr/lib/python2.7/site-packages/kubernetes/client/models/v1beta1_json_schema_props_or_array.py", line 74, in json_schemas
raise ValueError("Invalid value for `json_schemas`, must not be `None`")
ValueError: Invalid value for `json_schemas`, must not be `None`
Traceback (most recent call last):
File "/usr/lib64/python2.7/multiprocessing/util.py", line 268, in _run_finalizers
finalizer()
File "/usr/lib64/python2.7/multiprocessing/util.py", line 201, in __call__
res = self._callback(*self._args, **self._kwargs)
File "/usr/lib64/python2.7/multiprocessing/pool.py", line 482, in _terminate_pool
cls._help_stuff_finish(inqueue, task_handler, len(pool))
File "/usr/lib64/python2.7/multiprocessing/pool.py", line 725, in _help_stuff_finish
inqueue.not_empty.acquire()
AttributeError: 'Queue' object has no attribute 'not_empty'
Steps to reproduce:
oc apply -f https://github.com/kubevirt/kubevirt/releases/download/v0.7.0/kubevirt.yaml
oc apply -f csi.yml
oc logs -f pod/csi-node-6rxqb -c csi-driver
We are returning an incorrect value for snapshot sizes: we return the volume size, which in general will be greater than the actual snapshot size.
Since the CSI spec allows us to return an undefined value, we should do so, as we can't tell the real size via cinderlib.
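A sketch of what that would look like; `types` stands for the generated CSI protobuf module, and the snapshot attributes mirror cinderlib's:

```python
def snapshot_to_csi(snap):
    # In the protobufs, leaving size_bytes at its default (0) reports the
    # size as undefined instead of lying with the volume size.
    return types.Snapshot(
        snapshot_id=snap.id,
        source_volume_id=snap.volume_id,
        ready_to_use=True)
```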
Ember-CSI is set up to use CSI spec v1 and the snapshotter sidecar has been deployed. We can create snapshots, but when we try to create a volume from a snapshot we get the following error in the logs:
2019-01-29 13:02:13 INFO ember_csi.common [req-140230550950928] => GRPC CreateVolume pvc-d8750d67-23c5-11e9-a1c2-5254000952e0
2019-01-29 13:02:13 ERROR ember_csi.common [req-140230550950928] !! GRPC CreateVolume failed in 0s with Unexpected exception (Protocol message Volume has no "volume_content_source" field.)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ember_csi/common.py", line 123, in dolog
result = f(self, request, context)
File "/usr/lib/python2.7/site-packages/ember_csi/common.py", line 218, in checker
return f(self, request, context)
File "/usr/lib/python2.7/site-packages/ember_csi/common.py", line 75, in wrapper
return func(self, request, context)
File "/usr/lib/python2.7/site-packages/ember_csi/base.py", line 351, in CreateVolume
return self._convert_volume_type(vol)
File "/usr/lib/python2.7/site-packages/ember_csi/v1_0_0/csi.py", line 114, in _convert_volume_type
volume = types.Volume(**parameters)
ValueError: Protocol message Volume has no "volume_content_source" field.
2019-01-29 13:02:13 ERROR root [req-140230550950928] Exception calling application: Protocol message Volume has no "volume_content_source" field.: ValueError: Protocol message Volume has no "volume_content_source" field.
2019-01-29 13:02:13.977 1 ERROR root Traceback (most recent call last):
2019-01-29 13:02:13.977 1 ERROR root File "/usr/lib64/python2.7/site-packages/grpc/_server.py", line 385, in _call_behavior
2019-01-29 13:02:13.977 1 ERROR root return behavior(argument, context), True
2019-01-29 13:02:13.977 1 ERROR root File "/usr/lib/python2.7/site-packages/ember_csi/common.py", line 159, in wrapper
2019-01-29 13:02:13.977 1 ERROR root return f(*args, **kwargs)
2019-01-29 13:02:13.977 1 ERROR root File "/usr/lib/python2.7/site-packages/ember_csi/common.py", line 123, in dolog
2019-01-29 13:02:13.977 1 ERROR root result = f(self, request, context)
2019-01-29 13:02:13.977 1 ERROR root File "/usr/lib/python2.7/site-packages/ember_csi/common.py", line 218, in checker
2019-01-29 13:02:13.977 1 ERROR root return f(self, request, context)
2019-01-29 13:02:13.977 1 ERROR root File "/usr/lib/python2.7/site-packages/ember_csi/common.py", line 75, in wrapper
2019-01-29 13:02:13.977 1 ERROR root return func(self, request, context)
2019-01-29 13:02:13.977 1 ERROR root File "/usr/lib/python2.7/site-packages/ember_csi/base.py", line 351, in CreateVolume
2019-01-29 13:02:13.977 1 ERROR root return self._convert_volume_type(vol)
2019-01-29 13:02:13.977 1 ERROR root File "/usr/lib/python2.7/site-packages/ember_csi/v1_0_0/csi.py", line 114, in _convert_volume_type
2019-01-29 13:02:13.977 1 ERROR root volume = types.Volume(**parameters)
2019-01-29 13:02:13.977 1 ERROR root ValueError: Protocol message Volume has no "volume_content_source" field.
2019-01-29 13:02:13.977 1 ERROR root
If we deploy Ember-CSI with CSI spec v1 and the snapshotter sidecar, the sidecar keeps calling the Probe method and isn't able to start successfully.
This is caused by a 1-second timeout on the call, as explained in snapshotter issue 89; we have proposed a PR to resolve it, but in the meantime we should be able to disable the probe via an environment variable.
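A sketch of the interim knob, reading the existing "enable_probe" key from X_CSI_EMBER_CONFIG (whether this exact key and default match the real code is an assumption):

```python
import json
import os

ember_config = json.loads(os.environ.get('X_CSI_EMBER_CONFIG', '{}'))
# When disabled, Probe can return success without touching the backend.
PROBE_ENABLED = bool(ember_config.get('enable_probe', False))
```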
The container only has cramfs and minix filesystem support.
We should add packages to support other filesystems, like xfs, ext3, ext4, btrfs, etc.
Otherwise the new possibility of provisioning filesystems is mostly useless.
Our CI only runs csi-sanity against LVM; it would be beneficial for the project if other storage solutions could be hooked into the CI so that patches are tested against different storage systems.
CSI sanity versions supporting snapshots expect the CSI plugin to support deleting a volume that has snapshots, but in our plugin this will depend on the Cinder implementation. This means that in some cases we'll fail the delete volume operation.
The CSI spec is not clear on this, as mentioned in container-storage-interface/spec#346, but we'll assume that csi-sanity is behaving in the intended manner until this is clearly defined on the CSI spec.
Since we changed to using oslo.log for logging we are no longer showing the CSI and Cinder versions.
Ember-CSI only supports the split controller/node architecture. It would be interesting if it could also support an all-in-one architecture.
In Kubernetes, with this architecture we could remove the controller publish and unpublish calls, which should make things slightly faster since there are fewer gRPC calls.