ironcore-dev / ironcore-csi-driver
Kubernetes Container Storage Interface (CSI) Driver for IronCore
Home Page: https://github.com/ironcore-dev
License: Apache License 2.0
Remove unimplemented sidecar containers (resizer, snapshotter)
Remove extra environment variables from the controller and node YAML
Currently onmetal-csi does not implement the resizer and snapshotter.
The current CSI driver does not have a liveness probe, which is necessary for checking the health of the driver.
Implement a CSI liveness probe to check the health of the driver and ensure that it is functioning correctly.
Without a liveness probe, it is difficult to determine the health of the CSI driver and ensure that it is functioning correctly. This can lead to unreliable operation.
After creating a CSI volume, a volume attachment needs to be created for the machine.
Implement the necessary logic in the ControllerPublishVolume API of the CSI driver.
Expected:
Before a volume is mounted, the volume attachment is created.
After a volume is unmounted, the volume attachment is removed.
Add suite tests for the Node driver covering the correct behaviour of:
NodeStageVolume
NodePublishVolume
NodeUnstageVolume
NodeUnpublishVolume
NodeGetInfo
NodeGetCapabilities
To allow proper local testing, the disk mounting and formatting part needs to be mocked. One possible solution is to have a Mount interface like the one described here: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/master/pkg/driver/mount.go
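A minimal sketch of such an interface, using only the standard library. The method set (`FormatAndMount`, `Unmount`, `IsMounted`) and the `FakeMounter` type are assumptions made for illustration; they are not copied from the linked aws-ebs-csi-driver file or from this driver.

```go
package main

import (
	"errors"
	"fmt"
)

// Mounter abstracts the disk operations the node service needs.
// The method set is hypothetical and deliberately small.
type Mounter interface {
	FormatAndMount(source, target, fsType string) error
	Unmount(target string) error
	IsMounted(target string) (bool, error)
}

// FakeMounter is an in-memory Mounter for tests; it never touches the host.
type FakeMounter struct {
	mounts map[string]string // target path -> source device
}

// NewFakeMounter returns a FakeMounter with an empty mount table.
func NewFakeMounter() *FakeMounter {
	return &FakeMounter{mounts: map[string]string{}}
}

// FormatAndMount only records the mount; nothing is formatted.
func (f *FakeMounter) FormatAndMount(source, target, fsType string) error {
	if source == "" || target == "" {
		return errors.New("source and target must be set")
	}
	f.mounts[target] = source
	return nil
}

// Unmount forgets the mount; unmounting an unknown target is a no-op.
func (f *FakeMounter) Unmount(target string) error {
	delete(f.mounts, target)
	return nil
}

// IsMounted reports whether target was recorded by FormatAndMount.
func (f *FakeMounter) IsMounted(target string) (bool, error) {
	_, ok := f.mounts[target]
	return ok, nil
}

// Compile-time check that FakeMounter satisfies Mounter.
var _ Mounter = (*FakeMounter)(nil)

func main() {
	var m Mounter = NewFakeMounter()
	if err := m.FormatAndMount("/dev/vdb", "/staging/vol-1", "ext4"); err != nil {
		fmt.Println("mount failed:", err)
		return
	}
	mounted, _ := m.IsMounted("/staging/vol-1")
	fmt.Println("mounted:", mounted)
}
```

The node service would then accept a `Mounter` at construction time, so that suite tests can pass the fake while the real binary passes an implementation backed by the host.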
Currently, in our external CSI driver, the CreateVolume request is completed without ensuring that the Volume reaches the Ready state. For better reliability and to prevent downstream issues, the volume must be in the Ready state before the request completes.
Polling Mechanism: Implement a timeout-based wait.Polling after triggering the volume creation. This will continuously check if the volume has reached the Ready state.
Error Handling: If the Volume does not reach the Ready state within the defined timeout, return an error indicating that volume creation has failed due to readiness timeout.
The volume is confirmed to be usable (Ready) before the request completes.
Callers get a clear error when a volume fails to reach the Ready state, enabling easier debugging and handling on their side.
After a CreateVolume request is made, the request should only complete after the volume reaches the Ready state.
If the volume does not reach the Ready state within the specified timeout, an appropriate error should be returned.
Currently we are using storage.onmetal.de as the field owner base. I would recommend using csi.onmetal.de instead, as it correctly indicates that the CSI driver owns the resource.
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-reuse_tool-1
Explanation: Does README mention REUSE? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
A provision is needed to configure the namespace for the csi-driver (using a ConfigMap).
The CSI driver should use a node annotation to select the correct machine and namespace (required for volume attachment).
Note: this is a temporary provision needed for the CSI driver implementation and testing.
In order to ensure a clean restart we need to ensure that the CSI node driver is not running into a conflict when finding an old CSI socket on disk. As the node driver runs in privileged mode, this might lead to unexpected startup behaviour. We should investigate how other CSI drivers are dealing with this issue.
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-reuse_tool-4
Explanation: Is it compliant with REUSE rules? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-vulnerability_alerts-1
Explanation: Are vulnerability alerts enabled? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
Following the recent PR https://github.com/onmetal/onmetal-csi-driver/pull/169, the documentation might need a review.
Please review and update the documentation if required.
Currently we are using a KubeHelper module to instantiate the Kubernetes cluster client for the CSI driver. Internally it uses the InCluster client method. As a result, we cannot run multiple CSI drivers on one runtime cluster to host multiple control planes.
Please add two flags for configuring the cluster/onmetal clients when starting the CSI driver. I would suggest the following two flags for that:
--kubeconfig=/PATH/TO/KUBECONFIG # path to the cluster kubeconfig, if not set use the `inCluster` method
--onmetal-kubeconfig=/PATH/TO/KUBECONFIG # kubeconfig for accessing the onmetal cluster
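A minimal sketch of parsing the two suggested flags with the standard `flag` package. The flag names come from the issue; the `options` type, `parseOptions` and `useInCluster` are hypothetical helpers, and building the actual clients from these paths is left out.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the kubeconfig paths given on the command line.
type options struct {
	Kubeconfig        string
	OnmetalKubeconfig string
}

// parseOptions parses args (without the program name) into options.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("csi-driver", flag.ContinueOnError)
	fs.StringVar(&o.Kubeconfig, "kubeconfig", "",
		"path to the cluster kubeconfig; if empty, the in-cluster config is used")
	fs.StringVar(&o.OnmetalKubeconfig, "onmetal-kubeconfig", "",
		"kubeconfig for accessing the onmetal cluster")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	return o, nil
}

// useInCluster reports whether the driver should fall back to the
// in-cluster configuration because no --kubeconfig was given.
func (o options) useInCluster() bool {
	return o.Kubeconfig == ""
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("in-cluster: %v, onmetal kubeconfig: %q\n",
		o.useInCluster(), o.OnmetalKubeconfig)
}
```

Parsing into a struct at the entrypoint keeps client construction out of the driver packages, which is what allows several driver instances to target different control planes.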
Depends on #6
After the volume attachment has been created successfully, attach the volume to the Pod.
Expected:
The CSI driver should be able to mount the volume to the Pod.
The CSI driver should be able to unmount the volume from the Pod.
We need to consider adding some binaries to the distroless container the same way it has been done in the GCP CSI driver: https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/master/Dockerfile#L27-L78
When resizing a PVC, the resize call against the API for changing the Volume size is made via the CSI controller. However, resizing the mounted Volume on the Node is not working. It looks like the corresponding ResizeFS is not working in the Node controller.
Install the CSI driver into a cluster which has an expandable StorageClass. Change the size of the PVC and observe the changed size of the Volume itself. When exec-ing into the Pod using the Volume, the old size is still present. Also, the PVC volume size is unchanged.
Both the PVC size and the size of the mounted Volume on the host/Pod should be changed to the new size.
Currently we have storageClass parameters named storage_pool and storage_class_name.
https://github.com/onmetal/onmetal-csi-driver/blob/main/pkg/driver/controller.go#L51
To avoid confusion, volume_pool and volume_class would fit the onmetal conventions better than storage_pool and storage_class_name respectively.
Make the corresponding changes in all related files.
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-reuse_tool-2
Explanation: Does it have LICENSES directory with licenses? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
Renaming tasks after project move:
ironcore
ironcore naming scheme
github.com/ironcore-dev/controller-utils is used and all imports are changed accordingly
github.com/ironcore-dev/ironcore is used and all imports are changed accordingly
ironcore is used in the Makefile
ironcore is used in all docs and README files
Implement a custom logger for the CSI driver to have well-defined logs.
Logs should be categorised by level, e.g. info, debug, error.
Currently the project relies on golangci-lint being installed on the host in order to run the linting checks on the code base. To make the project more self-contained we should add a Makefile directive like here: https://github.com/ironcore-dev/ironcore/blob/main/Makefile#L485-L488, install the golangci-lint tool into the tools bin folder, and run make lint using that binary.
Currently we announce a "dev" version in the GetPluginInfo method. This should be changed to announce a proper version in the CSI driver. We should use an approach similar to https://github.com/onmetal/kubectl-onmetal/blob/main/version/version.go to determine the version.
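A minimal sketch of one way to determine the version, assuming it is injected at build time and otherwise read from the Go build info. This is not taken from the linked version.go; the variable `version` and the function `driverVersion` are hypothetical names.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// version is meant to be overridden at build time, for example:
//
//	go build -ldflags "-X main.version=v0.1.0"
var version = ""

// driverVersion returns the injected version, falls back to the module
// version recorded by the Go toolchain, and finally to "dev".
func driverVersion() string {
	if version != "" {
		return version
	}
	info, ok := debug.ReadBuildInfo()
	if ok && info.Main.Version != "" && info.Main.Version != "(devel)" {
		return info.Main.Version
	}
	return "dev"
}

func main() {
	// In the driver this value would be returned as the vendor version
	// in the GetPluginInfo response.
	fmt.Println("vendor version:", driverVersion())
}
```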
Currently, the OnMetal CSI (Container Storage Interface) Kubernetes driver does not support volume expansion. As users' storage needs grow, they are unable to easily resize their existing persistent volumes without manual intervention. This can lead to downtimes and increased operational complexity.
Describe the solution you'd like
I would like to request the implementation of volume expansion support in the OnMetal CSI Kubernetes driver, following the Kubernetes documentation on volume expansion. This feature would allow users to resize their existing persistent volumes without the need to create new volumes and migrate data manually.
The desired solution should:
Implement the ControllerExpandVolume and NodeExpandVolume RPC calls in the OnMetal CSI driver.
Advertise the EXPAND_VOLUME controller capability.
Describe alternatives you've considered
An alternative approach is to continue with the manual process of creating a new, larger volume, migrating the data, and updating the persistent volume claim (PVC) to point to the new volume. However, this is time-consuming, error-prone, and leads to increased operational overhead.
Additional context
Volume expansion is a valuable feature for users who need to adapt their storage resources to their applications' growing needs. The addition of this feature would bring the OnMetal CSI Kubernetes driver in line with other drivers and make it more attractive for users looking for a comprehensive storage solution.
Please let me know if there is any additional information needed or if there are any concerns about implementing this feature. To create this issue, visit OnMetal CSI driver repository and follow the process for creating a new issue.
Add proper resource limits to the containers in the csi-controller and csi-node YAML files.
For an example of how to add them, see: Resource Management for Pods and Containers
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-assigned_teams-1
Explanation: Does it have enough teams on GitHub? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
A violation against the OSS Rules of Play has been detected.
Rule ID: rl-reuse_tool-3
Explanation: Is it registered in REUSE? No
Find more information at: https://sap.github.io/fosstars-rating-core/oss_rules_of_play_rating.html
On the node, if the staged volume is already mounted, the driver should just log this and return instead of returning an error.
This needs to be fixed.
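The intended behaviour can be sketched as an idempotent staging helper. The `mountChecker` interface, the `fakeMounts` type and `stageVolume` are assumptions for illustration only; the real NodeStageVolume works against the host mount table and the CSI request types.

```go
package main

import "fmt"

// mountChecker is a hypothetical abstraction over the host mount table.
type mountChecker interface {
	IsMounted(target string) (bool, error)
	Mount(source, target string) error
}

// fakeMounts is an in-memory mount table: target path -> source device.
type fakeMounts map[string]string

func (f fakeMounts) IsMounted(target string) (bool, error) {
	_, ok := f[target]
	return ok, nil
}

func (f fakeMounts) Mount(source, target string) error {
	f[target] = source
	return nil
}

// stageVolume mounts source at target. If target is already mounted, it
// logs that fact and returns without an error, so repeated calls succeed.
// The boolean result reports whether the mount already existed.
func stageVolume(m mountChecker, source, target string) (bool, error) {
	mounted, err := m.IsMounted(target)
	if err != nil {
		return false, fmt.Errorf("checking mount point %q: %w", target, err)
	}
	if mounted {
		fmt.Printf("volume already staged at %s, nothing to do\n", target)
		return true, nil
	}
	return false, m.Mount(source, target)
}

func main() {
	m := fakeMounts{}
	for i := 0; i < 2; i++ {
		already, err := stageVolume(m, "/dev/vdb", "/staging/vol-1")
		fmt.Println("already staged:", already, "err:", err)
	}
}
```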
Since the gocsi dependency (https://github.com/rexray/gocsi) is not maintained, we should remove and replace it.
In the long run we will run into issues with outdated dependencies.
To make the code adhere to community standards / best practices and to make it more resilient, please check out the following tasks:
Use golangci-lint and use the .golangci.yaml from the onmetal-api. In the .github/workflows/golangci-lint.yml, remove all --disable-all parameters. These checks should not be disabled.
Avoid time.Sleep and instead observe the actual state of the component you're waiting on for the expected status.
Remove pkg/helper entirely.
Use logr instead of logrus. Initialize a logger at the entrypoint of your application. Pass it through wherever it makes sense. If a static logger is required (only exceptional cases), follow the same strategy as here.
Instead of kube_client, use the controller-runtime client. Move the initialization of the client to the entrypoint of your driver.
Remove os_ops entirely and directly call the os functions.
Restructure config to match the structure of a kubebuilder controller. This means a default kustomization, a manager deployment, and no crds for your project, as yours shouldn't have any.
Remove scripts / move any parts that are required into hack.
Use Ginkgo / Gomega and don't use the fake client set but instead write tests with kubebuilder's envtest.Environment as far as possible. Remove all custom / own self-written mocks.