kubernetes-csi / csi-lib-utils
Common code for Kubernetes CSI sidecar containers (e.g. `external-attacher`, `external-provisioner`, etc.)
License: Apache License 2.0
Let's standardize on using the Lease mechanism for all the sidecars, and try to have common arguments too.
I have observed the same kind of issue in various kubernetes-csi projects.
It happens because many files were modified or moved between directories after the localization work.
This page has the same issue:
the link to the contributor cheat sheet is broken
and needs to be fixed.
I will look through the other CSI repos as well and fix them as soon as I can.
/kind bug
/assign
protosanitizer.StripSecrets marshals and unmarshals every request to identify sensitive information and replace it. This operation seems to be too costly. Without it, many CSI drivers (or other components) can print secrets in their logs when secrets are configured, e.g. in the StorageClass. The impact of the issue is little to none, but it would still be good to have all logging sanitized.
I attempted to fix this in the GCP driver (kubernetes-sigs/gcp-compute-persistent-disk-csi-driver#747); however, the fix was reverted precisely because of the performance impact of the StripSecrets function.
Would it be possible, for example, to identify or replace the secrets without the expensive JSON operations? Any other ideas are welcome.
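One direction, sketched below under loud assumptions: if the caller already knows which field holds the secrets, it can wrap that map in a type whose String method redacts the values, so nothing is marshalled and the work only happens when the log line is actually formatted. The `redactedSecrets` type, the `render` helper, and the output format are hypothetical and are not part of protosanitizer; the real function finds secret fields generically, which this sketch does not attempt.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// redactedSecrets is a hypothetical wrapper around a secrets map.
// Formatting it prints the keys but never the values.
type redactedSecrets map[string]string

// String implements fmt.Stringer; it runs only when the value is formatted.
func (r redactedSecrets) String() string {
	keys := make([]string, 0, len(r))
	for k := range r {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic output regardless of map order

	var b strings.Builder
	b.WriteString("{")
	for i, k := range keys {
		if i > 0 {
			b.WriteString(", ")
		}
		b.WriteString(k)
		b.WriteString(": ***stripped***")
	}
	b.WriteString("}")
	return b.String()
}

// render shows how a log call might format the wrapped map.
func render(secrets map[string]string) string {
	return fmt.Sprintf("secrets=%v", redactedSecrets(secrets))
}

func main() {
	fmt.Println(render(map[string]string{"user": "admin", "password": "hunter2"}))
}
```

The trade-off is that this only covers fields the caller wraps explicitly, so it cannot replace a generic sanitizer on its own.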
As a suggestion to align with the following structured logging guidelines, I've explored updating the klog functions in csi-lib-utils:
https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/migration-to-structured-logging.md
I noticed that several CSI Sidecars like external-snapshotter rely on csi-lib-utils. To potentially enhance structured logging support in these Sidecars, adopting structured logging in csi-lib-utils could be beneficial.
The current process_start_time_seconds metric does not work because of a bug in the component-base library. The fix is here:
kubernetes/kubernetes#96435
However, it can take a long time before we can upgrade the dependency, so until then we can use the workaround in #68.
This issue tracks reverting that workaround once the dependency is upgraded.
As CSI secret handling logic can be used by multiple controllers, it should be moved to this repository.
Currently all the sidecars copy-paste the same connection logic.
/help
We should run `make test` or some equivalent of it before each PR merge. As this is a Kubernetes repo and we already use prow for merging, we should also use prow for testing.
We need a new release so we can propagate the new `Connect` function to all sidecar containers.
/assign @saad-ali
The exact root cause is still uncertain, but when the apiserver is having problems, the CSI sidecars fail to get the leader election lease with this error:
"error retrieving resource lock kube-system/external-attacher-leader-my-driver: Get https://localhost:443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/external-attacher-leader-my-driver: write tcp [::1]:53540->[::1]:443: write: broken pipe"
Even after the apiserver comes back up, this error continues and never recovers. This is apparently intended behavior, and the fix is to enable the watchdog so that kubelet can restart the container: https://github.com/kubernetes/client-go/blob/master/tools/leaderelection/healthzadaptor.go#L25
In-tree controllers like kube-controller-manager already set this.
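To illustrate the watchdog idea, here is a minimal stdlib-only sketch, not the client-go adaptor itself: a health endpoint that starts failing once the lease has not been renewed for longer than a tolerated window, so a kubelet liveness probe pointed at it would restart the container. The `leaseWatchdog` type, its methods, and the `/healthz/leader-election` path are hypothetical names chosen for the example, and the real adaptor's conditions are more detailed than this.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// leaseWatchdog is a hypothetical, simplified stand-in for a
// leader-election health check.
type leaseWatchdog struct {
	mu        sync.Mutex
	lastRenew time.Time
	timeout   time.Duration
	now       func() time.Time // injectable clock, so the demo needs no sleeping
}

func newLeaseWatchdog(timeout time.Duration, now func() time.Time) *leaseWatchdog {
	return &leaseWatchdog{lastRenew: now(), timeout: timeout, now: now}
}

// Renewed records a successful lease renewal.
func (w *leaseWatchdog) Renewed() {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.lastRenew = w.now()
}

// Check returns an error once the last renewal is older than the timeout.
func (w *leaseWatchdog) Check() error {
	w.mu.Lock()
	defer w.mu.Unlock()
	if age := w.now().Sub(w.lastRenew); age > w.timeout {
		return fmt.Errorf("lease not renewed for %s (limit %s)", age, w.timeout)
	}
	return nil
}

// ServeHTTP makes the watchdog usable as a liveness probe target.
func (w *leaseWatchdog) ServeHTTP(rw http.ResponseWriter, _ *http.Request) {
	if err := w.Check(); err != nil {
		http.Error(rw, err.Error(), http.StatusInternalServerError)
		return
	}
	fmt.Fprintln(rw, "ok")
}

// healthzStatusAfter reports the HTTP status a probe would see when the
// last renewal happened idle ago, using a fake clock.
func healthzStatusAfter(idle, timeout time.Duration) int {
	clock := time.Unix(0, 0)
	w := newLeaseWatchdog(timeout, func() time.Time { return clock })
	clock = clock.Add(idle)
	rec := httptest.NewRecorder()
	w.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/healthz/leader-election", nil))
	return rec.Code
}

func main() {
	fmt.Println(healthzStatusAfter(5*time.Second, 20*time.Second))
	fmt.Println(healthzStatusAfter(60*time.Second, 20*time.Second))
}
```

In a real sidecar the check would be wired to the leader elector rather than to a manual `Renewed` call; the sketch only shows why a stuck elector becomes visible to kubelet.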
Moving kubernetes-csi/driver-registrar#76 to here.
I'm proposing to add some basic tracing instrumentation of the gRPC calls made by the CSI clients using the code in connection.go to connect to the CSI driver.
The feature simply consists of adding otelgrpc.UnaryClientInterceptor() to the existing ChainUnaryInterceptor. This is enough to create traces for all gRPC calls, and they can be easily exported by the client.
I think it can be done by creating a new `ConnectWithGrpcInterceptor` function that calls the `connect` function with the added interceptor as a `DialOption`. This way the feature will be opt-in for users.
I can contribute it (and the implementation in the CSI components) if it is something the community wants to see. I already implemented something similar in the azuredisk-csi-driver and plan to do the same for the aws and gcp drivers.
While going through this PR, we found that @ConnorJC3 and team introduced a timeout limit as part of bbcd132.
Can someone please provide insight into the rationale behind implementing this timeout?
Many CSI sidecars rely on this utility. Imagine a scenario with two controllers where only one is the leader; the non-leading controller is unable to connect. In the previous setup with an infinite timeout, the controller kept trying to establish a connection indefinitely. As a result, the controller pod remained in the Running state and avoided CrashLoopBackOff.
Once the pod assumed the leader role, the session was established seamlessly, but because of this change the pod is now in CrashLoopBackOff.
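For discussion, the two behaviors can be compared in a small stdlib-only sketch. It assumes a hypothetical `connectWithRetry` helper in which a timeout of 0 keeps the old "retry forever" behavior and a positive timeout bounds the total retry time; neither the helper nor `simulateConnect` is csi-lib-utils API, and the fake dialer stands in for the real gRPC dial.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// connectWithRetry keeps calling dial until it succeeds or the context ends.
// A timeout of 0 means "retry forever"; a positive timeout bounds the total
// time spent retrying. Hypothetical helper, for illustration only.
func connectWithRetry(ctx context.Context, timeout, interval time.Duration, dial func() error) error {
	if timeout > 0 {
		var cancel context.CancelFunc
		ctx, cancel = context.WithTimeout(ctx, timeout)
		defer cancel()
	}
	for {
		err := dial()
		if err == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("giving up: %w (last dial error: %v)", ctx.Err(), err)
		case <-time.After(interval):
			// wait, then try again
		}
	}
}

// simulateConnect runs connectWithRetry against a fake dialer that fails
// the given number of times before succeeding, and reports how many dials
// were made.
func simulateConnect(failures, timeoutMillis int) (int, error) {
	attempts := 0
	dial := func() error {
		attempts++
		if attempts <= failures {
			return errors.New("driver socket not ready")
		}
		return nil
	}
	timeout := time.Duration(timeoutMillis) * time.Millisecond
	err := connectWithRetry(context.Background(), timeout, time.Millisecond, dial)
	return attempts, err
}

func main() {
	// No timeout: succeeds on the third dial.
	attempts, err := simulateConnect(2, 0)
	fmt.Println(attempts, err)

	// 20ms timeout against a dialer that effectively never succeeds.
	_, err = simulateConnect(1<<30, 20)
	fmt.Println(err != nil)
}
```

With the bounded variant the process exits with an error and relies on the container restart policy, which is where the CrashLoopBackOff described above would come from.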
I see that the functions below are present in both the `connection` and `rpc` packages:
func GetDriverName(ctx context.Context, conn *grpc.ClientConn) (string, error)
func GetPluginCapabilities(ctx context.Context, conn *grpc.ClientConn)
func GetControllerCapabilities(ctx context.Context, conn *grpc.ClientConn) (ControllerCapabilitySet, error)
func ProbeForever(conn *grpc.ClientConn, singleProbeTimeout time.Duration) error
func probeOnce(conn *grpc.ClientConn, timeout time.Duration)
func Probe(ctx context.Context, conn *grpc.ClientConn) (ready bool, err error)
Is this intentional, or do we need to remove the duplicate functions and use a single package?
If they need to be removed, I can do it.
Currently, some of the CSI sidecars support pprof profiling (like the azuredisk-csi-driver, the secrets-store-csi-driver or the node-driver-registrar), but not all. I believe it could be useful to add this option to all CSI components, and this repo seems like a good place to do it.
It could probably go in the metrics.go file, in the `CSIMetricsManager` interface, with a `RegisterPprofToServer` function that would register the pprof handlers.
What do you folks think about it? I would be happy to contribute this (and the implementation in the CSI sidecars) if that is something folks are willing to see.
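As a rough sketch of what such a registration could look like, the example below attaches the standard `net/http/pprof` handlers to a caller-supplied mux. The `registerPprof` and `pprofStatus` names are hypothetical and do not reflect any existing csi-lib-utils signature; only the `pprof.Index`, `pprof.Cmdline`, `pprof.Profile`, `pprof.Symbol` and `pprof.Trace` handlers come from the Go standard library.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// registerPprof attaches the standard pprof handlers to mux.
// Hypothetical helper; a metrics manager could call something like it
// only when a profiling flag is set, keeping the feature opt-in.
func registerPprof(mux *http.ServeMux) {
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
}

// pprofStatus reports the HTTP status for path on a fresh mux, with or
// without the pprof handlers registered.
func pprofStatus(path string, enabled bool) int {
	mux := http.NewServeMux()
	if enabled {
		registerPprof(mux)
	}
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	fmt.Println(pprofStatus("/debug/pprof/cmdline", true))
	fmt.Println(pprofStatus("/debug/pprof/cmdline", false))
}
```

Using a dedicated mux rather than `http.DefaultServeMux` keeps the endpoints off any server that did not ask for them.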
Hi,
It seems the CSI lease timers are all hard-coded. Could they be made configurable,
as they are for other leases such as kube-scheduler and kube-controller-manager, in case the sidecars are used in a slow environment?
defaultLeaseDuration = 15 * time.Second
defaultRenewDeadline = 10 * time.Second
defaultRetryPeriod = 5 * time.Second
thanks,
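A minimal sketch of how the three timers could be exposed as flags while keeping the defaults listed above. The flag names, the `leaseConfig` struct, and `parseLeaseFlags` are assumptions made for the example, and the ordering check (lease duration > renew deadline > retry period) is a conservative sanity check rather than a copy of any library's validation.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"time"
)

// leaseConfig holds the three leader-election timers (hypothetical type).
type leaseConfig struct {
	LeaseDuration time.Duration
	RenewDeadline time.Duration
	RetryPeriod   time.Duration
}

// parseLeaseFlags parses hypothetical flags, falling back to the
// currently hard-coded defaults when a flag is not given.
func parseLeaseFlags(args []string) (leaseConfig, error) {
	var cfg leaseConfig
	fs := flag.NewFlagSet("sidecar", flag.ContinueOnError)
	fs.DurationVar(&cfg.LeaseDuration, "leader-election-lease-duration", 15*time.Second,
		"how long non-leaders wait before trying to acquire the lease")
	fs.DurationVar(&cfg.RenewDeadline, "leader-election-renew-deadline", 10*time.Second,
		"how long the leader keeps retrying a renewal before giving up")
	fs.DurationVar(&cfg.RetryPeriod, "leader-election-retry-period", 5*time.Second,
		"how long to wait between acquire or renew attempts")
	if err := fs.Parse(args); err != nil {
		return cfg, err
	}
	// Assumed sanity check: the timers must be strictly ordered.
	if cfg.RenewDeadline >= cfg.LeaseDuration {
		return cfg, errors.New("renew deadline must be shorter than lease duration")
	}
	if cfg.RetryPeriod >= cfg.RenewDeadline {
		return cfg, errors.New("retry period must be shorter than renew deadline")
	}
	return cfg, nil
}

func main() {
	cfg, err := parseLeaseFlags([]string{
		"-leader-election-lease-duration=60s",
		"-leader-election-renew-deadline=40s",
	})
	fmt.Println(cfg.LeaseDuration, cfg.RenewDeadline, cfg.RetryPeriod, err)
}
```

A slow environment could then pass longer durations on the command line without any change to the defaults for everyone else.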
Now:
k8s.io/client-go v1.19.0
Should be:
k8s.io/client-go v0.19.0
Error message:
go: github.com/kubernetes-csi/[email protected] requires
k8s.io/client-go@v1.19.0: reading k8s.io/client-go/go.mod at revision v1.19.0: unknown revision v1.19.0