
Comments (20)

aojea avatar aojea commented on June 16, 2024 2

I may be missing something, but what happens if something has changed that causes the kubelet to not be ready? Imagine the network plugin is not working: during the initial period, until the check changes the state, pods will be scheduled on the node and will fail.

I think we are operating on the assumption that this is a restart and that nothing changed during that time, but is this a safe assumption? Can we somehow guarantee that nothing has changed that could impact the node readiness state?

from kubernetes.

k8s-ci-robot avatar k8s-ci-robot commented on June 16, 2024

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


AllenXu93 avatar AllenXu93 commented on June 16, 2024

/sig node


k8s-ci-robot avatar k8s-ci-robot commented on June 16, 2024

@AllenXu93: The label(s) sig/ cannot be applied, because the repository doesn't have them.

In response to this:

/sig node

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


HirazawaUi avatar HirazawaUi commented on June 16, 2024

I can reproduce this issue in v1.30

When the container runtime is healthy, the kubelet should not report KubeletNotReady with the reason runtime status check may not have completed yet

We call kl.updateRuntimeUp() in the fastStatusUpdateOnce function to update the status of the container runtime, but we do not do this in syncNodeStatus:

// This is in addition to the regular syncNodeStatus logic so we can get the container runtime status earlier.
// This function itself has a mutex and it doesn't recursively call fastNodeStatusUpdate or syncNodeStatus.
kl.updateRuntimeUp()

We only need to execute the kl.updateRuntimeUp() function once before fastStatusUpdateOnce and syncNodeStatus, just as we do in the fastStatusUpdateOnce function; then this problem can be avoided.

But in #122338 (comment), @aojea seems to have some different opinions...
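The proposed ordering can be sketched as a toy model (hypothetical names, not the real kubelet API): whether the first sync reports Ready depends only on whether a runtime check completed beforehand.

```go
package main

import (
	"fmt"
	"time"
)

// firstSyncResult reports the readiness a first syncNodeStatus would
// compute, given whether updateRuntimeUp ran beforehand. This is a toy
// stand-in for the kubelet logic, not the real API.
func firstSyncResult(checkRuntimeFirst bool) string {
	var lastRuntimeSync time.Time // zero until a runtime check completes
	if checkRuntimeFirst {
		lastRuntimeSync = time.Now() // what a successful runtime check records
	}
	if lastRuntimeSync.IsZero() {
		return "NotReady: runtime status check may not have completed yet"
	}
	return "Ready"
}

func main() {
	fmt.Println(firstSyncResult(false)) // current behavior on restart
	fmt.Println(firstSyncResult(true))  // with the proposed ordering
}
```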


AllenXu93 avatar AllenXu93 commented on June 16, 2024

> I can reproduce this issue in v1.30
>
> When the container runtime is healthy, the kubelet should not report KubeletNotReady with the reason runtime status check may not have completed yet
>
> We call kl.updateRuntimeUp() in the fastStatusUpdateOnce function to update the status of the container runtime, but we do not do this in syncNodeStatus:
>
> // This is in addition to the regular syncNodeStatus logic so we can get the container runtime status earlier.
> // This function itself has a mutex and it doesn't recursively call fastNodeStatusUpdate or syncNodeStatus.
> kl.updateRuntimeUp()
>
> We only need to execute the kl.updateRuntimeUp() function once before fastStatusUpdateOnce and syncNodeStatus, just as we do in the fastStatusUpdateOnce function; then this problem can be avoided.
>
> But in #122338 (comment), @aojea seems to have some different opinions...

Yeah, in our env we modified the kubelet in the same way you describe, executing kl.updateRuntimeUp once before syncNodeStatus, and it worked.
I can submit a PR later.


AllenXu93 avatar AllenXu93 commented on June 16, 2024

> I may be missing something, but what happens if something has changed that causes the kubelet to not be ready? Imagine the network plugin is not working: during the initial period, until the check changes the state, pods will be scheduled on the node and will fail.
>
> I think we are operating on the assumption that this is a restart and that nothing changed during that time, but is this a safe assumption? Can we somehow guarantee that nothing has changed that could impact the node readiness state?

In this issue, what I found is that nothing has changed: every time I restart the kubelet, the node becomes NotReady only in the first sync period; in the next period it is Ready. It's not an assumption.
A broken network plugin or other problems will of course cause the node to be NotReady, but they are not in this issue's scope. Here, the NotReady reason is the container runtime status check may not have completed yet message, which has no relationship with those other problems.


aojea avatar aojea commented on June 16, 2024

> But in #122338 (comment), @aojea seems to have some different opinions...

Ok, I misread this issue, sorry. So it is not about blindly setting the node to Ready; it is just about performing the runtime check before the other checks. I think you are both right. Actually, it seems that if fastStatusUpdateOnce wins the race then there is no problem, right @HirazawaUi?


HirazawaUi avatar HirazawaUi commented on June 16, 2024

> Actually, it seems that if fastStatusUpdateOnce wins the race then there is no problem, right @HirazawaUi?

Yes, but fastStatusUpdateOnce and syncNodeStatus run in different goroutines, so we cannot guarantee that fastStatusUpdateOnce will complete first.

So it seems like a good choice to execute updateRuntimeUp before running syncNodeStatus. Or do you have a better suggestion?


aojea avatar aojea commented on June 16, 2024
diff --git a/pkg/kubelet/kubelet.go b/pkg/kubelet/kubelet.go
index af74a095628..cd8acc7fbf0 100644
--- a/pkg/kubelet/kubelet.go
+++ b/pkg/kubelet/kubelet.go
@@ -1626,23 +1626,27 @@ func (kl *Kubelet) Run(updates <-chan kubetypes.PodUpdate) {
        // Start volume manager
        go kl.volumeManager.Run(kl.sourcesReady, wait.NeverStop)
 
+       // Check the container runtime status.
+       // This has to run before kl.syncNodeStatus (https://issues.k8s.io/124397)
+       go wait.Until(kl.updateRuntimeUp, 5*time.Second, wait.NeverStop)
+
        if kl.kubeClient != nil {
                // Start two go-routines to update the status.
                //
-               // The first will report to the apiserver every nodeStatusUpdateFrequency and is aimed to provide regular status intervals,
-               // while the second is used to provide a more timely status update during initialization and runs an one-shot update to the apiserver
+               // The first is used to provide a more timely status update during initialization and runs a one-shot update to the apiserver
                // once the node becomes ready, then exits afterwards.
+               go kl.fastStatusUpdateOnce()
+
+               // The second will report to the apiserver every nodeStatusUpdateFrequency and is aimed to provide regular status intervals,
                //
                // Introduce some small jittering to ensure that over time the requests won't start
                // accumulating at approximately the same time from the set of nodes due to priority and
                // fairness effect.
                go wait.JitterUntil(kl.syncNodeStatus, kl.nodeStatusUpdateFrequency, 0.04, true, wait.NeverStop)
-               go kl.fastStatusUpdateOnce()
 
                // start syncing lease
                go kl.nodeLeaseController.Run(context.Background())
        }
-       go wait.Until(kl.updateRuntimeUp, 5*time.Second, wait.NeverStop)
 
        // Set up iptables util rules
        if kl.makeIPTablesUtilChains {

https://go.dev/play/p/752NWud709S

We need to put the goroutines in the right order to make the startup more predictable.


AllenXu93 avatar AllenXu93 commented on June 16, 2024

> But in #122338 (comment), @aojea seems to have some different opinions...
>
> Ok, I misread this issue, sorry. So it is not about blindly setting the node to Ready; it is just about performing the runtime check before the other checks. I think you are both right. Actually, it seems that if fastStatusUpdateOnce wins the race then there is no problem, right @HirazawaUi?

Yes, if fastStatusUpdateOnce executes very fast there may be no problem, because fastStatusUpdateOnce itself executes updateRuntimeUp.
I have tested v1.28 (whose fastStatusUpdateOnce code is almost the same as 1.30): it does not always reproduce, but it still occurs sometimes.
In 1.22, fastStatusUpdateOnce sleeps for about 100ms before everything else, so it can be reproduced every time.


HirazawaUi avatar HirazawaUi commented on June 16, 2024

> We need to put the goroutines in the right order to make the startup more predictable

Arranging the goroutines in order can solve our problems in most cases, but in the updateRuntimeUp method we need to call the container runtime API to get the status. I am worried that its response will not be timely enough in special scenarios; there may still be a risk that syncNodeStatus completes faster.

func (kl *Kubelet) updateRuntimeUp() {
	kl.updateRuntimeMux.Lock()
	defer kl.updateRuntimeMux.Unlock()
	ctx := context.Background()
	s, err := kl.containerRuntime.Status(ctx)


aojea avatar aojea commented on June 16, 2024

> Arranging the goroutines in order can solve our problems in most cases

Moving go wait.Until(kl.updateRuntimeUp, 5*time.Second, wait.NeverStop) earlier makes it execute first: https://go.dev/play/p/752NWud709S


HirazawaUi avatar HirazawaUi commented on June 16, 2024

> Moving go wait.Until(kl.updateRuntimeUp, 5*time.Second, wait.NeverStop) earlier makes it execute first: https://go.dev/play/p/752NWud709S

I think I must not have expressed myself clearly.
What I'm worried about is that under special circumstances, updateRuntimeUp calls the container runtime API and cannot return immediately (I'm not sure whether this situation really exists). Even if the updateRuntimeUp method is executed first, it may still complete later than syncNodeStatus :)


AllenXu93 avatar AllenXu93 commented on June 16, 2024

> Arranging the goroutines in order can solve our problems in most cases
>
> Moving go wait.Until(kl.updateRuntimeUp, 5*time.Second, wait.NeverStop) earlier makes it execute first: https://go.dev/play/p/752NWud709S

To add some detail:
in updateRuntimeUp, the kubelet calls the runtime API to check the runtime status and then sets the lastBaseRuntimeSync variable:

kl.runtimeState.setRuntimeSync(kl.clock.Now())

In syncNodeStatus, the ready condition checks the lastBaseRuntimeSync variable; if lastBaseRuntimeSync is still zero, this issue's problem occurs:

nodestatus.ReadyCondition(kl.clock.Now, kl.runtimeState.runtimeErrors, kl.runtimeState.networkErrors, kl.runtimeState.storageErrors,

if s.lastBaseRuntimeSync.IsZero() {

Even if the updateRuntimeUp goroutine is started first, the container runtime check still takes some time, so there is no guarantee that lastBaseRuntimeSync has been set by the time syncNodeStatus first runs.


aojea avatar aojea commented on June 16, 2024

Are you suggesting locking syncNodeStatus on updateRuntimeUp?

I'm afraid of some corner cases we could hit: if the container runtime does not return, do we block forever?


AllenXu93 avatar AllenXu93 commented on June 16, 2024

> Are you suggesting locking syncNodeStatus on updateRuntimeUp?
>
> I'm afraid of some corner cases we could hit: if the container runtime does not return, do we block forever?

The container runtime API call has a timeout:

func (r *remoteRuntimeService) Status(ctx context.Context, verbose bool) (*runtimeapi.StatusResponse, error) {
	klog.V(10).InfoS("[RemoteRuntimeService] Status", "timeout", r.timeout)
	ctx, cancel := context.WithTimeout(ctx, r.timeout)
	defer cancel()
	return r.statusV1(ctx, verbose)
}


AnishShah avatar AnishShah commented on June 16, 2024

Can you try this on a newer version? 1.22 is out of support and we have made changes to node readiness since.

/triage needs-information


AllenXu93 avatar AllenXu93 commented on June 16, 2024

> Can you try this on a newer version? 1.22 is out of support and we have made changes to node readiness since.
>
> /triage needs-information

I have tried 1.28 and it reproduces there; @HirazawaUi can reproduce it in 1.30.


HirazawaUi avatar HirazawaUi commented on June 16, 2024

/remove-triage needs-information

