
Comments (7)

andrewsykim avatar andrewsykim commented on July 27, 2024

We are able to perform the health check with the port from the Service's healthCheckNodePort, but keeping all cluster nodes as NLB members looks like huge overhead.

This is expected behavior -- as you mentioned, using the healthCheckNodePort ensures only nodes with the endpoint are used by the LB

In case externalTrafficPolicy == "local", pass only the nodes with service endpoints in the argument to EnsureLoadBalancer.
Call EnsureLoadBalancer (trigger an event) on each service endpoints update (pod scaling, migration, etc.)

My gut feeling is that this would be too costly. Services that have a lot of endpoints would churn a lot during a rolling update of the pods. Trying to add/remove backend nodes for an LB would make a lot of calls to the cloud provider, and it's also possible that the cloud provider is not always able to add/remove backends as quickly as kube-proxy would. But we can fairly safely assume that LBs respond quickly to health check failures, since they are designed for exactly this type of failure.
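
For reference, the relevant hook is the LoadBalancer interface in k8s.io/cloud-provider, paraphrased roughly below (doc comments trimmed; see that package for the authoritative definition). EnsureLoadBalancer is handed the whole candidate node list, so triggering it on every endpoints update would mean a full reconcile per pod churn event.

```go
// Package lbsketch paraphrases the LoadBalancer interface from
// k8s.io/cloud-provider; see that package for the authoritative definition.
package lbsketch

import (
	"context"

	v1 "k8s.io/api/core/v1"
)

// LoadBalancer is implemented by cloud providers that support external LBs.
// EnsureLoadBalancer and UpdateLoadBalancer both receive the full node list,
// so wiring them to endpoints updates multiplies cloud API calls.
type LoadBalancer interface {
	GetLoadBalancer(ctx context.Context, clusterName string, service *v1.Service) (status *v1.LoadBalancerStatus, exists bool, err error)
	GetLoadBalancerName(ctx context.Context, clusterName string, service *v1.Service) string
	EnsureLoadBalancer(ctx context.Context, clusterName string, service *v1.Service, nodes []*v1.Node) (*v1.LoadBalancerStatus, error)
	UpdateLoadBalancer(ctx context.Context, clusterName string, service *v1.Service, nodes []*v1.Node) error
	EnsureLoadBalancerDeleted(ctx context.Context, clusterName string, service *v1.Service) error
}
```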

from cloud-provider.

CharlieR-o-o-t avatar CharlieR-o-o-t commented on July 27, 2024

This is expected behavior -- as you mentioned, using the healthCheckNodePort ensures only nodes with the endpoint is being used by the LB

Yes, but the load balancer will show most of its members as unhealthy, because a node with no 'localEndpoints' answers the health check with a 50x error code.
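
For context, kube-proxy's health check server on the healthCheckNodePort behaves roughly like the simplified sketch below (this is not the actual kube-proxy code; the port, path, and endpoint counter are placeholders). It returns 200 only when the node has at least one local endpoint, which is why members without local endpoints go unhealthy.

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// localEndpoints is the number of Service endpoints local to this node.
// kube-proxy derives this from the endpoints it programs; here it is just
// a counter for illustration.
var localEndpoints int64

func healthz(w http.ResponseWriter, r *http.Request) {
	if atomic.LoadInt64(&localEndpoints) == 0 {
		// No local endpoints: tell the LB to stop sending traffic to this node.
		w.WriteHeader(http.StatusServiceUnavailable)
		fmt.Fprintln(w, "no local endpoints")
		return
	}
	w.WriteHeader(http.StatusOK)
	fmt.Fprintln(w, "ok")
}

func main() {
	http.HandleFunc("/healthz", healthz)
	// 30021 stands in for the allocated healthCheckNodePort.
	http.ListenAndServe(":30021", nil)
}
```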

My gut feeling is that this would be too costly. Services that have a lot of endpoints would churn a lot during a rolling update of the pods. Trying to add/remove backend nodes for an LB would make a lot of calls to the cloud provider, and it's also possible that the cloud provider is not always able to add/remove backends as quickly as kube-proxy would. But we can fairly safely assume that LBs respond quickly to health check failures, since they are designed for exactly this type of failure.

Thank you, I agree with it.

from cloud-provider.

k8s-triage-robot avatar k8s-triage-robot commented on July 27, 2024

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

from cloud-provider.

k8s-triage-robot avatar k8s-triage-robot commented on July 27, 2024

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

from cloud-provider.

k8s-triage-robot avatar k8s-triage-robot commented on July 27, 2024

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

from cloud-provider.

k8s-ci-robot avatar k8s-ci-robot commented on July 27, 2024

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

from cloud-provider.

AllenXu93 avatar AllenXu93 commented on July 27, 2024

We are able to perform the health check with the port from the Service's healthCheckNodePort, but keeping all cluster nodes as NLB members looks like huge overhead.

This is expected behavior -- as you mentioned, using the healthCheckNodePort ensures only nodes with the endpoint are used by the LB

In case externalTrafficPolicy == "local", pass only the nodes with service endpoints in the argument to EnsureLoadBalancer.
Call EnsureLoadBalancer (trigger an event) on each service endpoints update (pod scaling, migration, etc.)

My gut feeling is that this would be too costly. Services that have a lot of endpoints would churn a lot during a rolling update of the pods. Trying to add/remove backend nodes for an LB would make a lot of calls to the cloud provider, and it's also possible that the cloud provider is not always able to add/remove backends as quickly as kube-proxy would. But we can fairly safely assume that LBs respond quickly to health check failures, since they are designed for exactly this type of failure.

I think watching endpoints is necessary.
If a cluster has thousands of nodes, every LoadBalancer Service adds thousands of LB members, one per node, but only a few of them actually serve traffic (the Service may be backed by only one or two pods). And in some other cases, such as a layer-3 network, the LB can route directly to pod IPs.
I agree that watching every endpoints update is too costly. In our case, we add a label to the LB Service's endpoints and only watch changes of that labeled subset, as in the sketch below.
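
A minimal sketch of that label-filtered watch with client-go; the label key, resync period, and kubeconfig handling are made up for illustration.

```go
package main

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (error handling trimmed).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(cfg)

	// Only list/watch Endpoints carrying the (hypothetical) label that marks
	// them as backing a LoadBalancer Service, instead of every Endpoints
	// object in the cluster.
	factory := informers.NewSharedInformerFactoryWithOptions(
		clientset, 30*time.Second,
		informers.WithTweakListOptions(func(o *metav1.ListOptions) {
			o.LabelSelector = "example.com/lb-service=true"
		}),
	)

	epInformer := factory.Core().V1().Endpoints().Informer()
	epInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			// Reconcile only the LB backends for the affected Service here.
		},
	})

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	<-stop // run until killed
}
```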

from cloud-provider.
