Comments (7)
Having multiple copies of the same controller operating simultaneously is asking for race conditions. Currently we prevent this using controller-manager-level leader election. I am assuming that you have already given one of the two CCMs an alternate leader election name to circumvent the leader election protection. If you want to go that route, I would suggest using the `--controllers` flag (https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/controller-manager/options/generic.go#L65) to select the controllers in each CCM, so that each controller runs in only one of the two CCMs.
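As a rough illustration of that suggestion, the sketch below checks that two `--controllers` values never enable the same controller. It uses a simplified model of the flag (an explicit `name` enables, `-name` disables, `*` enables everything not mentioned). The controller names in `knownControllers` and the helpers `isEnabled` and `overlap` are assumptions made up for this example, not part of any Kubernetes API; check your own CCM for the names it registers.

```go
package main

import (
	"fmt"
	"strings"
)

// knownControllers is an assumed list of controller names for illustration;
// the real names depend on the CCM build and Kubernetes version.
var knownControllers = []string{"cloud-node", "cloud-node-lifecycle", "service", "route"}

// isEnabled is a simplified model of how a --controllers value is read:
// an explicit "name" enables, "-name" disables, and "*" enables any
// controller that is not mentioned explicitly.
func isEnabled(name string, selectors []string) bool {
	hasStar := false
	for _, s := range selectors {
		switch s {
		case name:
			return true
		case "-" + name:
			return false
		case "*":
			hasStar = true
		}
	}
	return hasStar
}

// overlap returns the known controllers that both flag values would enable.
func overlap(flagA, flagB string) []string {
	a := strings.Split(flagA, ",")
	b := strings.Split(flagB, ",")
	var both []string
	for _, name := range knownControllers {
		if isEnabled(name, a) && isEnabled(name, b) {
			both = append(both, name)
		}
	}
	return both
}

func main() {
	// Hypothetical flag values for the two CCM deployments.
	ccmA := "*,-cloud-node-lifecycle"
	ccmB := "cloud-node-lifecycle"
	if dup := overlap(ccmA, ccmB); len(dup) > 0 {
		fmt.Println("controllers enabled in both CCMs:", dup)
	} else {
		fmt.Println("no controller is enabled in both CCMs")
	}
}
```

A check like this could run in CI against the two deployment manifests, so that a later flag change does not silently start the same controller twice.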
from cloud-provider.
Sorry, I'll add more details:
I have two cloud providers with different ProviderNames (nameA/nameB), and a separate CCM deployment for each cloud provider.
Each deployment (CCM) has its own lock resource, `--leader-elect-resource-name=nameA/nameB`. That solves race conditions between replicas of one type of CCM (for one cloud provider).
All CCMs can initialize nodes and set the providerID (`nameA://instanceId` or `nameB://instanceId`).
Based on the providerID prefix (`nameX://`), each CCM can pick out its own nodes and run the node lifecycle for them.
In this case each CCM has to skip the nodes that belong to the other cloud provider. sergelogvinov@8468b74#diff-807a869d357013a377e7a6153dc3133491ca7dfb909690f0dfa2b1f2b873fa60R131-R135
The problem is here: when a node joins the cluster, it does not have a providerID yet.
One of the CCMs will delete the node because its cloud provider does not have this node; the node belongs to the other cloud provider.
If it waited a few seconds, the other CCM would initialize the node and set the providerID. sergelogvinov@8468b74#diff-807a869d357013a377e7a6153dc3133491ca7dfb909690f0dfa2b1f2b873fa60R162-R169
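The skip rule described above can be sketched roughly as follows. This is not the code from the linked commit; the `node` type, the `decide` helper and the action names are hypothetical stand-ins for this example. The idea is that a CCM leaves a node alone both when the providerID carries another provider's prefix and when the providerID is still empty, so a freshly joined node is not deleted before its own CCM has initialized it.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical, trimmed-down stand-in for a Kubernetes Node object.
type node struct {
	Name       string
	ProviderID string
}

// action is what a CCM's lifecycle loop would do with a node (illustrative only).
type action string

const (
	actionSkip   action = "skip"
	actionManage action = "manage"
)

// decide returns actionManage only for nodes whose providerID starts with
// "<providerName>://". Nodes with an empty providerID (not initialized yet)
// and nodes owned by another provider are skipped rather than deleted.
func decide(n node, providerName string) action {
	if n.ProviderID == "" {
		// Not initialized yet: ownership is unknown, so do nothing.
		return actionSkip
	}
	if !strings.HasPrefix(n.ProviderID, providerName+"://") {
		// Belongs to another cloud provider's CCM.
		return actionSkip
	}
	return actionManage
}

func main() {
	nodes := []node{
		{Name: "fresh-node"},
		{Name: "node-a", ProviderID: "nameA://instanceId"},
		{Name: "node-b", ProviderID: "nameB://instanceId"},
	}
	for _, n := range nodes {
		fmt.Printf("CCM nameA, node %s: %s\n", n.Name, decide(n, "nameA"))
	}
}
```

The trade-off in this sketch is that a node which never receives a providerID is never cleaned up by either CCM, so some timeout or manual cleanup would still be needed.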
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
I have the same question: how has anyone implemented a multi-cloud Kubernetes cluster without this?
Related: #35