I don't believe there is a clean way right now to do one-off/side-effecting (and often

Feature request for supporting side-effects on add/delete about metacontroller HOT 7 CLOSED

googlecloudplatform commented on September 24, 2024

Feature request for supporting side-effects on add/delete

from metacontroller.

Comments (7)

rlguarino commented on September 24, 2024

That's an interesting idea, can you give an example of what you would do with these? Witnessing a create or delete of a resource in the Kubernetes API with the interface that metacontroller uses is not reliable. The k8s client ecosystem is built around the paradigm of reconciling state, not processing events.

If metacontroller registered an admission controller, it would be possible to reliably send creation and deletion notifications as sync requests.


nikhilk commented on September 24, 2024

A concrete example is to create GCS buckets/objects, or entries in a db, or publishing to a pub/sub topic etc. etc. What is the alternative to handling existing systems that aren't and won't be in k8s?

I am not sure what you mean by the admission controller idea, but at a practical level, I want to write a single set of hooks constituting all the logic associated with my CRD/controller, and create a single metacontroller controller that is responsible for invoking those hooks. So yes, having metacontroller do the necessary additional plumbing would be helpful IMO.


enisoc commented on September 24, 2024

In the conventional Kubernetes controller model, it's considered an anti-pattern if the control code needs to be told whether an object is being added, or whether it already existed and is being updated. The idea is that controllers should examine the actual state of the world and act to bring it towards the desired state, regardless of what it may have already done before.

For example, suppose your code receives an "added" message for an object in your CRD, and creates a GCS bucket in response. Later, some other actor deletes the GCS bucket. Since the state of the world no longer matches the desired state, your controller ought to recreate the GCS bucket. However, if you only put the "create GCS bucket" code in your "object added" handler, you'll fail to fix the problem.

Instead, on each invocation, your code should assume nothing about what happened in the past. If the desired state is that a certain GCS bucket ought to exist, you should check whether it exists, and create it if necessary. This kind of continuous, level-triggered reconciliation is what makes Kubernetes resilient to complex and unforeseen combinations of system states, in which an event-triggered system would get stuck.
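As a minimal sketch of that check-then-create approach (not Metacontroller's API): `FakeBucketStore` and `reconcile` are hypothetical names, and the in-memory store only stands in for an external system such as GCS.

```python
from typing import Set


class FakeBucketStore:
    """Hypothetical in-memory stand-in for an external system such as GCS."""

    def __init__(self) -> None:
        self.buckets: Set[str] = set()

    def exists(self, name: str) -> bool:
        return name in self.buckets

    def create(self, name: str) -> None:
        self.buckets.add(name)


def reconcile(store: FakeBucketStore, desired_bucket: str) -> str:
    """Level-triggered: look at the actual state, assume nothing about history."""
    if store.exists(desired_bucket):
        return "unchanged"
    # Either this is the first sync, or someone deleted the bucket.
    # The code does not need to know which; it just closes the gap.
    store.create(desired_bucket)
    return "created"
```

Calling `reconcile` repeatedly is harmless, and if another actor removes the bucket between calls, the next call recreates it without needing an "added" event.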


nikhilk commented on September 24, 2024

Don't disagree with the thinking - but wonder if in reality things are a bit more nuanced, and you need to break away knowingly.

For example, in the GCS example, you can theoretically create when needed, but what if in your system it matters when the object was created, i.e. the timestamp is interesting metadata. So [re-]creating the object in a future sync won't achieve the required semantics, and in fact in that particular scenario, it might be fine to deal with an inadvertent delete as an explicit/1st class error scenario.

For something a bit more realistic, what about situations where you don't even have the option to check/reconcile? Eg. sending a pub/sub notification when something happens, or to bring it a bit closer to home (with respect to metacontroller), what if you yourself have to invoke a user-provided web hook (in an external system) on deletion?

As a workaround, I can build my own persistence layer that records whether the notification has been sent or not, and send it. However, will my hook even be triggered for a parent resource being deleted, for my code to be able to look up this controller-specific record of things?

I guess my meta-point here is that things are generally more varied, esp. in a world where the k8s system fits into a larger system, and there is a distinction between best practices/guidelines and hard limitations.

Thoughts?


enisoc commented on September 24, 2024

> what if in your system it matters when the object was created, i.e. the timestamp is interesting metadata. So [re-]creating the object in a future sync won't achieve the required semantics

Kubernetes controllers are inherently asynchronous, so there will always be some delta between when the user's "create CRD" request finished, and when you actually created the GCS bucket in response. If recreating "later" is undesired, that implies there is some threshold at which the timestamp delta is too big to be acceptable. Your sync logic would make this explicit:

If the bucket doesn't exist:
- If creating it now will still satisfy the time delta requirement:
  - Create the bucket.
- Else:
  - Treat it as a special error case. Either we were too slow, or it got deleted, but in any case our creation timestamp will be too far off at this point.
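The decision list above could be sketched as a pure function. This is only an illustration: `decide_bucket_action`, its parameters, and the returned action strings are hypothetical, and the acceptable delta is whatever your system requires.

```python
from datetime import datetime, timedelta, timezone


def decide_bucket_action(
    bucket_exists: bool,
    parent_created_at: datetime,
    now: datetime,
    max_delta: timedelta,
) -> str:
    """Return "none", "create", or "error" for the current sync pass."""
    if bucket_exists:
        return "none"
    if now - parent_created_at <= max_delta:
        # Creating now still satisfies the timestamp requirement.
        return "create"
    # Too slow, or the bucket was deleted: surface this as an explicit error
    # instead of silently recreating with a misleading creation timestamp.
    return "error"
```

Because the threshold is an explicit input, the "too late to recreate" case becomes a visible state of the system rather than something implied by which event handler happened to run.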

I understand this creation timestamp example wasn't meant to be fully representative of the set of scenarios you're thinking about, but my meta-point is that we've found most things in the world of infra can be expressed in a level-triggered formulation like this, and additionally that doing so provides significant benefits for overall system reliability (automatic self-healing) and debuggability (thinking in terms of independent agents with goals rather than a web of interconnected signal propagators).

I agree that there remain cases when it's necessary to break away from this pattern knowingly, but our experience so far has shown it's much less common than most people think. As a result, one of Metacontroller's design goals is to lay out rails that encourage level-triggered semantics as much as possible. I'm willing to consider event-triggered escape hatches, but I'd want to see strong evidence that a reasonably comprehensible level-triggered formulation of those use cases does not exist.

> what about situations where you don't even have the option to check/reconcile? Eg. sending a pub/sub notification when something happens

In general, the underlying watch stream that the Kubernetes API server provides is not a good fit for use cases that expect at-least- or at-most-once semantics; it doesn't even try to provide approximately-once. For example, at any time you might be forced to re-list the current state of objects, and have no way to find out what events happened before the point-in-time of that re-list. The API server is only designed to support eventually consistent, level-triggered systems that are fine with this.

Metacontroller could connect you up to the "add" event from this stream directly, but it wouldn't be a good idea to send a pub/sub event from that handler unless it does its own deduplication. The "add" handler would still have the potential to get called any number of times >=0 because of the nature of the underlying watch stream. If your pub/sub triggering has its own deduplication, you'd be better off sending it from the sync hook, which unlike "add" is guaranteed to get called at-least-once-eventually if the object survives long enough. If you want to optimize for lower load on the deduper, you could mark in your returned Status that you sent the event; you still might send multiple times, but there will be a reasonable cap.
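A rough sketch of sending the event from the sync hook and recording that in the returned status. It assumes a CompositeController-style request with a `parent` object and a response containing `status` and `children`; the `eventSent` status field and the `publish` callback (which is expected to deduplicate on the key it is given) are hypothetical.

```python
from typing import Any, Callable, Dict


def sync(
    request: Dict[str, Any],
    publish: Callable[[str, Dict[str, Any]], None],
) -> Dict[str, Any]:
    """Send a notification at least once and record that in the status."""
    parent = request["parent"]
    status = dict(parent.get("status") or {})
    if not status.get("eventSent"):
        # The UID is stable for the life of the object, so the downstream
        # deduper can drop repeats if this status update is lost and we resend.
        dedup_key = parent["metadata"]["uid"]
        publish(dedup_key, {"name": parent["metadata"]["name"]})
        status["eventSent"] = True
    return {"status": status, "children": []}
```

The status flag is only an optimization to reduce load on the deduper. If the status write does not land before the next sync, the event is published again with the same key, which is why the receiving side still has to deduplicate.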

> what if you yourself have to invoke a user provided web hook (in an external system) on deletion
> [...]
> will my hook even be triggered for a parent resource being deleted

Currently, no, but this is planned: #60. In that case, though, the finalize hook you write must still be idempotent because it might be called and retried multiple times. Similar to the pub/sub example above, this won't help you if you require the semantics of a deduped message queue.
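To illustrate what "idempotent" means for such a hook, here is a sketch that does not reflect the eventual hook API for #60: `finalize`, its arguments, and the set standing in for an external system are all hypothetical.

```python
from typing import Set


def finalize(external_resources: Set[str], name: str) -> bool:
    """Clean up the external resource; safe to call any number of times.

    Returns True once nothing is left to clean up.
    """
    # discard() is a no-op when the entry is already gone, so a retry after
    # a partial failure or a duplicate call does not raise or double-delete.
    external_resources.discard(name)
    return name not in external_resources
```

A retried call sees the resource already gone and simply reports completion. Anything that cannot be made a no-op on repeat, such as a one-shot notification, still needs deduplication on the receiving side.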


nikhilk commented on September 24, 2024

Thanks for the detail here as well. I appreciate the discussion; it helps me rethink the problem/use case and understand where this won't take the place of a de-duping message queue.

I found this to be a useful read (https://hackernoon.com/level-triggering-and-reconciliation-in-kubernetes-1f17fe30333d). Maybe there is something else that one ought to read through to internalize this more deeply. If so, do let me know.

Reading through #60, yes, I think it would help. As opened, #60 pertains to DecoratorController, but presumably the functionality applies to CompositeController too?


enisoc commented on September 24, 2024

That article is already the one I most often link to. :) There are also some new docs written by upstream k8s maintainers in progress here:

http://book.kubebuilder.io/basics/what_is_a_controller.html

Yes, the solution for #60 will also apply to CompositeController. I'll rename the issue to clarify.

