Comments (7)

rlguarino commented on September 24, 2024

That's an interesting idea. Can you give an example of what you would do with these? Witnessing a create or delete of a resource in the Kubernetes API through the interface that metacontroller uses is not reliable. The k8s client ecosystem is built around the paradigm of reconciling state, not processing events.

If metacontroller registered an admission controller, it would be possible to reliably send creation and deletion notifications as sync requests.

from metacontroller.

nikhilk commented on September 24, 2024

A concrete example is to create GCS buckets/objects, or entries in a db, or publishing to a pub/sub topic etc. etc. What is the alternative to handling existing systems that aren't and won't be in k8s?

I am not sure what you mean by the admission controller idea, but at a practical level, I want to write a single set of hooks constituting all the logic associated with my CRD/controller, and create a single metacontroller controller that is responsible for invoking those hooks. So yes, having metacontroller do the necessary additional plumbing would be helpful IMO.

enisoc commented on September 24, 2024

In the conventional Kubernetes controller model, it's considered an anti-pattern if the control code needs to be told whether an object is being added, or whether it already existed and is being updated. The idea is that controllers should examine the actual state of the world and act to bring it towards the desired state, regardless of what it may have already done before.

For example, suppose your code receives an "added" message for an object in your CRD, and creates a GCS bucket in response. Later, some other actor deletes the GCS bucket. Since the state of the world no longer matches the desired state, your controller ought to recreate the GCS bucket. However, if you only put the "create GCS bucket" code in your "object added" handler, you'll fail to fix the problem.

Instead, on each invocation, your code should assume nothing about what happened in the past. If the desired state is that a certain GCS bucket ought to exist, you should check whether it exists, and create it if necessary. This kind of continuous, level-triggered reconciliation is what makes Kubernetes resilient to complex and unforeseen combinations of system states, in which an event-triggered system would get stuck.
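
A minimal sketch of that check-then-act approach is shown here. The `FakeBucketStore` class and the `sync` function are hypothetical stand-ins invented for illustration; they are not the GCS client or the Metacontroller API.

```python
class FakeBucketStore:
    """In-memory stand-in for an external system such as GCS (hypothetical)."""

    def __init__(self):
        self.buckets = set()

    def exists(self, name):
        return name in self.buckets

    def create(self, name):
        self.buckets.add(name)


def sync(desired_bucket, store):
    """Level-triggered: compare desired state with actual state, then act."""
    if store.exists(desired_bucket):
        return "unchanged"
    store.create(desired_bucket)
    return "created"


store = FakeBucketStore()
first = sync("my-bucket", store)    # bucket is missing, so it gets created
second = sync("my-bucket", store)   # bucket is present, nothing to do
store.buckets.discard("my-bucket")  # some other actor deletes the bucket
third = sync("my-bucket", store)    # the next sync notices and recreates it
```

The same function handles the initial creation and the later repair, because it never relies on knowing which event led to the call.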

nikhilk commented on September 24, 2024

Don't disagree with the thinking - but wonder if in reality things are a bit more nuanced, and you need to break away knowingly.

For example, in the GCS example, you can theoretically create when needed, but what if in your system it matters when the object was created, i.e. the timestamp is interesting metadata. So [re-]creating the object in a future sync won't achieve the required semantics, and in fact in that particular scenario, it might be fine to deal with an inadvertent delete as an explicit/1st class error scenario.

For something a bit more realistic, what about situations where you don't even have the option to check/reconcile? Eg. sending a pub/sub notification when something happens, or to bring it a bit closer home (wrt metacontroller), what if you yourself have to invoke a user provided web hook (in an external system) on deletion?

As a workaround, I can build my own persistence layer that records whether the notification has been sent or not, and send it. However, will my hook even be triggered for a parent resource being deleted, so that my code can look up this controller-specific record of things?

I guess my meta-point here is that things are generally more varied, esp. in a world where the k8s system fits into a larger system, and there is a distinction between best practices/guidelines and hard limitations.

Thoughts?

enisoc commented on September 24, 2024

> what if in your system it matters when the object was created, i.e. the timestamp is interesting metadata. So [re-]creating the object in a future sync won't achieve the required semantics

Kubernetes controllers are inherently asynchronous, so there will always be some delta between when the user's "create CRD" request finished, and when you actually created the GCS bucket in response. If recreating "later" is undesired, that implies there is some threshold at which the timestamp delta is too big to be acceptable. Your sync logic would make this explicit:

  • If the bucket doesn't exist:
    • If creating it now will still satisfy the time delta requirement:
      • Create the bucket.
    • Else:
      • Treat it as a special error case. Either we were too slow, or it got deleted, but in any case our creation timestamp will be too far off at this point.
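
The decision list above could be sketched roughly as follows. The five-minute threshold, the `decide` function, and its return values are assumptions made up for illustration, not part of Metacontroller.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold: how late the bucket may be created and still count.
MAX_CREATION_DELAY = timedelta(minutes=5)


def decide(bucket_exists, requested_at, now, max_delay=MAX_CREATION_DELAY):
    """Return the action a sync pass should take for the bucket."""
    if bucket_exists:
        return "ok"
    if now - requested_at <= max_delay:
        return "create"
    # Either we were too slow or the bucket got deleted; surface it explicitly.
    return "error"


requested_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
on_time = decide(False, requested_at, requested_at + timedelta(minutes=1))
too_late = decide(False, requested_at, requested_at + timedelta(hours=1))
present = decide(True, requested_at, requested_at + timedelta(hours=1))
```

Passing `now` in as an argument keeps the decision a pure function of observed state, which makes it easy to test and safe to call on every sync.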

I understand this creation timestamp example wasn't meant to be fully representative of the set of scenarios you're thinking about, but my meta-point is that we've found most things in the world of infra can be expressed in a level-triggered formulation like this, and additionally that doing so provides significant benefits for overall system reliability (automatic self-healing) and debuggability (thinking in terms of independent agents with goals rather than a web of interconnected signal propagators).

I agree that there remain cases when it's necessary to break away from this pattern knowingly, but our experience so far has shown it's much less common than most people think. As a result, one of Metacontroller's design goals is to lay out rails that encourage level-triggered semantics as much as possible. I'm willing to consider event-triggered escape hatches, but I'd want to see strong evidence that a reasonably comprehensible level-triggered formulation of those use cases does not exist.

> what about situations where you don't even have the option to check/reconcile? Eg. sending a pub/sub notification when something happens

In general, the underlying watch stream that the Kubernetes API server provides is not a good fit for use cases that expect at-least- or at-most-once semantics; it doesn't even try to provide approximately-once. For example, at any time you might be forced to re-list the current state of objects, and have no way to find out what events happened before the point-in-time of that re-list. The API server is only designed to support eventually consistent, level-triggered systems that are fine with this.

Metacontroller could connect you up to the "add" event from this stream directly, but it wouldn't be a good idea to send a pub/sub event from that handler unless it does its own deduplication. The "add" handler would still have the potential to get called any number of times >=0 because of the nature of the underlying watch stream. If your pub/sub triggering has its own deduplication, you'd be better off sending it from the sync hook, which unlike "add" is guaranteed to get called at-least-once-eventually if the object survives long enough. If you want to optimize for lower load on the deduper, you could mark in your returned Status that you sent the event; you still might send multiple times, but there will be a reasonable cap.
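
As a rough sketch of that last suggestion, a sync hook could record in the returned status that the event was sent. The `eventSent` status field, the `publish` callback, and the message shape are hypothetical names chosen for illustration; only the general idea of receiving a `parent` and returning a `status` is taken from the discussion.

```python
def sync(parent, publish):
    """Send the notification at most once per observed status, then mark it."""
    status = dict(parent.get("status") or {})
    if not status.get("eventSent"):
        # A stable ID lets the downstream deduper drop repeated sends.
        publish({"id": parent["metadata"]["uid"], "kind": "created"})
        status["eventSent"] = True
    return {"status": status}


sent = []
parent = {"metadata": {"uid": "abc-123"}, "status": {}}

response = sync(parent, sent.append)   # first sync: event is published
sync(parent, sent.append)              # status not yet persisted: sent again
parent["status"] = response["status"]  # status update finally lands
sync(parent, sent.append)              # marker is visible: nothing is sent
```

The duplicate in the middle is the case the deduper still has to absorb; the status marker only bounds how often it happens.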

> what if you yourself have to invoke a user provided web hook (in an external system) on deletion
> [...]
> will my hook even be triggered for a parent resource being deleted

Currently, no, but this is planned: #60. In that case, though, the finalize hook you write must still be idempotent because it might be called and retried multiple times. Similar to the pub/sub example above, this won't help you if you require the semantics of a deduped message queue.
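
To illustrate what idempotent means here, the following sketch shows a finalize-style handler that can be retried safely. Since the hook is only planned at this point, everything in it is an assumption: the `finalize` signature, the `finalized` response field, and the `idempotencyKey` payload field are hypothetical.

```python
def finalize(parent, call_webhook):
    """Notify an external webhook about a deletion; safe to call repeatedly."""
    payload = {
        # The same key on every attempt lets the receiver ignore repeats.
        "idempotencyKey": "deleted:" + parent["metadata"]["uid"],
        "name": parent["metadata"]["name"],
    }
    try:
        call_webhook(payload)
    except ConnectionError:
        return {"finalized": False}  # not done yet, expect a retry
    return {"finalized": True}


calls = []


def flaky_webhook(payload):
    """Test double that fails on the first attempt only."""
    calls.append(payload)
    if len(calls) == 1:
        raise ConnectionError("simulated transient failure")


parent = {"metadata": {"uid": "abc-123", "name": "example"}}
first = finalize(parent, flaky_webhook)   # fails, so finalization is deferred
second = finalize(parent, flaky_webhook)  # retry succeeds with the same payload
```

The receiver still sees two calls, so it must tolerate repeats; the handler only guarantees that each attempt is identical.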

nikhilk commented on September 24, 2024

Thanks for the detail here as well. I appreciate the discussion; it helps me rethink the problem/use case and see where this won't take the place of a de-duping message queue.

I found this to be a useful read (https://hackernoon.com/level-triggering-and-reconciliation-in-kubernetes-1f17fe30333d). Maybe there is something else that one ought to read through to internalize this more deeply. If so, do let me know.

Reading through #60, yes, I think it would help. As opened, #60 pertains to DecoratorController, but presumably the functionality applies to CompositeController too?

enisoc commented on September 24, 2024

That article is already the one I most often link to. :) There are also some new docs written by upstream k8s maintainers in progress here:

http://book.kubebuilder.io/basics/what_is_a_controller.html

Yes, the solution for #60 will also apply to CompositeController. I'll rename the issue to clarify.