Rules need to be evaluated in topological order to respect data dependencies.

Topological sorting of rules (prometheus, CLOSED)

juliusv commented on April 30, 2024

Topological sorting of rules

from prometheus.

Comments (15)

brian-brazil commented on April 30, 2024

I suspect that this problem has corner cases that are unresolvable in the general case, such as when rules with the same name are applied to many variables, and thus that correctness can't be guaranteed. A mostly-working linter/sanity checker should be an achievable goal though.

juliusv commented on April 30, 2024

Yeah, you can create two rules with the same name, but you usually wouldn't encourage that, right? Maybe we should even forbid that? Or are there good use cases for this?

By the way, I was not thinking about a linter, but about Prometheus automatically sorting rules topologically according to their dependencies on other rules.

brian-brazil commented on April 30, 2024

> Yeah, you can create two rules with the same name, but you usually wouldn't encourage that, right? Maybe we should even forbid that?

I think it's an option to forbid it depending on what exact naming and rule conventions we decide on.
Even if we could make that work in theory, including all the knock-on effects, in practice I expect many users won't follow the conventions and/or will end up with clashes on metric names. I'm willing to bet we'll have clashes just within the "official" client libraries for something like GC, before you even get to applications themselves.

There are also cases like rules that depend on themselves.

stapelberg commented on April 30, 2024

I think as a first step it’d be a good idea to evaluate rules in the order in which they were defined in the config file.

fabxc commented on April 30, 2024

Currently we are reading the files and their rules in order. On each evaluation iteration, we evaluate concurrently, though, because naive sequential evaluation can take longer than the evaluation interval. In that case iterations are missed.

Side note: if a single query in the concurrent evaluation takes longer than the interval, a whole iteration will be missed for all rules. We should not wait for all rules in an iteration to finish; we should skip iterations only for the slow queries.
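
A minimal sketch of the per-rule skipping suggested in this side note, written as a deterministic simulation rather than real concurrent code. The function name `simulate`, the rule names, and the durations are hypothetical and not taken from Prometheus; the assumption is that a rule may start at a tick only if its previous evaluation has finished.

```python
def simulate(durations, interval, ticks):
    """Simulate per-rule skipping of evaluation iterations.

    durations: dict of rule name -> how long one evaluation takes (hypothetical values).
    interval:  time between evaluation ticks.
    ticks:     number of ticks to simulate.
    Returns (started, skipped), each a list of (tick, rule) pairs.
    """
    busy_until = {rule: 0.0 for rule in durations}
    started, skipped = [], []
    for tick in range(ticks):
        now = tick * interval
        for rule, duration in durations.items():
            if busy_until[rule] <= now:
                # The previous evaluation of this rule has finished: start a new one.
                busy_until[rule] = now + duration
                started.append((tick, rule))
            else:
                # Only this rule misses the iteration; the others are unaffected.
                skipped.append((tick, rule))
    return started, skipped


started, skipped = simulate({"fast": 1, "slow": 25}, interval=10, ticks=4)
```

In this run only `slow` misses ticks 1 and 2, while `fast` is evaluated on every tick.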

brian-brazil commented on April 30, 2024

> I think as a first step it’d be a good idea to evaluate rules in the order in which they were defined in the config file.

I've proposed that previously; some users are already at a point where that doesn't work, for the reason @fabxc explains.

The last time we talked about this, the plan was to add syntax to let you define groups of rules, each of which would be run in order. Different groups could then run at the same time. This could also be used to enable things like allowing rules to access remote storage (which you wouldn't want enabled by default).
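
The grouping idea could look roughly like the following sketch: rules inside a group run in the order written, and separate groups run concurrently. This is an illustration only; `evaluate_group`, `evaluate_all`, and the `evaluate` callback are hypothetical names, and nothing here reflects actual Prometheus syntax or code.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_group(rules, evaluate):
    # Rules inside one group run strictly in the order they were written,
    # so a later rule can read what an earlier rule in the same group produced.
    return [evaluate(rule) for rule in rules]


def evaluate_all(groups, evaluate):
    """groups: dict of group name -> ordered list of rules (hypothetical layout)."""
    with ThreadPoolExecutor(max_workers=max(1, len(groups))) as pool:
        # Each group becomes one task, so groups do not block each other.
        futures = {
            name: pool.submit(evaluate_group, rules, evaluate)
            for name, rules in groups.items()
        }
        return {name: future.result() for name, future in futures.items()}


results = evaluate_all(
    {"api": ["a1", "a2"], "db": ["d1"]},
    evaluate=lambda rule: rule.upper(),  # stand-in for real rule evaluation
)
```

A slow group only delays its own rules, which is the isolation between jobs that the grouping proposal is after.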

fabxc commented on April 30, 2024

Mh, it seems like we can solve this internally. Not sure whether taking the issue to the user is the best approach here.

Building a dependency graph will also be more optimal than one level of logical grouping by the user.
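
For illustration, a dependency-graph ordering can be sketched with Kahn's algorithm. The assumptions here are loud ones: each rule is reduced to the set of metric names its expression reads, rule names are unique, and a rule that reads its own output is treated as reading the previous iteration's value (so it adds no ordering constraint). `topo_order` and the example rule names are hypothetical.

```python
from collections import deque


def topo_order(rules):
    """rules: dict of rule (output metric) name -> set of metric names it reads.

    Returns rule names ordered so that each rule comes after the rules it reads.
    Raises ValueError if the rules depend on each other in a cycle.
    """
    # Keep only dependencies on other rules; raw scraped metrics impose no order.
    deps = {
        name: {d for d in reads if d in rules and d != name}
        for name, reads in rules.items()
    }
    dependents = {name: [] for name in rules}
    for name, ds in deps.items():
        for d in ds:
            dependents[d].append(name)

    remaining = {name: len(ds) for name, ds in deps.items()}
    ready = deque(sorted(n for n, count in remaining.items() if count == 0))
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for dependent in sorted(dependents[name]):
            remaining[dependent] -= 1
            if remaining[dependent] == 0:
                ready.append(dependent)

    if len(order) != len(rules):
        stuck = ", ".join(sorted(set(rules) - set(order)))
        raise ValueError("dependency cycle among rules: " + stuck)
    return order


order = topo_order({
    "job:rate5m": {"http_requests_total"},
    "job:ratio": {"job:rate5m", "job:errors5m"},
    "job:errors5m": {"http_errors_total"},
})
```

Rules that become ready at the same step are independent of each other, so a scheduler could evaluate them concurrently.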

brian-brazil commented on April 30, 2024

> Building a dependency graph will also be more optimal than one level of logical grouping by the user.

This is only one aspect of the problem. What we also need to do is spread out the load of the rule evaluations and allow for cases where there are loops or things we can't determine about a dependency graph.

I think ordered groups of rules, with some micro-optimisation where it's safe, is the route to take.

fabxc commented on April 30, 2024

I would think that cyclic dependencies should be an error when reading the rules. Or are there reasonable cases where they should be allowed?
With that, evaluating one chain of rules without blocking independent ones is not an issue.
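
A load-time cycle check of the kind suggested here could be sketched as a depth-first search that reports the first cycle it finds. This assumes the dependencies between rules are already known exactly, which later comments in the thread point out is not always possible; `find_cycle` is a hypothetical helper.

```python
def find_cycle(deps):
    """deps: dict of rule name -> iterable of rule names it depends on.

    Returns a list of rule names forming a cycle (first name repeated at the
    end), or None if the rules are acyclic. A rule depending on itself counts
    as a cycle.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {name: WHITE for name in deps}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in deps[node]:
            if nxt not in color:
                continue  # not a rule (e.g. a scraped metric), no ordering needed
            if color[nxt] == GREY:
                # Found a back edge: the cycle is the path from nxt to here.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for name in deps:
        if color[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


cycle = find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]})
```

A loader could reject the rule file with the returned path in the error message, or fall back to config-file order for just the rules on the cycle.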

brian-brazil commented on April 30, 2024

> I would think that cyclic dependencies error when reading the rules. Or are there reasonable cases where it should be allowed?

There are advanced use cases where it comes up. It's also possible that there are rules that aren't actually cyclic, but where we can't determine that, e.g. due to the use of regexes. You also wouldn't want to couple rule evaluation between jobs, as that could lead to a problem in one job taking out evaluation in another.

fabxc commented on April 30, 2024

Can you elaborate on the "use of regexes" part? I know that there were plans to have relabel() in the QL at some point - are you referring to that?

juliusv commented on April 30, 2024

@fabxc For example, if someone uses a regex matcher on the metric name in the rule expression, you can no longer analyze whether the resulting metric name(s) would need some rule to be executed first. The same goes for other labels (you could have parts of one metric being selected in one rule, and other parts in another rule).
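
To make the difficulty concrete, here is a sketch of one conservative fallback, not something proposed in the thread: treat a regex matcher on the metric name as a possible dependency on every rule whose name it matches. The `(operator, value)` selector representation and `possible_deps` are hypothetical, Python's `re` module stands in for the regex dialect Prometheus uses, and matching is assumed to be fully anchored. Matchers on other labels are not modelled at all.

```python
import re


def possible_deps(name_matchers, rule_names):
    """Over-approximate which rules an expression might read.

    name_matchers: list of (operator, value) matchers on the metric name,
                   with operator one of "=", "=~", "!=", "!~" (hypothetical model).
    rule_names:    set of names produced by recording rules.
    """
    deps = set()
    for op, value in name_matchers:
        if op == "=":
            # Exact name: a dependency only if some rule produces that name.
            if value in rule_names:
                deps.add(value)
        elif op == "=~":
            # Regex: every rule whose name matches might be read.
            pattern = re.compile(value)
            deps.update(n for n in rule_names if pattern.fullmatch(n))
        else:
            # Negative matchers can select almost anything; assume all rules.
            deps.update(rule_names)
    return deps


rule_names = {"job:rate5m", "job:errors5m", "instance:cpu"}
deps = possible_deps([("=~", "job:.*")], rule_names)
```

Because this over-approximates, it can introduce edges and even cycles that do not exist in practice, which is the kind of undecidable case described above.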

fabxc commented on April 30, 2024

Ah, those regexes – of course, thanks!

juliusv commented on April 30, 2024

Superseded by #1095.

lock commented on April 30, 2024

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
