Comments (6)

julianbrost avatar julianbrost commented on May 26, 2024

ref/IP/52794

from icinga2.

Al2Klimov avatar Al2Klimov commented on May 26, 2024

Actually, a single recovery outside the notification time period is enough for the recovery notification never to be sent. I tested the master branch with Icinga DB and no special load, by the way.

Al2Klimov avatar Al2Klimov commented on May 26, 2024

If a recovery happens outside the notification time period, NotificationRecovery is added to Notification#suppressed_notifications so that FireSuppressedNotifications() can pick it up once we are inside the time period again. But this function clears such bits unless Checkable#NotificationReasonApplies() returns true for them. One may think: checkable still OK = recovery still applies. But for NotificationRecovery, NotificationReasonApplies() also compares the current state to Checkable#state_before_suppression, which is OK by default. In a nutshell:

  • We're outside the notification time period
    • A recovery happens
      • NotificationRecovery is added to Notification#suppressed_notifications 👍
    • FireSuppressedNotifications() runs (every 5s)
      • Checks Notification#suppressed_notifications bits' validity
        • Consults Checkable#NotificationReasonApplies() for NotificationRecovery
          • ⚠️Latter says: Yes, we're still OK, but Checkable#state_before_suppression (completely unrelated here) is also OK (its default), so I think we just returned to our OK – so no, NotificationRecovery doesn't apply anymore
        • NotificationRecovery is cleared from Notification#suppressed_notifications and lost 👎
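The sequence above can be reproduced with a tiny stand-alone model. This is only a sketch, not Icinga 2 code: CheckableModel, NotificationModel, kRecovery and RecoveryBitSurvives() are hypothetical names, the bit value is arbitrary, and the real FireSuppressedNotifications() does considerably more than the pruning step shown here.

```cpp
#include <cstdint>

// Hypothetical stand-in for the NotificationRecovery bit (value is arbitrary).
constexpr std::uint32_t kRecovery = 1u << 1;

enum class State { OK, Warning, Critical };

// Simplified model of the Checkable attributes involved.
struct CheckableModel {
    State lastCheckState = State::OK;
    // Never touched in this scenario, so it keeps its default (OK).
    State stateBeforeSuppression = State::OK;

    // Mirrors the NotificationRecovery branch quoted in the diff below.
    bool RecoveryReasonApplies() const {
        return lastCheckState == State::OK
            && lastCheckState != stateBeforeSuppression;
    }
};

// Simplified model of a Notification object and its suppressed bits.
struct NotificationModel {
    std::uint32_t suppressed = 0;

    // Models only the validity check done on each periodic run.
    void PruneSuppressed(const CheckableModel& checkable) {
        if ((suppressed & kRecovery) && !checkable.RecoveryReasonApplies())
            suppressed &= ~kRecovery;
    }
};

// Walks through the scenario: problem, then recovery outside the period.
bool RecoveryBitSurvives() {
    CheckableModel checkable;
    checkable.lastCheckState = State::Critical; // problem
    checkable.lastCheckState = State::OK;       // recovery outside the period

    NotificationModel notification;
    notification.suppressed |= kRecovery;       // remembered for later

    notification.PruneSuppressed(checkable);    // periodic validity check
    return (notification.suppressed & kRecovery) != 0;
}
```

In this model RecoveryBitSurvives() returns false: the bit is dropped on the first pruning run, before the time period can begin again.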

As I consider Checkable#state_before_suppression unrelated here, I felt free to comment out 39cee35 and it worked:

--- lib/icinga/checkable-notification.cpp
+++ lib/icinga/checkable-notification.cpp
@@ -271,7 +271,7 @@ bool Checkable::NotificationReasonApplies(NotificationType type)
                case NotificationRecovery:
                        {
                                auto cr (GetLastCheckResult());
-                               return cr && IsStateOK(cr->GetState()) && cr->GetState() != GetStateBeforeSuppression();
+                               return cr && IsStateOK(cr->GetState());// && cr->GetState() != GetStateBeforeSuppression();
                        }
                case NotificationFlappingStart:
                        return IsFlapping();

@julianbrost Any opinion on this (as the commit author) before anyone codes anything?

julianbrost avatar julianbrost commented on May 26, 2024

Have you checked whether this test still passes with that change? I have the feeling that this might result in extra recovery notifications after downtimes when no problem notification was sent.

Al2Klimov avatar Al2Klimov commented on May 26, 2024

That's just a PoC. My actual suggestion is not to consult GetStateBeforeSuppression() unconditionally in NotificationReasonApplies(), but only if GetSuppressedNotifications() contains NotificationRecovery or NotificationProblem. Because only then GetStateBeforeSuppression() matters IMAO:

const int stateNotifications = NotificationRecovery | NotificationProblem;
if (!(suppressed_types_before & stateNotifications) && (suppressed_types & stateNotifications)) {
        /* A state-related notification is suppressed for the first time, store the previous state. When
         * notifications are no longer suppressed, this can be compared with the current state to determine
         * if a notification must be sent. This is done differently compared to flapping notifications just above
         * as for state notifications, problem and recovery don't always cancel each other. For example,
         * WARNING -> OK -> CRITICAL generates both types once, but there should still be a notification.
         */
        SetStateBeforeSuppression(old_stateType == StateTypeHard ? old_state : ServiceOK);
}
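A rough sketch of what that suggestion could look like, again as a stand-alone model rather than a patch. CheckableModel, kProblem, kRecovery and the suppressed member are hypothetical stand-ins for the Checkable attributes and GetSuppressedNotifications(); the bit values are arbitrary.

```cpp
#include <cstdint>

// Hypothetical stand-ins for NotificationProblem / NotificationRecovery.
constexpr std::uint32_t kProblem  = 1u << 0;
constexpr std::uint32_t kRecovery = 1u << 1;

enum class State { OK, Warning, Critical };

struct CheckableModel {
    State lastCheckState = State::OK;
    State stateBeforeSuppression = State::OK;
    // Types suppressed on the Checkable level (e.g. during a downtime).
    std::uint32_t suppressed = 0;

    bool RecoveryReasonApplies() const {
        if (lastCheckState != State::OK)
            return false;

        // Only trust stateBeforeSuppression if a state notification was
        // actually suppressed on the Checkable level, i.e. if it was set.
        if (suppressed & (kProblem | kRecovery))
            return lastCheckState != stateBeforeSuppression;

        return true;
    }
};
```

With nothing suppressed on the Checkable level (the time period case), the model reports that the recovery still applies. With both bits suppressed and stateBeforeSuppression equal to OK (problem and recovery inside one downtime), it does not, which is the case the existing comparison is meant to cover.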

julianbrost avatar julianbrost commented on May 26, 2024

Okay, now I think I got it. So Checkable::NotificationReasonApplies() is used in two places: when sending notifications after the suppression reason is gone on the Checkable level (when a downtime ends, for example) and on the Notification level (when the time period of a Notification object begins, for example). It checks Checkable::GetStateBeforeSuppression() in both cases, even though that value was only set to something useful in the first case.

Al2Klimov wrote: "My actual suggestion is not to consult GetStateBeforeSuppression() unconditionally in NotificationReasonApplies(), but only if GetSuppressedNotifications() contains NotificationRecovery or NotificationProblem. Because only then GetStateBeforeSuppression() matters IMAO:"

So yes, this could work. Maybe moving the check of Checkable::GetStateBeforeSuppression() out of Checkable::NotificationReasonApplies() to where it's actually necessary (i.e. suppressed notifications on the Checkable level) might be an option as well.
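A minimal sketch of that second option, assuming the generic check only looks at the current state and the Checkable-level caller adds the comparison itself. Both function names and CheckableModel are hypothetical; this is not how the code is currently structured.

```cpp
enum class State { OK, Warning, Critical };

// Simplified model of the Checkable attributes involved.
struct CheckableModel {
    State lastCheckState = State::OK;
    State stateBeforeSuppression = State::OK;
};

// Generic check, usable from the Notification level (time periods):
// a recovery still applies as long as the checkable is OK.
bool RecoveryReasonApplies(const CheckableModel& checkable) {
    return checkable.lastCheckState == State::OK;
}

// Checkable-level caller (e.g. after a downtime ends): additionally
// requires that the state differs from the one stored when suppression
// began, so OK -> CRITICAL -> OK within one downtime stays silent.
bool CheckableLevelRecoveryApplies(const CheckableModel& checkable) {
    return RecoveryReasonApplies(checkable)
        && checkable.lastCheckState != checkable.stateBeforeSuppression;
}
```

The trade-off compared to the conditional variant is that the generic function stays free of Checkable-level bookkeeping, at the cost of every Checkable-level caller having to remember the extra comparison.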
