-
Hello, I want to receive an alert if a condition remains true for a certain amount of time, and suppress the alert entirely if the condition clears within that time. For example, it's normal for the 15m load average to exceed 4 for a while when a box is doing a burst of activity, so I only want to be alerted if it remains stuck in this state for a long time. I tried "delay: up 30m ...", like this:
(This was edited using
However, that doesn't seem to do what I expected: I still receive the alert immediately. For example, today the 15m load average graph (blue) went above 4 at 07:12 and I got an alert e-mail immediately:
The graph dropped below 4 at 07:39, and the clear e-mail arrived at 07:57.
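To make the expectation concrete, here is a minimal Python sketch of the hold-off behaviour I assumed "delay: up 30m" would give. It is not Netdata code: the helper `expected_to_notify` is a hypothetical name, and the rule it encodes (notify only if the condition stays raised for at least the delay) is my assumption about the intended semantics.

```python
from datetime import datetime, timedelta


def expected_to_notify(raised_at, cleared_at, up_delay):
    """Hypothetical rule: notify only if the condition stayed raised
    for at least up_delay before it cleared."""
    return cleared_at - raised_at >= up_delay


# Times taken from the episode above; the date part is irrelevant here.
raised = datetime.strptime("07:12", "%H:%M")
cleared = datetime.strptime("07:39", "%H:%M")

print(cleared - raised)  # 0:27:00
print(expected_to_notify(raised, cleared, timedelta(minutes=30)))  # False
```

The condition held for 27 minutes, which is shorter than 30m, so under this assumption I would have expected no alert e-mail at all for this episode.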
I suppose I could achieve what I want like this:
That is: if all the samples within the time window are above 4, then alert. But I'd still like to know what "delay: up 30m" is intended for, and why it didn't do what I expected. The documentation says:
The status did go from clear to warning, and yet with
-
Hi @candlerb, you are not wrong; the delay should work as you described. We'll do some tests to verify. In the meantime, though, could you update to the latest nightly version? I'm not aware that we had such an issue back then, but it's always better to test on the latest. Btw, could you double check in
-
Aside: I have another similar machine. It's running v1.42.1 and only a single copy is running. But when I try to update it:
It was like this a couple of hours ago. I'll re-check tomorrow.
-
In the meantime, I did some tests with the
The alert was raised at 16:42:33:
And I got the notification at 16:47:
Upon clearing, the notification was sent 3 minutes later. I raised and cleared this alert a couple more times, and it appears to behave correctly here, but let's still check the case!
Could you please check that you don't have 2 netdata instances running? The loaded alert seems to come from a very old load.conf config. Please also try this: stop netdata altogether, then re-create load.conf using the edit-config script.