Counters not resetting but alerts still fire after 'unknown'

myxiplx
Posts: 21
Joined: Tue Apr 13, 2004 1:53 am

Counters not resetting but alerts still fire after 'unknown'

Post by myxiplx »

Hi,

We're getting odd behaviour when tests return an 'unknown' status. Is there any way we can get around this?

We have a standard alert profile for nearly all of our tests. It sends a simple e-mail or text message the first time the status changes from good to bad and vice versa. Both the good and bad alerts are set to fire after 1 consecutive result and to occur once only. The expected behaviour is for IT to be informed of any problem, and also to be informed when the problem is resolved.

However, what we've just found is that when a test returns 'unknown' and the next result is good, our good alert fires even though no 'bad' result has ever occurred, and the recurrences count in HostMonitor still says we've had thousands of good events.

This creates orphaned good alerts, as we don't expect to see these unless we've first had a bad alert. But it was the count that was really confusing: it seems to be ignoring the 'unknown' results. We had a 'status ok' message this morning for a test that the recurrence count says has been fine for over two days. It was only by going through the HostMonitor log that we could work out what happened.

I think I have a few questions off the back of this:
1. Can the recurrence count be reset after unknown results?
2. Can we configure alerts so that they only fire when the status changes from bad to good, not from unknown to good?
3. Is there any way to configure alerts to ignore one or two 'unknown' results, but fire an alert after a 3rd?
4. Is there any way to configure the good alerts similarly?

That last point may need more explanation. Can we configure HostMonitor so that if the status goes good-unknown-good no alert fires, yet if the status goes good-unknown-bad an alert fires?

Also, if the status goes good-unknown(x3) can we configure an alert to fire on the 3rd unknown? Can we also configure an alert to fire when the status goes back to good after that?

And if we can do all that, can the recurrence count also work as intelligently?

Thinking as I go, I'm wondering if all this could be achieved by a new test setting 'ignore x unknown events'?

thanks,

Ross
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Re: Counters not resetting but alerts still fire after 'unkn

Post by KS-Soft Europe »

myxiplx wrote:I think I have a few questions off the back of this:
1. Can the recurrence count be reset after unknown results?
Yes. You should disable the "Treat Unknown status as Bad" option for the test. With this option enabled, if test results cannot be obtained, HostMonitor triggers actions the same way as if the test had returned a "Bad" status.
myxiplx wrote:2. Can we configure alerts so that they only fire when the status changes from bad to good, not from unknown to good?
Yes. You have to disable the "Treat Unknown status as Bad" option for the test and use the "Action depends on "bad" one" option for the "Good" action.
Quote from the manual:
http://www.ks-soft.net/hostmon.eng/mfra ... pendsOnBad
==================================
This optional parameter is available for "Good" actions only. You can set "Good" action dependable on a "Bad" action. Why do you need it? For example you defined "Bad" action to send an e-mail notification to the network administrator when test fails 3 times consecutively (start when 3 consecutive "Bad" results occur), also you defined «Good» action to send a notification when the test status changes to "Good". What will happen if test fails 1 or 2 times and after this it restores "Good" status? HostMonitor will not send a notification about failure (because test did not fail 3 times) but the program will send notification about restoring "Good" status. To avoid unnecessary "Good" action execution you can mark "Action depends on "bad" one" option and select "Bad" action. In this case HostMonitor will start "Good" action only if corresponding "Bad" action was executed.
==================================
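The dependency described in the manual excerpt can be illustrated with a small simulation. This is only a sketch of the described behaviour, not HostMonitor code: the function name `run_profile`, the lowercase status strings and the `bad_after` parameter are hypothetical, and only "good" and "bad" results are modelled.

```python
def run_profile(statuses, bad_after=3, good_depends_on_bad=True):
    """Return (index, action) pairs for the notifications that would be sent.

    Hypothetical model: the "Bad" action starts after `bad_after` consecutive
    bad results; the "Good" action runs on a bad-to-good change, optionally
    only if the "Bad" action was executed first.
    """
    fired = []
    bad_streak = 0
    bad_fired = False  # did the "Bad" action run during the current failure?
    prev = None
    for i, status in enumerate(statuses):
        if status == "bad":
            bad_streak += 1
            if bad_streak == bad_after:
                fired.append((i, "bad"))
                bad_fired = True
        elif status == "good":
            if prev == "bad" and (bad_fired or not good_depends_on_bad):
                fired.append((i, "good"))
            bad_streak = 0
            bad_fired = False
        prev = status
    return fired


short_failure = ["good", "bad", "bad", "good"]
print(run_profile(short_failure, good_depends_on_bad=False))  # orphaned "good"
print(run_profile(short_failure, good_depends_on_bad=True))   # nothing sent
```

With the dependency switched on, the two-result failure produces no notifications at all, which is the case the manual describes.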
myxiplx wrote:3. Is there any way to configure alerts to ignore one or two 'unknown' results, but fire an alert after a 3rd?
Yes. You have to add an "advanced mode" action: http://www.ks-soft.net/hostmon.eng/mfra ... ncedaction
For instance, you may use the following expression: ('%SimpleStatus%'=='UNKNOWN') and (%Recurrences%==3)
myxiplx wrote:4. Is there any way to configure the good alerts similarly?
Use the "Action depends on "bad" one" option.
myxiplx wrote:That last point may need more explanation. Can we configure HostMonitor so that if the status goes good-unknown-good no alert fires, yet if the status goes good-unknown-bad an alert fires?

Also, if the status goes good-unknown(x3) can we configure an alert to fire on the 3rd unknown? Can we also configure an alert to fire when the status goes back to good after that?

And if we can do all that, can the recurrence count also work as intelligently?
So, to achieve the foregoing goals, your profile should contain:
1. One standard "Bad" action
2. One standard "Good" action with the "Action depends on "bad" one" option enabled.
3. One advanced action that is performed after the 3rd "Unknown" status: ('%SimpleStatus%'=='UNKNOWN') and (%Recurrences%==3)

Please note: to make it work, the "Treat Unknown status as Bad" option should be disabled for the test. With this option disabled, the recurrence counters should work as you expect.
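To see when the advanced expression would match, it can help to step through a sequence of results by hand. The sketch below does that in Python, assuming that %Recurrences% counts consecutive results with the same status and restarts at 1 on a change; the function name `third_unknown_alerts` is hypothetical and this is not how HostMonitor itself evaluates expressions.

```python
def third_unknown_alerts(statuses):
    """Indexes at which ('%SimpleStatus%'=='UNKNOWN') and (%Recurrences%==3) holds.

    Assumption: the counter restarts at 1 whenever the status changes.
    """
    alerts = []
    count, prev = 0, None
    for i, status in enumerate(statuses):
        count = count + 1 if status == prev else 1
        if status == "UNKNOWN" and count == 3:
            alerts.append(i)
        prev = status
    return alerts


# Two unknowns followed by a good result: nothing fires.
print(third_unknown_alerts(["UP", "UNKNOWN", "UNKNOWN", "UP"]))
# Four unknowns in a row: the action fires once, on the third.
print(third_unknown_alerts(["UP", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN"]))
```

Because the comparison is `==3` rather than `>=3`, the action matches only once per run of unknowns in this model.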
myxiplx wrote:Thinking as I go, I'm wondering if all this could be achieved by a new test setting 'ignore x unknown events'?
I do not think we need such a setting, because the desired behavior can be configured using existing settings.

Regards,
Max
myxiplx
Posts: 21
Joined: Tue Apr 13, 2004 1:53 am

Re: Counters not resetting but alerts still fire after 'unkn

Post by myxiplx »

KS-Soft Europe wrote: You have to disable the "Treat Unknown status as Bad" option for the test and use the "Action depends on "bad" one" option for the "Good" action.
Thanks for that. We had already disabled 'Treat Unknown status as Bad', but I'd missed the 'action depends on' setting on some of our alerts. Even though I've been using HostMonitor for years, from time to time it still catches me out. It always amazes me just how flexible it is.

Reading through your reply, I think I can do what I want quite simply. If I double-check all our current tests to make sure they ignore unknowns, it's just a case of updating our standard alert profile to have:

1a.  Bad status action sending an e-mail
1b.  A good action to send an e-mail that depends on 1a. before it fires
2a.  A Bad status action to send an e-mail after 3 unknowns
2b.  A good status action depending on 2a.
And for tests that need it we also have:

3a.  A bad status action that sends an SMS
3b.  A good status action to send an SMS that depends on 3a.
Which I think solves every question I asked.

Once again, many thanks for the help :D.
myxiplx
Posts: 21
Joined: Tue Apr 13, 2004 1:53 am

Post by myxiplx »

Just one quick extra question:
('%SimpleStatus%'=='UNKNOWN') and (%Recurrences%==3)
Does this Recurrences value count unknowns when the 'Treat unknown as bad' setting is unchecked? Our recurrence counts on the main test list don't seem to include unknown results? I'm just a bit concerned about whether that recurrence test will work and not sure how I could test it.
myxiplx
Posts: 21
Joined: Tue Apr 13, 2004 1:53 am

Post by myxiplx »

It doesn't seem to be possible to set the alert 2b. above. When I go to select "Action depends on bad one", only standard alerts are listed, the new advanced alert isn't available to select.
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

myxiplx wrote:Just one quick extra question:
('%SimpleStatus%'=='UNKNOWN') and (%Recurrences%==3)
Does this Recurrences value count unknowns when the 'Treat unknown as bad' setting is unchecked?
Yes, it counts unknowns.
myxiplx wrote:Our recurrence counts on the main test list don't seem to include unknown results? I'm just a bit concerned about whether that recurrence test will work and not sure how I could test it.
When the 'Treat unknown as bad' option is disabled, the counter restarts from 1 when the test changes status from "Bad" to "Unknown" (or from "Unknown" to "Bad"). If the option is enabled, it continues counting from the previous recurrence (e.g. if there were 3 "Bad" statuses and the status changes to "Unknown", HostMonitor sets the "Recurrences" field to 4).
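The counting rule for "Bad" and "Unknown" results can be written out as a short sketch. The function name `recurrences` and the lowercase status strings are hypothetical, and the model is only meant for the Bad/Unknown transitions described above; how other transitions are counted is not stated here, so treating them as a restart is an assumption.

```python
def recurrences(statuses, treat_unknown_as_bad):
    """Return the modelled Recurrences value after each result.

    Assumption: the counter restarts at 1 whenever the (effective) status
    changes. With treat_unknown_as_bad, "unknown" is counted as "bad".
    """
    counts = []
    prev_key, count = None, 0
    for status in statuses:
        if status == "unknown" and treat_unknown_as_bad:
            key = "bad"
        else:
            key = status
        count = count + 1 if key == prev_key else 1
        counts.append(count)
        prev_key = key
    return counts


results = ["bad", "bad", "bad", "unknown"]
print(recurrences(results, treat_unknown_as_bad=True))   # counter carries on
print(recurrences(results, treat_unknown_as_bad=False))  # counter restarts
```

The same input gives a final value of 4 with the option enabled and 1 with it disabled, matching the example of 3 "Bad" statuses followed by an "Unknown".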

Regards,
Max
myxiplx
Posts: 21
Joined: Tue Apr 13, 2004 1:53 am

Post by myxiplx »

KS-Soft Europe wrote:When the 'Treat unknown as bad' option is disabled, the counter restarts from 1 when the test changes status from "Bad" to "Unknown" (or from "Unknown" to "Bad"). If the option is enabled, it continues counting from the previous recurrence (e.g. if there were 3 "Bad" statuses and the status changes to "Unknown", HostMonitor sets the "Recurrences" field to 4).
Sweeeeet! I love the way you guys have this program working :)
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

myxiplx wrote:It doesn't seem to be possible to set the alert 2b. above. When I go to select "Action depends on bad one", only standard alerts are listed, the new advanced alert isn't available to select.
You are right. The "Action depends on bad one" option works with "standard" actions only. Here you should use an "advanced" action. The following expression should work: ('%SimpleStatus%'=='UP') and ('%LastSimpleStatus%'=='UNKNOWN') and (%Recurrences%==1)
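A quick way to reason about this expression is to replay a few status sequences against it. The Python sketch below does so under two assumptions: %LastSimpleStatus% is the status of the previous result, and %Recurrences% restarts at 1 on a status change. The function name `recovery_after_unknown` is hypothetical.

```python
def recovery_after_unknown(statuses):
    """Indexes at which the recovery expression would match.

    Models: ('%SimpleStatus%'=='UP') and ('%LastSimpleStatus%'=='UNKNOWN')
            and (%Recurrences%==1)
    Assumption: the previous result supplies %LastSimpleStatus%.
    """
    alerts = []
    count, prev = 0, None
    for i, status in enumerate(statuses):
        count = count + 1 if status == prev else 1
        if status == "UP" and prev == "UNKNOWN" and count == 1:
            alerts.append(i)
        prev = status
    return alerts


print(recovery_after_unknown(["UP", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UP", "UP"]))
print(recovery_after_unknown(["UP", "UNKNOWN", "UP"]))
```

In this model the expression matches on the first good result after any run of unknowns, including a single one, so it may be worth checking in the HostMonitor log whether that is acceptable for the profile.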

Regards,
Max