What about "severity level"?

JuergenF
Posts: 331
Joined: Sun Jan 26, 2003 6:00 pm
Location: Germany, North Rhine-Westphalia

Re: Warning status...

Post by JuergenF »

FLynch wrote: Please please do not lose the core notion and functionality of 'warning' status.

Most monitoring systems have granularity in their status levels - 'warning' brings your attention to a potential problem before it becomes critical.

Very straightforward and will be a significant step forward for AHM.
Exactly what I think

Here are some of my thoughts http://www.ks-soft.net/cgi-bin/phpBB/vi ... 4146#14146
JuergenF
Posts: 331
Joined: Sun Jan 26, 2003 6:00 pm
Location: Germany, North Rhine-Westphalia

Post by JuergenF »

AntonyP wrote: I believe that the warning status should not be added, simply because it would be easier to set the alarm trigger at an earlier stage.

E.g. on a disk usage test
hard disk has 2 GB free space
I set 1 GB free space alarm for HM

Now, what would be the meaning of having the warning status at 1.2 GB? I can set the alarm at 1.2 GB instead...
The difference is getting a phone call early in the morning for all red statuses.
So there is a need to separate warnings from real problems.
Warning means you have to do something to avoid a real problem - but not immediately.
genasea
Posts: 28
Joined: Wed Sep 25, 2002 6:00 pm

Similar request

Post by genasea »

Alex,

I posed this question last year, and it is similar to the requests being made here. The vast majority of the tests that we have are performance-based tests (not fault-based tests). It would be very handy to be able to qualify when these types of tests actually go into alarm or are marked as 'Bad'. As some of the other users on this forum have pointed out, many of our IT staff are no longer looking at the alarms, since there are so many performance-based tests in alarm at any given time. We have about 15 alarms at any given time, mostly performance-based alarms (CPU, disk time, page faults, etc.). Most are from different servers each minute (so the test is only in alarm during a single test interval), and they typically average only a few faults a day.

My thought is to add a feature where the person setting up the tests could decide how many consecutive test results would need to be 'bad' before the test is set to bad, while keeping the same test interval. For example, if a server's CPU was at 100% for 10 minutes in a row (that is, 10 test cycles at 60-second intervals), then set the test to 'bad', rather than going bad the first time it gets a 100% test result.
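
The proposal above can be sketched in a few lines of Python. This is only an illustration of the idea, not anything HostMonitor provides: the class name `ConsecutiveBadGate` and its `required` parameter are made up for the example, and "bad sample" is assumed to mean CPU at 100%.

```python
class ConsecutiveBadGate:
    """Report 'bad' only after `required` consecutive failing samples."""

    def __init__(self, required: int) -> None:
        if required < 1:
            raise ValueError("required must be at least 1")
        self.required = required
        self.streak = 0  # current run of consecutive failing samples

    def update(self, sample_is_bad: bool) -> str:
        # A single good sample resets the streak.
        self.streak = self.streak + 1 if sample_is_bad else 0
        return "bad" if self.streak >= self.required else "good"


# One CPU sample per 60-second test interval (hypothetical data).
gate = ConsecutiveBadGate(required=10)
cpu_samples = [100] * 9 + [40] + [100] * 10
states = [gate.update(cpu >= 100) for cpu in cpu_samples]
```

With these samples the nine-minute spike never raises an alarm, because the 40% reading resets the streak; only the last sample, the tenth 100% reading in a row, is reported as bad. The test interval itself is unchanged.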

I know that you stated that this would require additional coding to the core functionality. However, my organization is thinking of moving away from HM, and our two licenses, because they feel the product does not deal effectively with performance-based tests.

Thank you for your consideration,

Scott
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Probably we should keep "basic" scheme as is and provide ability to set additional statuses using expressions. Just like we did with "standard" and "advanced" actions.
E.g. implement 2 new statuses and 2 options
[x] Use expression to set Warning status
[ ] Use expression to set Normal status
So you will be able to use expressions like ('%Reply%'>'70 %') and ('%Reply%'<'90 %') and ('%MainRouter::SimpleStatus%'=='UP')
Warning/Normal statuses will be handled just like other bad/good statuses for statistics purposes. But such items can be displayed in a different color, and HostMonitor may apply a different sorting order and generate separate HTML reports.
This way we keep "basic" setup simple enough and provide great flexibility when you really need that.
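
To make the two-option scheme concrete, here is a rough Python model of it. It is a sketch under assumptions, not HostMonitor's expression engine: it assumes the Warning expression is only consulted when the basic check has failed and the Normal expression only when it has passed, and the names `Probe` and `resolve_status` are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Probe:
    reply_pct: float  # stands in for %Reply%, as a percentage
    router_up: bool   # stands in for %MainRouter::SimpleStatus% == 'UP'


Expr = Callable[[Probe], bool]


def resolve_status(basic_bad: bool, probe: Probe,
                   warning_expr: Optional[Expr] = None,
                   normal_expr: Optional[Expr] = None) -> str:
    # Assumption: Warning refines a bad result, Normal refines a good one.
    if basic_bad:
        if warning_expr is not None and warning_expr(probe):
            return "Warning"
        return "Bad"
    if normal_expr is not None and normal_expr(probe):
        return "Normal"
    return "Ok"


def warning_expr(p: Probe) -> bool:
    # Mirrors the example expression: Reply between 70 % and 90 %, router up.
    return 70 < p.reply_pct < 90 and p.router_up


in_band = resolve_status(True, Probe(80, True), warning_expr)
too_high = resolve_status(True, Probe(95, True), warning_expr)
router_down = resolve_status(True, Probe(80, False), warning_expr)
healthy = resolve_status(False, Probe(50, True), warning_expr)
```

When neither option is enabled (both expressions left as `None`), the function falls back to the plain good/bad result, which matches the goal of keeping the basic setup simple.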

Regards
Alex
JuergenF
Posts: 331
Joined: Sun Jan 26, 2003 6:00 pm
Location: Germany, North Rhine-Westphalia

Post by JuergenF »

Dear Alex,

that sounds good to me.

- And will it be possible to set the test to bad after the 5th test between 70 and 90 % CPU?
- So we can have an HTML report only showing warning and bad tests in different colours?
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

- and it will be possible to set the test to bad after the 5th test between 70 and 90 % CPU ?

H'm..
- expression like "('%Reply%'>'70 %') and ('%Reply%'<='90 %')" will set Warning status when CPU Usage is between 70 and 90 % (Bad status if CPU Usage is over 90 %)
- expression like "('%SimpleStatus%'=='DOWN') and (%Recurrences%<5)" will set Warning status for the 1st..4th failed probe (the 5th failed probe will use Bad status)
- you may combine conditions, e.g. "('%Reply%'>'70 %') and ('%Reply%'<='90 %') and ('%SimpleStatus%'=='DOWN') and (%Recurrences%<5)".
But it's impossible to combine them in your way (HostMonitor does not keep a history of all previous Reply values, except in the log of course), unless Warning status resets Recurrences. In that case we would need to redesign the action-related behaviour (don't really want to do that until version 8 or something).
- So we can have an HTML report only showing warning and bad tests in different colours ?
Sure
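
For readers who want to check what the three expressions above select, they can be written as plain Python predicates. This is only a paraphrase of the expressions for illustration: the `Snapshot` fields stand in for the %Reply%, %SimpleStatus% and %Recurrences% variables, and treating Recurrences as the count of consecutive probes with the same status is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    reply_pct: float    # %Reply% as a CPU usage percentage
    simple_status: str  # %SimpleStatus%: "UP" or "DOWN"
    recurrences: int    # %Recurrences% (assumed: consecutive same-status probes)


def in_warning_band(s: Snapshot) -> bool:
    # First expression: Reply above 70 % and at most 90 %.
    return 70 < s.reply_pct <= 90


def early_failure(s: Snapshot) -> bool:
    # Second expression: test is down, but fewer than 5 recurrences so far.
    return s.simple_status == "DOWN" and s.recurrences < 5


def combined(s: Snapshot) -> bool:
    # Third expression: both conditions must hold at once.
    return in_warning_band(s) and early_failure(s)


third_probe = Snapshot(reply_pct=85, simple_status="DOWN", recurrences=3)
fifth_probe = Snapshot(reply_pct=85, simple_status="DOWN", recurrences=5)
overloaded = Snapshot(reply_pct=95, simple_status="DOWN", recurrences=1)
```

Each predicate only sees the current snapshot, which is the limitation described above: nothing here remembers earlier Reply values, so "the 5th test between 70 and 90 %" cannot be expressed without extra state.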

Regards
Alex
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Done. Version 6.50 Beta is available at http://www.ks-soft.net/hostmon.eng/downpage.htm
What's new: http://www.ks-soft.net/hostmon.eng/news.htm

Regards
Alex
FLynch
Posts: 75
Joined: Tue Jun 18, 2002 6:00 pm
Location: London UK

Post by FLynch »

Downloaded and tried this functionality out and it is spot on - works well.

Many thanks for introducing this, it takes AHM to a new level!

Cheers
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

You are welcome :)

Regards
Alex