JuergenF
Joined: 26 Jan 2003 Posts: 331 Location: Germany, North Rhine-Westphalia
|
Posted: Sun Nov 12, 2006 4:32 pm Post subject: Re: Warning status... |
|
|
FLynch wrote: | Please, please do not lose the core notion and functionality of 'warning' status.
Most monitoring systems have granularity in their status levels - 'warning' brings your attention to a potential problem before it becomes critical.
Very straightforward and will be a significant step forward for AHM. |
Exactly what I think.
Here are some of my thoughts: http://www.ks-soft.net/cgi-bin/phpBB/viewtopic.php?p=14146#14146 |
|
JuergenF
Joined: 26 Jan 2003 Posts: 331 Location: Germany, North Rhine-Westphalia
|
Posted: Sun Nov 12, 2006 4:55 pm Post subject: |
|
|
AntonyP wrote: | I believe that the warning status should not be added, simply because it would be easier to set the alarm trigger at an earlier stage.
E.g. on a disk usage test:
the hard disk has 2 GB free space
I set a 1 GB free space alarm for HM
Now, what would be the meaning of having the warning status at 1.2 GB? I can set the alarm at 1.2 GB instead... |
The difference is that every red status means a phone call early in the morning.
So there is a need to separate warnings from real problems.
A warning means you have to do something to keep it from becoming a real problem, but not immediately. |
|
genasea
Joined: 25 Sep 2002 Posts: 27
|
Posted: Mon Nov 13, 2006 8:39 am Post subject: Similar request |
|
|
Alex,
I posed this question last year, and it is similar to the requests being made here. The vast majority of our tests are performance-based tests (not fault-based tests). It would be very handy to be able to qualify when these types of tests actually go into alarm or are marked as 'Bad'. As some of the other users on this forum have pointed out, many of our IT staff no longer look at the alarms because so many performance-based tests are in alarm at any given time. We have about 15 alarms at any given time, mostly performance-based (CPU, disk time, page faults, etc.). Most come from different servers each minute (so a test is in alarm for only a single test interval), but we typically average only a few faults a day.
My thought is to add a feature that lets the person setting up a test decide how many consecutive 'bad' results are needed before the test is set to bad, while keeping the same test interval. For example, if a server's CPU was at 100% for 10 minutes in a row (10 test cycles at 60-second intervals), then set the test to 'bad', rather than going bad the first time it gets a 100% result.
I know that you stated that this would require additional coding to the core functionality. However, my organization is thinking of moving away from HM, and our two licenses, because they feel the product does not deal effectively with performance-based tests.
Thank you for your consideration,
Scott |
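The request above (go bad only after N bad results in a row) can be sketched in a few lines. This is only an illustration of the idea, not how HostMonitor works internally; the class name `ConsecutiveBadGate`, the `required` parameter and the "Ok"/"Bad" labels are hypothetical names chosen for the example.

```python
class ConsecutiveBadGate:
    """Report 'Bad' only after `required` bad results in a row (hypothetical helper)."""

    def __init__(self, required: int) -> None:
        if required < 1:
            raise ValueError("required must be at least 1")
        self.required = required
        self.streak = 0  # consecutive bad results seen so far

    def update(self, result_is_bad: bool) -> str:
        # A single good result resets the streak to zero.
        self.streak = self.streak + 1 if result_is_bad else 0
        return "Bad" if self.streak >= self.required else "Ok"


# CPU sampled once per 60-second cycle; a sample counts as bad at 100 %.
gate = ConsecutiveBadGate(required=10)
samples = [100] * 9 + [40] + [100] * 10
statuses = [gate.update(cpu >= 100) for cpu in samples]
```

With these samples the nine-sample spike never reaches "Bad" because the 40 % reading resets the streak; only the last sample, the tenth 100 % reading in a row, does.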
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12793 Location: USA
|
Posted: Mon Nov 13, 2006 1:12 pm Post subject: |
|
|
Probably we should keep the "basic" scheme as is and provide the ability to set additional statuses using expressions, just like we did with "standard" and "advanced" actions.
E.g. implement 2 new statuses and 2 options:
[x] Use expression to set Warning status
[ ] Use expression to set Normal status
So you will be able to use expressions like ('%Reply%'>'70 %') and ('%Reply%'<'90 %') and ('%MainRouter::SimpleStatus%'=='UP')
Warning/Normal statuses will be handled just like other bad/good statuses for statistics purposes. But such items can be displayed in a different color, and HostMonitor may apply a different sorting order and generate separate HTML reports.
This way we keep the "basic" setup simple enough and provide great flexibility when you really need it.
Regards
Alex |
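For readers who want to see what the example expression above checks, here is a rough equivalent written as ordinary code. It assumes the reply is a string such as '85 %' and that the comparison is meant numerically; `parse_percent` and `is_warning` are hypothetical names, not part of HostMonitor.

```python
def parse_percent(reply: str) -> float:
    """Turn a reply string such as '85 %' into the number 85.0."""
    return float(reply.replace("%", "").strip())


def is_warning(reply: str, main_router_status: str) -> bool:
    # Mirrors: ('%Reply%'>'70 %') and ('%Reply%'<'90 %')
    #          and ('%MainRouter::SimpleStatus%'=='UP')
    # Assumption: both bounds are exclusive and compared as numbers.
    value = parse_percent(reply)
    return 70 < value < 90 and main_router_status == "UP"
```

The router condition shows why a free-form expression is useful here: the warning is raised only while the dependency (the main router) is itself up.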
|
JuergenF
Joined: 26 Jan 2003 Posts: 331 Location: Germany, North Rhine-Westphalia
|
Posted: Wed Nov 15, 2006 12:43 am Post subject: |
|
|
Dear Alex,
that sounds good to me.
- And will it be possible to set the test to bad after the 5th test between 70 and 90 % CPU?
- So we can have an HTML report showing only warning and bad tests, in different colours? |
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12793 Location: USA
|
Posted: Wed Nov 15, 2006 8:54 pm Post subject: |
|
|
Quote: | - and it will be possible to set the test to bad after the 5th test between 70 and 90 % CPU ? |
H'm..
- an expression like "('%Reply%'>'70 %') and ('%Reply%'<='90 %')" will set Warning status when CPU usage is between 70 and 90 % (Bad status if CPU usage is over 90 %)
- an expression like "('%SimpleStatus%'=='DOWN') and (%Recurrences%<5)" will set Warning status for the 1st..4th failed probe (the 5th failed probe will use Bad status)
- you may combine conditions, e.g. "('%Reply%'>'70 %') and ('%Reply%'<='90 %') and ('%SimpleStatus%'=='DOWN') and (%Recurrences%<5)".
But it's impossible to combine them in the way you describe (HostMonitor does not keep a history of all previous Reply values, except in the log of course), unless Warning status resets Recurrences. In that case we would need to redesign action-related behaviour (we don't really want to do that until version 8 or so).
Quote: | - So we can have an HTML report only showing warning and bad tests in different colours ? |
Sure
Regards
Alex |
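The second expression above (Warning for the 1st..4th failed probe, Bad from the 5th) can be illustrated as follows. This is a sketch under the assumption that the recurrence counter is 1 on the first failed probe and increases by one for each further consecutive failure; `status_for_probe`, `limit` and the "Good" label are hypothetical.

```python
def status_for_probe(simple_status: str, recurrences: int, limit: int = 5) -> str:
    """Map a probe result to a status, mirroring
    ('%SimpleStatus%'=='DOWN') and (%Recurrences%<5)."""
    if simple_status != "DOWN":
        return "Good"
    # Fewer than `limit` consecutive failures: only a warning so far.
    return "Warning" if recurrences < limit else "Bad"


# Six failed probes in a row, recurrences counted 1..6.
history = [status_for_probe("DOWN", n) for n in range(1, 7)]
```

Note that the function only sees the current counter, not earlier reply values, which matches the limitation described above: it cannot tell whether the previous four probes were in the 70..90 % band.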
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12793 Location: USA
|
|
FLynch
Joined: 18 Jun 2002 Posts: 75 Location: London UK
|
Posted: Sat Dec 09, 2006 11:53 am Post subject: |
|
|
Downloaded and tried this functionality out and it is spot on - works well.
Many thanks for introducing this, it takes AHM to a new level!
Cheers |
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12793 Location: USA
|
Posted: Sat Dec 09, 2006 12:54 pm Post subject: |
|
|
You are welcome
Regards
Alex |
|