Method to do a 'second knock' as in ServersAlive
Alex,
I see from the archives that someone already asked for this. As a former ServersAlive user I found it a very useful feature. Here's how it worked (I think):
When SAlive carried out a test and there was no response, rather than immediately reporting "No Answer" it flagged the test and then, at the end of the checking cycle, checked it again before setting the test status (sketched below). This was great when checking remote links, because by checking again 30 seconds later (i.e. after all the other tests were completed) there was a strong chance that the link would test "UP" if the first miss was only due to traffic at that site.
I now use RECURRENCES==2, but it's not the same.
Any thoughts on how this feature could be added to HM?
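A minimal Python sketch of the "second knock" behaviour described above (the Test class, probe callback and status strings are illustrative assumptions, not ServersAlive or HostMonitor code):

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Test:
    name: str
    probe: Callable[[], bool]   # returns True when the host answers
    status: str = "Unknown"

def run_cycle(tests: List[Test]) -> None:
    suspects: List[Test] = []
    for test in tests:
        if test.probe():                 # first knock
            test.status = "Ok"
        else:
            suspects.append(test)        # no status decision yet, just remember it
    # Second knock: re-check the suspects only after every other test in
    # the cycle has finished, i.e. some time later.
    for test in suspects:
        test.status = "Ok" if test.probe() else "No Answer"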
HostMonitor provides "Repeat test" action instead:
http://www.ks-soft.net/hostmon.eng/mfra ... #actRepeat
Regards
Alex
Would I just accept the defaults (when adding this to the profile) to force HM to check again?

Defaults? Do you mean the "start" and "repeat" parameters? If you want to re-check an item 1 time after the 1st failure, use the following parameters (see the sketch after the list):
- Start when 1 consecutive "bad" results occur
- Repeat 1 time
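A minimal sketch of how those two parameters are commonly read (plain Python for illustration, not HostMonitor internals; the names are made up):

START_WHEN = 1   # "Start when 1 consecutive 'bad' results occur"
REPEAT = 1       # "Repeat 1 time"

def probes_to_schedule(consecutive_bad: int) -> int:
    # How many immediate re-checks the Repeat action would schedule
    # for an item with the given number of consecutive bad results.
    return REPEAT if consecutive_bad >= START_WHEN else 0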
When I do add this to the test it is listed before the other action, does the display order determine the order of the test?

Are you asking about tests or actions?
The "Sort by" list defines the order in which test items will be arranged. See Columns page in the Options dialog.
Order of actions execution depends on "Start when" parameters. If some actions have the same value of the "start when" parameter, these actions will be executed in creation order.
Regards
Alex
I tested the "REPEAT" option, but I don't think it does what I want (apologies if I am wrong). This is what appears to happen with HM when using REPEAT:
Using a ping of a remote link as an example:
On the first ping failure, HM identifies that the test is bad (i.e. recurrence = 1).
When it does the REPEAT operation (and let's say the link is still down), REPEAT then indicates that the test is BAD (i.e. recurrence = 2).
To mimic the second knock, HM should NOT decide whether the test is bad until the REPEAT has been carried out; only when the REPEAT action has taken place should the result be decided (see the sketch below). In my scenario with "Start When = 2", using REPEAT does not help.
Hope that makes sense?
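A small Python sketch of the difference described in the scenario above (assumptions only, not HostMonitor internals): with the Repeat action the status is decided on the first failure and the repeat merely adds a second bad result, while a "second knock" would defer the decision until the re-check has run.

from typing import Callable

def repeat_action_behaviour(probe: Callable[[], bool]) -> str:
    # What the poster observes: the item is already "Bad" before the repeat.
    if probe():
        return "Ok"
    status = "Bad"      # recurrence = 1, item already shows as bad
    probe()             # Repeat action runs the test again -> recurrence = 2
    return status

def second_knock_behaviour(probe: Callable[[], bool]) -> str:
    # What the poster wants: no decision until the re-check has been made.
    if probe():
        return "Ok"
    return "Ok" if probe() else "Bad"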
I agree,
There should be a method to allow a test to wait a couple of tries before turning red.
In other monitoring programs I have used at different companies, the device being monitored turned yellow until the retry count for that device was met, and only then turned red.
There are several cases for this: several people look at the web monitor list, and as soon as something turns red because of some delay somewhere they assume there is a problem. Even my own techs have to take time to determine whether it is a true alert or just a false alert caused by delays.
Adjusting timeouts is not enough; we move gigabytes of data a day across our lines, and sometimes pings or tests don't make it through for a couple of cycles.
I have tried to work around this by setting the e-mail options too, but that gets messy with a lot of e-mail alerts set up, and the monitored device still turns red on the monitor screen.
How about an option that - instead of changing the entire logic - simply (if it can be put that way) allows me to check a box that says "ignore result of first <n> consecutive bad results", just to keep the statistics in order? Unless that's exactly what you were thinking, and it's also exactly way too complicated to implement at this time...
Maybe instead an action (that keeps track) which just says "ignore last test", as if it never happened, except that a separate variable keeps count so the action profile knows when to "stop" ignoring the results.
Hope something comes of this. Although I understand either solution here may be quite difficult to implement.
Note: <-- not the same user as above
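A minimal Python sketch of the "ignore the first <n> consecutive bad results" idea from the post above (hypothetical: such a checkbox does not exist in HostMonitor, and the class and names are made up for illustration):

class SecondLookFilter:
    # Hypothetical filter: ignore the first N consecutive bad results so
    # the item (and its statistics) stays "Ok" until the problem persists.
    def __init__(self, ignore_first_n: int = 2):
        self.ignore_first_n = ignore_first_n
        self.consecutive_bad = 0

    def report(self, raw_ok: bool) -> str:
        if raw_ok:
            self.consecutive_bad = 0
            return "Ok"
        self.consecutive_bad += 1
        return "Ok" if self.consecutive_bad <= self.ignore_first_n else "Bad"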

Hi Alex,
KS-Soft wrote: We plan to implement new "Warning" status and some Test-level option like "Set Bad status after N failed probes"
Regards
Alex
I'm waiting for both options:

1) Having more states of tests (at least a warning)
2) Having something similar to the "second knock" function in ServersAlive.
Please let me give some of my thoughts:
1) When you are working on that, please think about more than adding a "warning" status. User-defined states (or is it status, or statuses??) may be a good thing.
I can think of:
OK - Information - Notification - Warning - Error - Critical - Alert - Emergency
http://www.cisco.com/en/US/products/hw/ ... ml#1015181
See: Table 1-2 Message Severity Levels
2) "Set Bad status after N failed probes" maybe a good idea.
On the other hand you already have a "retries" field in some tests (SNMP, Trace, RAS, UDP, NTP, RADIUS, Traffic Monitor). Why not in all tests ?
But I don't know exactly how that works and if that is similar to "second knock" in ServersAlive. (first doing the retries and after that setting the Test-Status)
Best regards and "Kudos to you!" for your great work.
Juergen
EDIT:
Just another question regarding 1):
Will it be possible to define in one test the following:
After x fails set Status to: Notification
After y fails set Status to: Warning
After z fails set Status to: Error
In other words: escalate the status within one test depending on the number of fails?
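A minimal Python sketch of the escalation idea above (the thresholds and status names are example assumptions, not existing HostMonitor options):

# Example thresholds for x, y, z; the values and status names are made up.
ESCALATION = [
    (1, "Notification"),   # after x fails
    (3, "Warning"),        # after y fails
    (5, "Error"),          # after z fails
]

def status_for(consecutive_fails: int) -> str:
    # Return the most severe status whose threshold has been reached.
    status = "OK"
    for threshold, name in ESCALATION:
        if consecutive_fails >= threshold:
            status = name
    return status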