Method to do a 'second knock' as in ServersAlive
Alex,
I see from the archives that someone already asked for this. As a former ServersAlive user I found it a very useful feature. Here's how it worked (I think):
When SAlive carried out a test and there was no response, rather than immediately reporting "No Answer" it flagged the test and then, at the end of the checking cycle, checked it again before setting the test status (sketched below). This was great when checking remote links, because by checking again 30 seconds later (i.e. after all the other tests were completed) there was a strong chance that the link would test "UP" if the first miss was only due to traffic at that site.
I now use RECURRENCES==2, but it's not the same.
Any thoughts on how this feature could be added to HM?
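A minimal Python sketch of the "second knock" behaviour described above (the Test class, probe callback and status strings are illustrative assumptions, not ServersAlive or HostMonitor code):

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Test:
    name: str
    probe: Callable[[], bool]   # returns True when the host answers
    status: str = "Unknown"

def run_cycle(tests: List[Test]) -> None:
    suspects: List[Test] = []
    for test in tests:
        if test.probe():                 # first knock
            test.status = "Ok"
        else:
            suspects.append(test)        # no status decision yet, just remember it
    # Second knock: re-check the suspects only after every other test in
    # the cycle has finished, i.e. some time later.
    for test in suspects:
        test.status = "Ok" if test.probe() else "No Answer"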
HostMonitor provides "Repeat test" action instead:
http://www.ks-soft.net/hostmon.eng/mfra ... #actRepeat
Regards
Alex
Would I just accept the defaults (when adding this to the profile) to force HM to check again?

Defaults? Do you mean the "start" and "repeat" parameters? If you want to re-check an item 1 time after the 1st failure, use the following parameters (see the sketch after the list):
- Start when 1 consecutive "bad" results occur
- Repeat 1 time
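A minimal sketch of how those two parameters are commonly read (plain Python for illustration, not HostMonitor internals; the names are made up):

START_WHEN = 1   # "Start when 1 consecutive 'bad' results occur"
REPEAT = 1       # "Repeat 1 time"

def probes_to_schedule(consecutive_bad: int) -> int:
    # How many immediate re-checks the Repeat action would schedule
    # for an item with the given number of consecutive bad results.
    return REPEAT if consecutive_bad >= START_WHEN else 0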
When I do add this to the test it is listed before the other action, does the display order determine the order of the test?

Are you asking about tests or actions?
The "Sort by" list defines the order in which test items will be arranged. See Columns page in the Options dialog.
Order of actions execution depends on "Start when" parameters. If some actions have the same value of the "start when" parameter, these actions will be executed in creation order.
Regards
Alex
I tested the "REPEAT" option, but I don't think it does what I want (apologies if I am wrong). This is what appears to happen with HM when using REPEAT:
Using a ping of a remote link as an example:
On the first ping failure, HM identifies that the test is bad (i.e. recurrence = 1).
When it does the REPEAT operation (and let's say the link is still down), REPEAT then indicates that the test is BAD (i.e. recurrence = 2).
To mimic the second knock, HM should NOT decide whether the test is bad until the REPEAT has been carried out; only when the REPEAT action has taken place should the result be decided (see the sketch below). In my scenario with "Start When = 2", using REPEAT does not help.
Hope that makes sense?
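A small Python sketch of the difference described in the scenario above (assumptions only, not HostMonitor internals): with the Repeat action the status is decided on the first failure and the repeat merely adds a second bad result, while a "second knock" would defer the decision until the re-check has run.

from typing import Callable

def repeat_action_behaviour(probe: Callable[[], bool]) -> str:
    # What the poster observes: the item is already "Bad" before the repeat.
    if probe():
        return "Ok"
    status = "Bad"      # recurrence = 1, item already shows as bad
    probe()             # Repeat action runs the test again -> recurrence = 2
    return status

def second_knock_behaviour(probe: Callable[[], bool]) -> str:
    # What the poster wants: no decision until the re-check has been made.
    if probe():
        return "Ok"
    return "Ok" if probe() else "Bad"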
I agree,
There should be a method to allow a test to wait a couple of tries before turning red.
In other monitoring programs I have used at different companies, the device being monitored turned yellow until the retry count for that device was met, and only then turned red.
There are several cases for this: several people look at the web monitor list, and as soon as something turns red because of some delay somewhere they assume there is a problem. Even my own techs have to take time to determine whether it is a true alert or just a false alert caused by delays.
Adjusting timeouts is not enough; we move gigabytes of data a day across our lines, and sometimes pings or tests don't make it through for a couple of cycles.
I have tried to work around this by setting the e-mail options too, but that gets messy with a lot of e-mail alerts set up, and the monitored device still turns red on the monitor screen.
How about an option that - instead of changing the entire logic - simply (if it can be put that way) allows me to check a box that says "ignore result of first <n> consecutive bad results", just to keep the statistics in order? Unless that's exactly what you were thinking, and it's also exactly way too complicated to implement at this time...
Maybe instead an action (that keeps track) which just says "ignore last test", as if it never happened, except that a separate variable keeps count so the action profile knows when to "stop" ignoring the results.
Hope something comes of this. Although I understand either solution here may be quite difficult to implement.
Note: <-- not the same user as above
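A minimal Python sketch of the "ignore the first <n> consecutive bad results" idea from the post above (hypothetical: such a checkbox does not exist in HostMonitor, and the class and names are made up for illustration):

class SecondLookFilter:
    # Hypothetical filter: ignore the first N consecutive bad results so
    # the item (and its statistics) stays "Ok" until the problem persists.
    def __init__(self, ignore_first_n: int = 2):
        self.ignore_first_n = ignore_first_n
        self.consecutive_bad = 0

    def report(self, raw_ok: bool) -> str:
        if raw_ok:
            self.consecutive_bad = 0
            return "Ok"
        self.consecutive_bad += 1
        return "Ok" if self.consecutive_bad <= self.ignore_first_n else "Bad"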

Hi Alex,
KS-Soft wrote: We plan to implement new "Warning" status and some Test-level option like "Set Bad status after N failed probes"
Regards
Alex
I'm waiting for both options:

1) Having more states of tests (at least a warning)
2) Having something similar to the "second knock" function in ServersAlive.
Please let me give some of my thoughts:
1) When you are working on that, please think about more than adding a "warning" status. User-defined states (or is it status, or statuses??) may be a good thing.
I can think of:
OK - Information - Notification - Warning - Error - Critical - Alert - Emergency
http://www.cisco.com/en/US/products/hw/ ... ml#1015181
See: Table 1-2 Message Severity Levels
2) "Set Bad status after N failed probes" maybe a good idea.
On the other hand you already have a "retries" field in some tests (SNMP, Trace, RAS, UDP, NTP, RADIUS, Traffic Monitor). Why not in all tests ?
But I don't know exactly how that works and if that is similar to "second knock" in ServersAlive. (first doing the retries and after that setting the Test-Status)
Best regards and "Kudos to you!" for your great work.
Juergen
EDIT:
Just another question regarding 1):
Will it be possible to define in one test the following:
After x fails set Status to: Notification
After y fails set Status to: Warning
After z fails set Status to: Error
In other words: escalate the status within one test depending on the number of fails?
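A minimal Python sketch of the escalation idea above (the thresholds and status names are example assumptions, not existing HostMonitor options):

# Example thresholds for x, y, z; the values and status names are made up.
ESCALATION = [
    (1, "Notification"),   # after x fails
    (3, "Warning"),        # after y fails
    (5, "Error"),          # after z fails
]

def status_for(consecutive_fails: int) -> str:
    # Return the most severe status whose threshold has been reached.
    status = "OK"
    for threshold, name in ESCALATION:
        if consecutive_fails >= threshold:
            status = name
    return status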