NORMAL Status and %Recurrences%

KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

How can I get the following behaviour:
If the test fails for the first 3 times, it gets the status "NORMAL", from the 4th failed test onwards it gets the status "bad".
You cannot. You may use Normal status for the 1st and 2nd failed probe. We discuss such a setup in a nearby topic: http://www.ks-soft.net/cgi-bin/phpBB/vi ... 3453#14901
Unless we implement new counters: %WarningStatusRecurrencies% and %NormalStatusRecurrencies% (see the same topic, 3453)

Regards
Alex
JuergenF
Posts: 331
Joined: Sun Jan 26, 2003 6:00 pm
Location: Germany, North Rhine-Westphalia

Post by JuergenF »

Hi Alex,
KS-Soft wrote:Unless we implement new counters: %WarningStatusRecurrencies% and %NormalStatusRecurrencies%
that sounds as if you agree :D

Please think about the following.
Even with that counter, we have to handle the first change from "Ok" to "Normal" differently from the subsequent recurrences.
Similar to this:

[x] Use "Normal" status if: ('%SuggestedStatus%'=='Bad') and (('%Status%'=='Ok') or (('%Status%'=='Normal') and ('%LastStatus%'=='Ok')))
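To see what this expression does over a run of failures, it can be replayed outside HostMonitor. The following is only a sketch in Python, not HostMonitor syntax. It assumes that %Status% is the status before the current probe and %LastStatus% the one before that; the helper names use_normal and replay are made up for illustration.

```python
def use_normal(suggested, status, last_status):
    """Python rendering of the 'Use "Normal" status if' expression above."""
    return suggested == "Bad" and (
        status == "Ok" or (status == "Normal" and last_status == "Ok")
    )


def replay(suggested_results):
    """Apply the expression to a sequence of suggested statuses."""
    status, last_status = "Ok", "Ok"  # assumed starting point: test was Ok
    history = []
    for suggested in suggested_results:
        if use_normal(suggested, status, last_status):
            new_status = "Normal"
        else:
            new_status = suggested
        # Shift the history: the current status becomes the last status.
        last_status, status = status, new_status
        history.append(new_status)
    return history


print(replay(["Bad"] * 4 + ["Ok"]))
```

Under these assumptions only the first two failed probes are softened to Normal; the third failure is already reported as Bad.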

That is complicated, and the expression should be as simple as possible.
Do you see a way to encode the first-time case ('%SuggestedStatus%'=='Bad' and '%Status%'=='Ok') into the new counters?

Let's say:
- %NormalStatusRecurrencies% is always set to 1 when Status = Ok
- %NormalStatusRecurrencies% is incremented by 1 after Status is set to Normal
- %WarningStatusRecurrencies% is always set to 1 when Status = Ok
- %WarningStatusRecurrencies% is incremented by 1 after Status is set to Warning

[x] Use "Warning" status if: ("%SuggestedSimpleStatus%"=="DOWN") and (%WarningStatusRecurrencies%<4)
[x] Use "Normal" status if: ("%SuggestedSimpleStatus%"=="DOWN") and (%NormalStatusRecurrencies%<5)

The result should be:
Ok (N1,W1) -> Normal (N1,W1) -> Normal (N2,W1) -> Normal (N3,W1) -> Normal (N4,W1) -> Warning (N5,W1) -> Warning (N5,W2) -> Warning (N5,W3) -> Bad (N5,W4) -> Bad (N5,W4) -> Ok (N1,W1)
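The counter rules and the two conditions above can be checked with a small simulation. This is a sketch in Python under stated assumptions, not HostMonitor code: the counters are proposals from this thread, the Normal condition is assumed to win when both conditions are true, and the function name simulate is invented for the example.

```python
def simulate(suggested):
    """Replay the proposed counter rules over a list of suggested simple statuses."""
    n = w = 1  # both counters are 1 while the status is Ok
    trace = []
    for s in suggested:
        if s != "DOWN":
            n = w = 1  # reset rule: status Ok sets both counters back to 1
            trace.append(("Ok", n, w))
            continue
        if n < 5:      # "Use Normal" condition
            status = "Normal"
        elif w < 4:    # "Use Warning" condition
            status = "Warning"
        else:
            status = "Bad"
        trace.append((status, n, w))  # counter values as seen by the expression
        if status == "Normal":
            n += 1
        elif status == "Warning":
            w += 1
    return trace


for step in simulate(["DOWN"] * 9 + ["UP"]):
    print(step)
```

Nine failed probes followed by one good probe give four Normal, three Warning and two Bad results before the status returns to Ok, which matches the sequence above.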

Maybe I missed something ?
thomasschmeidl
Posts: 166
Joined: Sat Apr 15, 2006 2:14 pm
Location: Germany, Bavaria

Post by thomasschmeidl »

I can see clearly now ...

@Jürgen

Thanks for your explanations

I agree very much that the expressions should be as simple and comprehensible as possible

@Alex

Maybe one new variable would be enough: call it %SuggestedSimpleStatusBadRecurrencies% (=SSSBR) or something like that. It has to count the number of consecutive tests where %SuggestedSimpleStatus% is bad.

@Alex and Juergen

So you could set up Juergen's example as follows:

[x] Use "Warning" status if: (%SuggestedSimpleStatusBadRecurrencies%<8 )
[x] Use "Normal" status if: (%SuggestedSimpleStatusBadRecurrencies%<5)

As the Normal condition is processed after the Warning condition, the status will always be "Normal" if both conditions are true.

The result should be:
Ok (SSSBR=0) -> Normal (SSSBR=1) -> Normal (SSSBR=2) -> Normal (SSSBR=3) -> Normal (SSSBR=4) -> Warning (SSSBR=5) -> Warning (SSSBR=6) -> Warning (SSSBR=7) -> Bad (SSSBR=8) -> Bad (SSSBR=9) -> Ok (SSSBR=0)
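The single-counter variant can be sketched the same way. The Python below is only an illustration: %SuggestedSimpleStatusBadRecurrencies% is a proposed variable, and the sketch assumes that the two conditions are evaluated only when the suggested status is bad (otherwise SSSBR=0 would also satisfy "<5"). The function name and its parameters are invented.

```python
def statuses_from_sssbr(suggested, normal_below=5, warning_below=8):
    """Derive the final status from a count of consecutive bad results."""
    sssbr = 0
    trace = []
    for s in suggested:
        # Count consecutive DOWN results; any good result resets the counter.
        sssbr = sssbr + 1 if s == "DOWN" else 0
        if sssbr == 0:
            status = "Ok"
        elif sssbr < normal_below:    # Normal is processed last, so it wins
            status = "Normal"
        elif sssbr < warning_below:
            status = "Warning"
        else:
            status = "Bad"
        trace.append((sssbr, status))
    return trace


for step in statuses_from_sssbr(["DOWN"] * 9 + ["UP"]):
    print(step)
```

With the limits 5 and 8 this gives Normal for SSSBR 1 to 4, Warning for 5 to 7 and Bad from 8 onwards.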

Did I miss something?

Thomas
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

thomasschmeidl wrote:Maybe one new variable would be enough: call it %SuggestedSimpleStatusBadRecurrencies% (=SSSBR) or something like that. It has to count the number of consecutive tests where %SuggestedSimpleStatus% is bad
H'm.. I cannot say I like that.
KS-Soft wrote:Unless we implement new counters: %WarningStatusRecurrencies% and %NormalStatusRecurrencies%
JuergenF wrote:that sounds as if you agree
Almost. We will not change the code right now; we want to receive some response from other customers first.

Regards
Alex
thomasschmeidl
Posts: 166
Joined: Sat Apr 15, 2006 2:14 pm
Location: Germany, Bavaria

Post by thomasschmeidl »

Hi Alex,

I agree that more customers than Juergen and I should be asked or should have the opportunity to make proposals.

My consideration was that the conditions "set status to normal for the first m bad test results" and "set status to warning for the first n bad test results" are obviously standard conditions many customers will use.

So admin work would be made simple by having ONE simple variable covering exactly these conditions (all other variables need more complex expressions)

I admit that SSSBR is not a very handy name - maybe someone finds a better one

Cheers

Thomas
JuergenF
Posts: 331
Joined: Sun Jan 26, 2003 6:00 pm
Location: Germany, North Rhine-Westphalia

Post by JuergenF »

thomasschmeidl wrote:My consideration was that the conditions "set status to normal for the first m bad test results" and "set status to warning for the first n bad test results" are obviously standard conditions many customers will use.
I agree - as simple as possible for the most common tasks
Dubolomov
Posts: 214
Joined: Thu Jun 01, 2006 10:27 am
Location: Russia

Post by Dubolomov »

thomasschmeidl wrote:Hi Alex,

I agree that more customers than Juergen and I should be asked or should have the opportunity to make proposals.

My consideration was that the conditions "set status to normal for the first m bad test results" and "set status to warning for the first n bad test results" are obviously standard conditions many customers will use.

So admin work would be made simple by having ONE simple variable covering exactly these conditions (all other variables need more complex expressions)

I admit that SSSBR is not a very handy name - maybe someone finds a better one

Cheers

Thomas
+1
thomasschmeidl
Posts: 166
Joined: Sat Apr 15, 2006 2:14 pm
Location: Germany, Bavaria

Post by thomasschmeidl »

Hi everybody,

at least I'm not the only one asking for simple solutions.

Maybe it helps to explain my considerations and suggestions a little more:

Obviously there are two main conditions we will use for the new NORMAL and WARNING statuses.

a) if recurrences of bad results exceed a certain number
b) if reply exceeds a certain threshold

Condition a) is useful for most tests (except SNMP trap test, unless recurrences are processed as suggested in http://www.ks-soft.net/cgi-bin/phpBB/vi ... php?t=3011)
Condition b) is mainly useful for tests which produce a numeric reply.

Examples:
a)a): CPU test, threshold 90%
- NORMAL if <4 bad recurrences;
- WARNING if <10 bad recurrences;
- BAD if >=10 bad recurrences

a)b): Ping test, threshold 1000ms
- NORMAL if <2 bad recurrences;
- WARNING if (1000ms >= Reply time >200ms) and >2 recurrences;
- BAD if Reply time > 1000ms and >2 recurrences

b)b): Disk Space test threshold 10%
- NORMAL if 10% < space <20%;
- WARNING if 10% >= Space >5%.
- BAD if Space <=5%

That's why I imagine an advanced status processing dialogue similar to the action properties dialogue giving a choice (in a list box) between four modes as follows:

Condition to set WARNING or NORMAL status:

- DO NOT SET THIS STATUS
- RECURRENCE MODE: set status if recurrences of bad result exceed [ ]
- THRESHOLD MODE: set status if reply value is in the range between test threshold and [ ]
- EXPRESSION MODE: set status if the following expression is fulfilled: [............]

Recurrence mode and Threshold mode are designed to make admin work VERY easy; nevertheless it would be possible to define special conditions in the Expression mode!

While the conditions and variables necessary for Recurrence Mode have already been discussed in this forum, let me add a few more considerations on the Threshold Mode:

IMHO it should be sufficient to define one threshold for each status - the second threshold can always be the threshold defined in the test properties dialogue. This avoids redundant information.
IMHO it is not necessary to declare a < or > for this threshold. The operator depends on whether the test threshold is < or > the status threshold.
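The idea that the operator follows from the order of the two thresholds can be illustrated with a few lines of Python. This is a sketch of the proposal only, not an existing HostMonitor option; the function name is invented, and treating both boundaries as exclusive is an assumption, since the proposal does not say how the boundary values should be handled.

```python
def in_status_band(reply, test_threshold, status_threshold):
    """True if reply lies between the test threshold and the status threshold.

    No explicit < or > is needed: the two thresholds are simply ordered,
    so the same check works whichever one is larger.
    """
    low, high = sorted((test_threshold, status_threshold))
    return low < reply < high  # assumption: both boundaries are exclusive


# Test threshold 10, status threshold 20: the band is 10..20.
print(in_status_band(15, 10, 20))
# Test threshold 10, status threshold 5: the band is 5..10.
print(in_status_band(7, 10, 5))
```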

Hope I don't make you crazy

Cheers

Thomas
JuergenF
Posts: 331
Joined: Sun Jan 26, 2003 6:00 pm
Location: Germany, North Rhine-Westphalia

Post by JuergenF »

Alex may argue that you can define all of those settings using the "EXPRESSION MODE".

But reading this thread shows how tricky that can be.
thomasschmeidl wrote: a)b): Ping test, threshold 1000ms
- NORMAL if <2 bad recurrences;
- WARNING if (1000ms >= Reply time >200ms) and >2 recurrences;
- BAD if Reply time > 1000ms and >2 recurrences
If recurrences = 2, none of these conditions applies and the result is OK, which is not correct.
And it is not easy to see whether you have missed some combinations.
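One way to look for missed combinations is to enumerate a few inputs and list those that no rule covers. The Python below is a rough sketch of that check for the quoted ping rules; it treats reply time and recurrence count as independent inputs, which is an assumption, and the function name classify is invented.

```python
def classify(reply_ms, recurrences):
    """Apply the three quoted ping rules; return None if no rule matches."""
    if recurrences < 2:
        return "NORMAL"
    if 200 < reply_ms <= 1000 and recurrences > 2:
        return "WARNING"
    if reply_ms > 1000 and recurrences > 2:
        return "BAD"
    return None  # gap: the status would silently stay OK


# Sample one reply time per band and a handful of recurrence counts.
gaps = [
    (reply_ms, recurrences)
    for recurrences in range(1, 5)
    for reply_ms in (300, 1100)
    if classify(reply_ms, recurrences) is None
]
print(gaps)
```

For these samples every uncovered combination has recurrences = 2, which is the gap described above.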

Therefore I support Thomas' idea to implement easy dialogues for common tasks.

Please think about that, Alex.

Regards

Juergen
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

JuergenF wrote:Alex may argue that you can define all of those settings using the "EXPRESSION MODE".
Yes, I don't want to implement new options just to replace a simple expression. And we need to keep expression mode anyway :roll:
New variables are OK.
JuergenF wrote:Therefore I support Thomas' idea to implement easy dialogues for common tasks
What about sample expressions in a drop-down list?

Regards
Alex
JuergenF
Posts: 331
Joined: Sun Jan 26, 2003 6:00 pm
Location: Germany, North Rhine-Westphalia

Post by JuergenF »

KS-Soft wrote:And we need to keep expression mode anyway :roll:
Yes, please
KS-Soft wrote:What about sample expressions in drop down list?
Can you give some examples?
The problem I see is - as I wrote - that it's tricky to think of all possible cases and combinations when using expressions.

Regards
Juergen
thomasschmeidl
Posts: 166
Joined: Sat Apr 15, 2006 2:14 pm
Location: Germany, Bavaria

Post by thomasschmeidl »

Hi Alex,

I can understand your considerations and I agree with Juergen

We definitely need the expression mode as "advanced mode" for complex conditions. It is most versatile. But IMHO it is even more important to provide an easy and error proof way to define "Recurrences" and "Thresholds".

In the action properties dialogue you provide standard mode and advanced mode, too (although standard mode could be replaced by an advanced mode expression).

But in most setups you need few actions (we have three profiles: low / medium / high prio, each with 4 to 7 actions) but many tests (we have nearly 1000) which could benefit from the modes discussed.

The problem Juergen and I want to point out is that it is quite easy to thoroughly test a few action profiles, but much more difficult to check the expressions in hundreds of tests.

Cheers and Merry Christmas

Thomas


PS: IMHO HM is the most useful and versatile tool in this market segment -and we want to help you make it even more useful and versatile ;-)
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

thomasschmeidl wrote:But IMHO it is even more important to provide an easy and error proof way to define "Recurrences" and "Thresholds".
I agree. But I think this should be done with the help of new variables.
E.g.
- a %CurrentStatusIteration% variable that would be reset by any status change (even Ok<->Normal). I think it can be used instead of %WarningStatusRecurrences% and %NormalStatusRecurrences%
- %SSSBR% also makes sense; I just don't like the name and cannot think of something better. %FailedIteration%, %DownIteration%, %FailureNo% :roll:
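How a counter that resets on every status change would behave can be sketched in Python. The variable is only a proposal here, so the details are assumptions: in particular, the sketch starts counting at 1 on the first probe with a new status, and the function name is invented.

```python
def current_status_iterations(statuses):
    """Return, for each probe, how many consecutive probes had the same status."""
    counts = []
    previous, count = None, 0
    for status in statuses:
        # Any change of status (even Ok <-> Normal) restarts the count at 1.
        count = count + 1 if status == previous else 1
        previous = status
        counts.append(count)
    return counts


print(current_status_iterations(
    ["Ok", "Ok", "Normal", "Normal", "Normal", "Bad", "Ok"]
))
```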
thomasschmeidl wrote:In the action properties dialogue you provide standard mode and advanced mode, too (although standard mode could be replaced by an advanced mode expression).
By analogy: when the "use Normal" and "use Warning" options are disabled, this is a standard test item.
When the options are enabled, this is an advanced test item.
thomasschmeidl wrote:The problem Juergen and I want to point out is that it is quite easy to thoroughly test a few action profiles, but much more difficult to check the expressions in hundreds of tests.
H'm.. yes, but I am afraid everybody will request more and more checkboxes for their needs instead of using versatile expressions.

What if you specify expressions using global macro variables, assign some simple names like "udv_WarningFor3Failures" and "udv_NormalBelow3GB", and then use a single variable such as %udv_NormalBelow3GB% as the expression?
I think it's a pretty simple solution, and it will be easy to check hundreds of tests.
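The effect of such named expressions can be pictured as a simple text substitution. The Python below is a sketch of the idea, not of how HostMonitor resolves macros: the two names are taken from this post, but the expression bodies assigned to them are invented placeholders, and the expand helper is hypothetical.

```python
import re

# Hypothetical definitions: the names come from the post above,
# the expression bodies are invented for the example.
USER_VARS = {
    "udv_WarningFor3Failures": "(%SSSBR%<4)",
    "udv_NormalBelow3GB": "(%Reply%<3)",
}


def expand(expression, user_vars):
    """Replace each known %name% macro with its stored expression (single pass)."""
    def substitute(match):
        # Unknown macros such as %Reply% are left untouched.
        return user_vars.get(match.group(1), match.group(0))
    return re.sub(r"%(\w+)%", substitute, expression)


print(expand("%udv_NormalBelow3GB%", USER_VARS))
```

Each test then carries only the short name, and the expression itself is defined and reviewed in one place.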
thomasschmeidl wrote:Cheers and Merry Christmas
Merry Christmas and Happy New Year :)

Regards
Alex
JuergenF
Posts: 331
Joined: Sun Jan 26, 2003 6:00 pm
Location: Germany, North Rhine-Westphalia

Post by JuergenF »

Hi Alex,

I can follow a lot of your arguments and I think it's worth trying.

So you are thinking about a new variable, %CurrentStatusIteration%.
I think %CurrentStatusIteration% is not available for the first test with a BAD result?!
So will ("%SuggestedSimpleStatus%" == "DOWN" and %CurrentStatusIteration% < 4) work or not?
Or do we have to use %SuggestedRecurrences%?
Can you please tell us what the following examples from Thomas would look like as expressions?

Obviously there are two main conditions we will use for the new NORMAL and WARNING statuses. 

a) if recurrences of bad results exceed a certain number 
b) if reply exceeds a certain threshold 

Condition a) is useful for most tests (except SNMP trap test, unless recurrences are processed as suggested in http://www.ks-soft.net/cgi-bin/phpBB/viewtopic.php?t=3011) 
Condition b) is mainly useful for tests which produce a numeric reply. 

Examples: 
a)a): CPU test, threshold 90% 
- NORMAL if <4 bad recurrences; 
- WARNING if <10 bad recurrences; 
- BAD if >=10 bad recurrences

a)b): Ping test, threshold 1000ms 
- NORMAL if <=2 bad recurrences; 
- WARNING if (1000ms >= Reply time >200ms) and >2 recurrences;
- BAD if Reply time > 1000ms and >2 recurrences

b)b): Disk Space test threshold 10% 
- NORMAL if 10% < space <20%; 
- WARNING if 10% >= Space >5%. 
- BAD if Space <=5%
Many thanks for your help

Juergen
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

a): CPU test, threshold 90%
- NORMAL if <4 bad recurrences;
- WARNING if <10 bad recurrences;
- BAD if >=10 bad recurrences
- Use Warning if (%SSSBR%>3) and (%SSSBR%<10)
- Use Normal if (%SSSBR%<4)
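These two conditions can be checked against the requested behaviour with a short Python sketch. It assumes the conditions are only evaluated while the test result is bad (so the counter is at least 1) and that Normal takes precedence when both are true; %SSSBR% is still a proposed variable, and the function name is invented.

```python
def cpu_status(sssbr):
    """Status for the CPU example, given the count of consecutive bad results."""
    use_warning = (sssbr > 3) and (sssbr < 10)
    use_normal = sssbr < 4
    if use_normal:
        return "Normal"
    if use_warning:
        return "Warning"
    return "Bad"  # neither option applies, so the bad result stands


for count in (1, 3, 4, 9, 10):
    print(count, cpu_status(count))
```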
b): Ping test, threshold 1000ms
- NORMAL if <=2 bad recurrences;
- WARNING if (1000ms >= Reply time >200ms) and >2 recurrences;
- BAD if Reply time > 1000ms and >2 recurrences
This item raises many questions, especially the Warning condition. What does "recurrences" mean here? The number of times the reply returned a value between 200 and 1000 ms? What status should be used when the response time shows 1100, 300, 300? Bad? Warning? Ok? What about 300, 1100, 300?
I think the only solution in this case is:
- describe all possible reply sequences and status changes
- write your own monitoring program
- store the entire history of the tests in a database
- provide access to the historical data (status, reply, recurrences)
- and write your own algorithm for such test items.
Why? Because such an algorithm cannot be described by one expression.
c): Disk Space test threshold 10%
- NORMAL if 10% < space <20%;
- WARNING if 10% >= Space >5%.
- BAD if Space <=5%
1) set threshold to 5%
2) use Warning if (%Reply%<=10) and (%Reply%>5); use Normal if (%Reply%<20) and (%Reply%>10)
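The resulting bands can be written out as a small Python function to check the boundaries. This is a sketch, not HostMonitor behaviour: it assumes %Reply% is the free space in percent and that the test itself reports a bad result when free space is at or below the 5% threshold; the function name is invented.

```python
def disk_status(free_percent):
    """Status for the Disk Space example with the test threshold set to 5%."""
    if free_percent <= 5:
        return "Bad"      # assumed: the test fails at or below the threshold
    if free_percent <= 10 and free_percent > 5:
        return "Warning"  # use Warning if (%Reply%<=10) and (%Reply%>5)
    if free_percent < 20 and free_percent > 10:
        return "Normal"   # use Normal if (%Reply%<20) and (%Reply%>10)
    return "Ok"


for value in (3, 5, 8, 10, 15, 20):
    print(value, disk_status(value))
```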

Regards
Alex