stesan100
Joined: 18 Jun 2009 Posts: 5
|
Posted: Sat Oct 27, 2012 3:35 pm Post subject: Log Test: count # of bad records |
|
|
In our system, we have known conditions where an error occurs occasionally, and it is expected. We don't want to have the status be Bad for a single occurrence, or the test would fail 100% of the time.
We want the ability to error if we get TOO MANY of those errors during a test interval.
For instance, it would be great if a log test status could only become bad if we have > 5 errors in the log over the test interval. If we have 4 or fewer errors, status remains Good.
The test Reply value would equal the number of times the test string has appeared in the log. |
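The rule requested above can be sketched outside HostMonitor. This is a minimal Python illustration of the desired behaviour only, not a HostMonitor feature: the function name `evaluate_interval`, the `max_errors` parameter, and the sample log lines are all assumptions made for the example, and it counts matching lines (records) rather than every occurrence within a line.

```python
def evaluate_interval(new_lines, pattern="ERROR", max_errors=5):
    """Evaluate the log lines written during one test interval.

    Returns (reply, status):
      reply  -- number of lines that contain the pattern
      status -- "Bad" only when reply exceeds max_errors, otherwise "Good"
    """
    reply = sum(1 for line in new_lines if pattern in line)
    status = "Bad" if reply > max_errors else "Good"
    return reply, status


# Hypothetical sample data: one quiet interval and one noisy interval.
quiet = ["ERROR timeout"] * 4 + ["INFO ok"] * 10
noisy = ["ERROR timeout"] * 6 + ["INFO ok"] * 10

print(evaluate_interval(quiet))  # (4, 'Good')
print(evaluate_interval(noisy))  # (6, 'Bad')
```

The returned count plays the role of the Reply value, so the threshold can be tuned without changing how records are counted.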
|
|
|
KS-Soft Europe
Joined: 16 May 2006 Posts: 2832
|
|
|
|
stesan100
Joined: 18 Jun 2009 Posts: 5
|
Posted: Mon Oct 29, 2012 7:37 pm Post subject: |
|
|
Let me make sure I understand this right with an example:
* I have a Text Log test which searches for the string "ERROR" in the log. This test runs every 5 minutes.
* I set up the conditional "Use Normal status if" and specify
('%FailureIteration%' < 20)
* In one 5-minute span, the word "ERROR" appears 15 times. The test shows normal status.
* In another 5-minute span, the word "ERROR" appears 24 times. The test shows Bad status.
Is this correct? I always assumed FailureIteration meant number of consecutive test failures, not number of bad records in the log. |
|
|
|
KS-Soft Europe
Joined: 16 May 2006 Posts: 2832
|
Posted: Tue Oct 30, 2012 12:10 pm Post subject: |
|
|
Correct, %FailureIteration% means the number of consecutive failed test probes.
There is no variable that holds the number of Bad records in the log.
However, you may use %FailureIteration% as a workaround to count the number of Bad records in the log.
Quote: | * In one 5-minute span, the word "ERROR" appears 15 times. The test shows normal status.
* In another 5-minute span, the word "ERROR" appears 24 times. The test shows Bad status. |
In that case you will need the following configuration:
1. Set the Warn of "all new events" option in the Test Properties dialog for the Text Log test.
2. Set the "Use Normal status if" option with an expression like:
('%FailureIteration%'<20) AND ('%FailureIteration%'>0)
3. Set up an additional "Repeat test" action, select "advanced mode" and provide an expression like the following:
('%SimpleStatus%'=='DOWN') OR ('%Status%'=='Normal')
With these settings you will get:
- if there were 15 Bad records within 5 minutes (the Test Interval), HostMonitor will not set Bad status (Normal and Ok statuses will be used).
- if there were 20 or more Bad records within 5 minutes (the Test Interval), HostMonitor will set Bad status and trigger the assigned actions. |
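One way to read the configuration above is as a counter that advances once per Bad record. The Python sketch below is a simplified model of that reading, not HostMonitor's actual implementation. It assumes that each probe consumes exactly one new Bad record, that the "Repeat test" action re-runs the test while the status is Normal or Bad, and that a probe with no new Bad record returns Ok and resets the counter. The function name `simulate_interval` and its parameters are hypothetical.

```python
def simulate_interval(bad_records, threshold=20):
    """Return the sequence of statuses produced while one interval is processed.

    Model assumptions (not taken from HostMonitor documentation):
      - every probe handles one new Bad record and counts as a failed probe;
      - the test is repeated while the status is Normal or Bad;
      - a probe that finds no new Bad record yields Ok and ends the repeats.
    """
    failure_iteration = 0
    statuses = []
    for _ in range(bad_records):
        failure_iteration += 1  # one more consecutive failed probe
        # Mirrors: ('%FailureIteration%'<20) AND ('%FailureIteration%'>0)
        if 0 < failure_iteration < threshold:
            statuses.append("Normal")  # failure is masked as Normal
        else:
            statuses.append("Bad")     # threshold reached, actions would fire
    statuses.append("Ok")  # no new Bad records left, counter would reset
    return statuses


print("Bad" in simulate_interval(15))      # False
print("Bad" in simulate_interval(24))      # True
print(simulate_interval(24).count("Bad"))  # 5
```

Under these assumptions, 15 Bad records produce only Normal and Ok statuses, while the 20th consecutive record is the first to produce Bad, which matches the outcome described above.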
|
|
|
mp1
Joined: 07 Mar 2006 Posts: 200
|
Posted: Wed Oct 14, 2015 8:58 am Post subject: |
|
|
Hi,
I have the same request and tried your suggestion.
Will the "Repeat test" action always be executed by HostMonitor?
The test itself will be executed by the RMA agent (Linux).
I get the following error: "RMA: Wrong Command"
I need a way to count occurrences of an expression within a log file ... (Linux)
Thanks in advance
Martin |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12807 Location: USA
|
Posted: Wed Oct 14, 2015 9:58 am Post subject: |
|
|
Quote: | Will the "Repeat test" action always be executed by HostMonitor? |
HostMonitor sets the test execution time, so yes, this action tells HostMonitor to schedule test execution.
The test will be executed by the specified agent (HostMonitor or RMA) in any case.
Quote: | The test itself will be executed by the RMA agent (Linux).
I get the following error: "RMA: Wrong Command" |
HostMonitor version?
RMA version?
Test method? Text Log?
Regards
Alex |
|
|
|
mp1
Joined: 07 Mar 2006 Posts: 200
|
Posted: Thu Oct 15, 2015 8:29 am Post subject: |
|
|
I just saw that I had an old Linux agent (1.25).
I have now updated to 1.29 and it basically works, although I still have a problem with the alerting.
I use the "Text Log" test method and want to get an alert when the bad text is written more than 10 times in 5 minutes.
I have this configuration:
Use normal status if: ('%FailureIteration%'<10) AND ('%FailureIteration%'>0)
Alert profile with "Check host again":
('%SimpleStatus%'=='DOWN') OR ('%Status%'=='Normal')
What do I have to select in the Text Log properties?
set "OK" status when no new "bad" records detected
....
Thanks,
Martin |
|
|
|
KS-Soft
Joined: 03 Apr 2002 Posts: 12807 Location: USA
|
Posted: Thu Oct 15, 2015 11:38 am Post subject: |
|
|
Quote: | Alert profile with "Check host again":
('%SimpleStatus%'=='DOWN') OR ('%Status%'=='Normal') |
I would not use this action.
I would set the "warn of all new events" test option.
Quote: | What do I have to select in the Text Log properties? |
I don't know. It depends on exactly what data you want to find in the log.
Regards
Alex |
|