Disable alarms when RMA cannot retrieve information
Hi,
Is there a way to disable alarms when the RMA cannot connect or retrieve the information?
I only want to have an alarm when it's a real alarm.
Is this possible?
//Andreas..
Basically, you may disable the "Treat Unknown status as Bad" option, located in the "Test properties" dialog.
However, we do not recommend this, because a real issue can be missed.
If RMA cannot connect to the target host or cannot retrieve information from time to time, you may modify the action profile to start alerts only when the test returns 2 or 3 consecutive Bad statuses (the "Start when N consecutive Bad/Good results occur" option).
On the other hand, if you get false-positive alerts when HostMonitor cannot connect to RMA (or Active RMA cannot connect to HostMonitor), we recommend using a Master test for all test items performed by RMA.
Please check for details at:
Treat Unknown status as Bad: http://www.ks-soft.net/hostmon.eng/mfra ... knownisbad
Start when N consecutive Bad/Good results occur: http://www.ks-soft.net/hostmon.eng/mfra ... #StartWhen
Master/Dependant test items: http://www.ks-soft.net/hostmon.eng/mfra ... htm#Master
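To illustrate the idea behind "Start when N consecutive Bad/Good results occur", here is a minimal Python sketch. It is not HostMonitor code; the class name `ConsecutiveAlertGate` and its fields are hypothetical, and it assumes the alert should start exactly once, on the N-th Bad result in a row, with any other result resetting the count.

```python
from dataclasses import dataclass


@dataclass
class ConsecutiveAlertGate:
    """Hypothetical helper: start an alert only after `threshold` Bad results in a row."""
    threshold: int = 3
    bad_streak: int = 0

    def observe(self, status: str) -> bool:
        """Record one test result; return True only when the streak reaches the threshold."""
        if status == "Bad":
            self.bad_streak += 1
        else:
            # Any non-Bad result breaks the streak, so a single glitch never alerts.
            self.bad_streak = 0
        return self.bad_streak == self.threshold


gate = ConsecutiveAlertGate(threshold=3)
results = ["Bad", "Good", "Bad", "Bad", "Bad", "Bad"]
fired = [gate.observe(s) for s in results]
print(fired)
```

With this sequence, the isolated first Bad result is ignored and only the third Bad result in a row starts the alert.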
Thanks for your reply!
Most of my alerts (sadly) are what I would call false alerts. That is, the RMA cannot connect to the target for some unknown reason, but the target server is working as it should.
At the moment all of my tests are set to use "(%FailureIteration% > 0) and (%FailureIteration% < 3)" together with the flags "Treat Unknown status as Bad", "Treat Warning status as Bad" and "Use Normal status if".
That works as I want, since a CPU peak at 100% doesn't trigger an alarm unless there are three test warnings in a row.
Every test also has a master test, which is always set to the PING test of the target server. This problem doesn't occur on the PING test, which always answers, only on more difficult tests like "Process" or "CPU Usage".
I also do not want to miss alerts that contain a real error message like "RMA: 301 - Access is denied".
I would like to treat all the "RMA: 301 - " errors as real errors, but they seldom are. If I knew why all my RMAs gave me these errors all the time, I'd be a happy camper.
It feels more as if there is a timeout problem when the RMA tries to communicate with the target.
The most common tests I use that very often give me intermittent RMA errors are "Process" and "CPU Usage". I have checked the systems at the same time, and they shouldn't alert. They give RMA errors for a couple of minutes and then they are fine again. Also, the errors almost always occur at the same time.
Other tests on the exact same target are Good.
I have done some testing in my environment so I can pinpoint my problem and explain it to you more exactly.
My issue does not have anything to do with a specific test. It's only related to the RMA when the agent cannot connect to the HostMonitor server.
Example:
I have an active RMA that monitors 30 servers. Every test has a master test, which is always the PING test on the respective server. All tests are also configured to trigger the alert only when the failure has occurred three consecutive times.
If the server that the RMA is installed on is shut down, or the RMA service is stopped, I immediately get 30 alerts (in my case 30 emails) telling me that the PING test has status "RMA not connected, Unknown".
If none of the tests had a master test set, I would get hundreds of email alerts.
What I would like to have is just ONE alert telling me that this RMA is not connected to the HostMonitor server, or similar. I also want this alert to trigger in a similar way as the tests, that is, when the agent has failed to connect three times in a row over a minimum time period, so it matches all my tests.
As it is now, I get lots of alerts almost immediately when an RMA server loses connection, even if it's just for a couple of seconds (at least that's how I experience it).
I hope I have explained my problem better now?
//Andreas..
Quote from the manual
------------------------------
How to check status of the agent
When you use single agent to perform various tests, you may want to setup Master test to check agent status. In such case HostMonitor may send single alert informing you about disconnected agent and hold dependant test items (so you will receive single alert instead of hundred warnings for all tests performed by this agent).
What test can you use as Master test?
- Passive RMA: It's pretty easy to check Passive RMA status when you do not use Backup RMA. You may use TCP test to check is RMA receives connections on specified TCP port or use Ping test performed by the agent to check localhost (127.0.0.1). Such Ping test will always return "Host is alive" status when successfully authenticated connection to the agent can be established.
If there is backup agent in use, you may setup Ping test performed by Passive RMA using rma itself string as target host name. In such case HostMonitor will check agent status without using backup agent even if such backup agent was specified for selected RMA
- Active RMA: Things a little more tricky when you need to check Active RMA status. Yes, you may setup the same "Ping localhost" test to check agent status. However this may lead to some delay in reaction as HostMonitor will not perform test if agent was connected but lost connection just a moment ago. HostMonitor may wait up to several minutes for new connection before changing test status to "Unknown". Dependant test items will be delayed as well so such delay is not a big problem, you will not receive a lot of alerts. However if for some reason you need to receive alert immediately, there is solution: setup Ping test using Active RMA and type rma itself string instead of localhost or target host name. In such case HostMonitor will display agent status immediately. HostMonitor will not use backup agent (if any) when specified agent is not connected; also HostMonitor will not wait for agent re-connection.
------------------------------
Quote: Or the RMA service is stopped, I immediately get 30 alerts
Immediately? If the connection to the agent is lost, HostMonitor should wait several minutes for a new connection before changing the test status.
Maybe you set up a Backup RMA and use the "if backup agent specified and connection to primary RMA failed: setup Unknown status" option?
If the agent checks many systems, I think a backup RMA installed on a different system can be very useful.
Regards
Alex
Quote: When you use single agent to perform various tests, you may want to setup Master test to check agent status.
Is this doable and reliable with only one RMA? Should I set this up so that ALL my tests using that RMA also get the extra Master test "check service RMA agent"? If I understand correctly, I will test this. Sounds interesting.
Quote: immediately? If connection to agent lost, HostMonitor should wait several minutes for new connection before changing test status.
I just tested this again, and it took 15 seconds from when I stopped the RMA service to get the first email alert. The second one came 5 seconds later. I think this was because that's when the first test using that RMA was scheduled to run, which triggered the alarm.
Quote: May be you setup Backup RMA and use "if backup agent specified and connection to primary RMA failed: setup Unknown status" option? If agent checks many systems, I think backup RMA installed on different system can be very useful.
I'd love to use backup RMAs everywhere, if they were free. Maybe I'll put that on this forum's wishlist, hehe

Thanks,
//Andreas..
Quote: Is this doable and reliable with only one RMA?
A Master test for an Active RMA? Sure, it works fine.
Quote: I just tested this again and it took 15 seconds to get the first email-alert from when I stopped the RMA service. 5 seconds later came the second one. I think this was because thats when the first test using that RMA was scheduled to run, which triggered the alarm.
Could you send your config files to support@ks-soft.net? At least hostmon.ini, agents.lst, actions.lst and the HML file with the tests.
Quote: I´d love to use backup RMAs everywhere, if they were free. Maybe I´ll put that on this forums wishlist, hehe
10 RMA for $250
100 RMA for $900 - it's just $9 per license. Almost free
Regards
Alex
I created a test on the RMA server that performs a service test for the local "ActiveRMAService", without any master test. All PING tests on my 30 servers now have this new test as their master test, and all the other tests have their respective server's PING test as master. Seems to work exactly as I wanted!
Thanks for your help!
//Andreas..
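The three-level hierarchy described above (agent service test, then per-server PING tests, then all other tests) can be sketched in Python to show why only one alert remains when the agent goes down. This is a simplified model, not HostMonitor's implementation: the names `Test`, `alerts_for`, `rma-service`, `ping-srvNN` and `cpu-srvNN` are hypothetical, and the sketch assumes a test is held (no alert) whenever any test in its master chain is failing.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Test:
    name: str
    master: Optional[str] = None  # name of the master test, if any


def alerts_for(tests: Dict[str, Test], failing: Set[str]) -> List[str]:
    """Return failing tests that are not held by a failing master further up the chain."""
    def held(test: Test) -> bool:
        master = test.master
        while master is not None:
            if master in failing:
                return True
            master = tests[master].master
        return False

    return [t.name for t in tests.values() if t.name in failing and not held(t)]


# Hypothetical layout: one agent service test, 30 PING tests, one CPU test per server.
tests: Dict[str, Test] = {"rma-service": Test("rma-service")}
for i in range(1, 31):
    ping = f"ping-srv{i:02d}"
    tests[ping] = Test(ping, master="rma-service")
    tests[f"cpu-srv{i:02d}"] = Test(f"cpu-srv{i:02d}", master=ping)

# Agent down: every test would fail, but only the top-level test alerts.
print(alerts_for(tests, set(tests)))
# One server down: only its PING test alerts, the CPU test is held.
print(alerts_for(tests, {"ping-srv05", "cpu-srv05"}))
```

The design choice is that an alert is reported only at the highest failing level, so a disconnected agent produces one message instead of one per server.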
Quote: Why you did not setup special test designed to check RMA status? (Ping test using special "rma itself" string instead target host name)
I'm not sure what you mean. I have a PING test that the RMA performs on itself, and that is the master for the rest of the tests on the RMA server, except this new service test.
Should I use that PING test as master for the other target servers' PING tests? Is that better than checking for the service?
//Andreas..