too many up pages after multiple bad on one machine

Post by **rbartels** » Wed Mar 16, 2005 1:34 pm

We are having an information overload issue. When a machine goes down with 6 tests, I don't have an issue with knowing about them (for example, if the RMA service fails but ping still works, and everything is dependent on the ping, we get pages).
What I would like is this: if multiple tests fail on ONE machine, can I make it send me only ONE up page when they all check out again, instead of multiple up pages? It's just too much information.
Is there a way to put all the tests on a machine into one test group?

Thanks

Bob

Post by **KS-Soft** » Wed Mar 16, 2005 8:28 pm

I think you need to set up a master test (e.g. a ping test) that checks the general status of the system, and make the other tests dependent on that master test.
In this case you will receive one "bad" message when the server fails and one "good" message when the server returns to operational status.

Regards
Alex

Post by **rbartels** » Thu Mar 17, 2005 6:54 am

I have that in place. The tests were dependent on the ping; however, when the RMA crashes, ping still works.

I added an RMA test dependency and that fixed the issue when RMA crashes.

However, that does nothing to prevent all the GOOD pages when a service starts working.

How can you make a good page dependent on ALL the bad pages changing back to good?

I don't want a good page for each bad one I got per machine; I want ONE good page once they ALL pass again.

Anyone got a clue here?

Thanks

Bob

Post by **KS-Soft** » Thu Mar 17, 2005 1:27 pm

> However, that does nothing to prevent all the GOOD pages when a service starts working.

It should prevent any actions assigned to the dependent tests, unless you have marked the "Synchronize status&alerts" option.

For example, suppose there is a master test TestM that checks the agent, and several dependent tests TestA, TestB and TestC that should be performed only when TestM has "good" status.
When the master test (TestM) fails, HostMonitor executes the action profile assigned to TestM and sets "WaitForMaster" status for TestA, TestB and TestC. HostMonitor will not execute the action profiles used for TestA, TestB and TestC unless you have marked the "Synchronize status&alerts" option.
The same is true when TestM restores "good" status: HostMonitor will execute the action profile assigned to TestM and then perform TestA, TestB and TestC. If these services are alive, no additional actions will be executed (because TestA, TestB and TestC never had "bad" status).
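To make the sequence above concrete, here is a small Python simulation of it. This is only an illustrative sketch, not HostMonitor code: the `Test` class and `run_cycle` function are hypothetical names, the "Synchronize status&alerts" option is assumed to be unmarked, and the rule that a dependent test going straight from "WaitForMaster" to "bad" fires its own alert is an assumption that the description above does not cover.

```python
from dataclasses import dataclass


@dataclass
class Test:
    name: str
    status: str = "good"  # "good", "bad", or "WaitForMaster"


def run_cycle(master, dependants, master_ok, service_ok):
    """Run one check cycle and return the alerts that would be sent.

    master_ok  -- result of the master probe in this cycle
    service_ok -- dict of dependant name -> probe result (used only if master is up)
    """
    alerts = []

    # The master test alerts whenever its own status changes.
    new_status = "good" if master_ok else "bad"
    if new_status != master.status:
        alerts.append(f"{master.name} {new_status}")
    master.status = new_status

    for test in dependants:
        if not master_ok:
            # Dependants are not probed and fire no actions while the master is down.
            test.status = "WaitForMaster"
            continue
        old, new = test.status, ("good" if service_ok[test.name] else "bad")
        # "good" is announced only after a real "bad"; leaving WaitForMaster is silent.
        # Assumption: WaitForMaster -> bad still counts as a new failure.
        if (new == "bad" and old != "bad") or (new == "good" and old == "bad"):
            alerts.append(f"{test.name} {new}")
        test.status = new
    return alerts


master = Test("TestM")
deps = [Test("TestA"), Test("TestB"), Test("TestC")]
all_up = {"TestA": True, "TestB": True, "TestC": True}

print(run_cycle(master, deps, master_ok=False, service_ok=all_up))  # ['TestM bad']
print(run_cycle(master, deps, master_ok=True, service_ok=all_up))   # ['TestM good']
```

Under these assumptions a full outage and recovery produces exactly one "bad" and one "good" message, both from TestM.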

So, I think you are receiving "good" messages because of one of the following:
- TestA, TestB and TestC do not depend on a master test
- you are using the "Synchronize status&alerts" option for the dependent tests
- the services checked by TestA, TestB and TestC fail independently of TestM
- you are using the "Repeat" option, which allows the "good" action to start several times in a row

> How can you make a good page dependent on ALL the bad pages changing back to good?

If you are asking about independent tests, use "advanced" actions with a triggering condition similar to:
('%::TestA::SimpleStatus'=='UP') and ('%::TestB::SimpleStatus'=='UP') and ('%::TestC::SimpleStatus'=='UP') and (%::TestA::Recurrences==1) and (%::TestB::Recurrences==1) and (%::TestC::Recurrences==1)
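The logic of that triggering condition can be sketched in Python to show when it would and would not fire. This is an illustration only: `TestState` and `should_send_single_up_page` are hypothetical names, and the sketch assumes that Recurrences counts consecutive checks with the current status, so a value of 1 means the test has just changed status.

```python
from dataclasses import dataclass


@dataclass
class TestState:
    simple_status: str  # "UP" or "DOWN", standing in for SimpleStatus
    recurrences: int    # assumed: consecutive checks with the current status


def should_send_single_up_page(states):
    """Mirror the condition: every test is UP and every test has Recurrences == 1."""
    return all(
        s.simple_status == "UP" and s.recurrences == 1
        for s in states.values()
    )


# All three tests reported UP for the first time in the same check: page once.
just_recovered = {
    "TestA": TestState("UP", 1),
    "TestB": TestState("UP", 1),
    "TestC": TestState("UP", 1),
}
print(should_send_single_up_page(just_recovered))  # True

# TestC is still down: no "up" page yet.
partly_down = {
    "TestA": TestState("UP", 1),
    "TestB": TestState("UP", 1),
    "TestC": TestState("DOWN", 4),
}
print(should_send_single_up_page(partly_down))  # False

# TestA recovered three checks ago: the Recurrences terms block the page.
staggered = {
    "TestA": TestState("UP", 3),
    "TestB": TestState("UP", 1),
    "TestC": TestState("UP", 1),
}
print(should_send_single_up_page(staggered))  # False
```

Read literally under that assumption, the Recurrences terms keep the page from repeating on later checks, but they also mean the page is sent only when all three tests return to UP in the same check.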

Regards
Alex