Tool to monitor HM / RMA
I wrote a web application in PHP/MySQL along with a PHP-based daemon that checks the database and sends the messages.
It is also proving handy to track backup jobs, database dumps and other scheduled tasks to alert me if they failed to execute when expected.
What I wrote is tailored to my environment but it probably wouldn't take much to re-tool it to be more universally applicable.
We think HostMonitor can monitor everything, including another monitor. However, "some people" still want to have an extra application.
OK, we have included the WatchDog application in the Enterprise package. http://www.ks-soft.net/hostmon.eng/watchdog/index.htm
Beta version so far...
So we no longer need to provide a 2nd HostMonitor license for the purpose of monitoring HostMonitor.
At the same time we added a new test method to HostMonitor: HM Monitor.
http://www.ks-soft.net/hostmon.eng/mfra ... #hmmonitor
It provides many more options than a simple TCP check and can monitor various HostMonitor parameters.
Regards
Alex

I apologise, I wrote that a bit quickly.
It relates to my other post about the problems that arise if HM is restarted or goes down: what happens to the NT Event Log tests, for example, is inconsistent and easy to misunderstand. I am still unsure how to check what period of the event logs the test covered.
For example:
The primary HM goes down and WatchDog starts a new instance on the WatchDog server. It loads the same tests and log files, etc. Will it act as though it was running the whole time, or will it only check NT Event Logs (for example) from the time the backup HM was started?
If HM was 10 minutes away from running an NT Event Log test on a 24-hour schedule, I might only get 10 minutes of the NT Event Log checked for the event instead of the full 24 hours. Ideally, in a failover situation, HM would carry on as if it had been running all along, so that there is no variation in the tests and alerts.
Quote: The primary HM goes down and WatchDog starts a new instance on the WatchDog server. It loads the same tests and log files, etc. Will it act as though it was running the whole time, or will it only check NT Event Logs (for example) from the time the backup HM was started?
It depends on how and when exactly you copy your configuration files from one system to another.
If you just start another instance without any setup, it will not be able to monitor anything. You need to save and copy your configuration files from one system to another. E.g. you may use a scheduled HM Script action with the SaveTestList and StartProgram commands to save the current HostMonitor status (test list) and copy the configuration files from one system to another every 10 min.
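As a rough illustration of the "copy configuration files" half of that suggestion, here is a minimal Python sketch of a helper that a scheduled job could launch after the test list has been saved. Everything in it is an assumption for illustration: the function name `copy_config`, the directory arguments and the file patterns are hypothetical, and it does not reproduce HM Script syntax or HostMonitor's actual file layout.

```python
import shutil
from pathlib import Path


def copy_config(src_dir, dst_dir, patterns):
    """Copy files matching `patterns` from src_dir to dst_dir.

    A file is copied only when it is missing on the destination or the
    source copy has a newer modification time. Returns the names copied.
    The patterns are supplied by the caller; none are assumed here.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for pattern in patterns:
        for source_file in sorted(src.glob(pattern)):
            if not source_file.is_file():
                continue
            target = dst / source_file.name
            is_newer = (
                not target.exists()
                or source_file.stat().st_mtime > target.stat().st_mtime
            )
            if is_newer:
                # copy2 preserves the modification time, so an unchanged
                # file is skipped on the next run.
                shutil.copy2(source_file, target)
                copied.append(source_file.name)
    return copied
```

It could be called as, say, `copy_config(primary_dir, backup_share, ["*.hml"])`, where both paths and the pattern are placeholders to be replaced with whatever your installation actually uses. The copy interval bounds how stale the backup configuration can be, so a 10-minute schedule means the backup may be up to 10 minutes behind.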
Quote: If HM was 10 minutes away from running an NT Event Log test on a 24-hour schedule, I might only get 10 minutes of the NT Event Log checked for the event instead of the full 24 hours.
If we are talking about the NT Event Log test, then HostMonitor does not check "old" events after a restart.
BTW: What is the reason to perform such a check just once a day? I think if some problem happens, the sooner you know the better.
Quote: Ideally, in a failover situation, HM would carry on as if it had been running all along, so that there is no variation in the tests and alerts.
If we are talking about "clustering" monitoring, we have such a task in our "to do" list. On the other hand... I am sure there is such software on the market, but as far as I know it costs about 50-100 times more than HostMonitor. Sure, if we create such a version, we will sell it at a lower price, not $100,000 per license. However, it will be more expensive than our "regular" Enterprise license. If you need a failover monitoring solution at a very low price, I think it's better to use 2 HostMonitors or HostMonitor+WatchDog and some simple custom-made scripts. You may set up such a system (with some minor disadvantages) right now.
Regards
Alex
KS-Soft wrote: You need to save and copy your configuration files from one system to another. E.g. you may use scheduled HM Script action with SaveTestList and StartProgram commands to save current HostMonitor status (test list) and copy configuration files from one system to another every 10 min.
Thanks for the tip, will configure it to do this.
Quote: BTW: What is the reason to perform such check just once a day? I think if some problem happens, the sooner you know the better.
Just an example, but in that case the event will only ever be triggered close to a specific time once a day, so I run the test just after that. I guess this goes with my other post about being able to specify the time range to check in the log; in this case I would just check between 5am and 5:20am, for example. But the example also applies if you test every 10 minutes: if the system fails over 30 seconds before the test is about to run, then there is a 9-10 min window where the logs will not get checked.
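To put numbers on that gap, here is a small Python sketch of the arithmetic. It assumes a simplified model based on the statement above that "old" events are not checked after a restart: the first test after the restart covers only the time since the restart, and everything between the last completed test and the restart goes unchecked. The function name `coverage_after_restart` and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def coverage_after_restart(last_check, restart, next_check):
    """Return (missed, covered) durations for the interval in which a
    restart or failover happened, under the simplified model that only
    events logged after the restart are examined by the next test."""
    if not (last_check <= restart <= next_check):
        raise ValueError("expected last_check <= restart <= next_check")
    missed = restart - last_check    # never examined
    covered = next_check - restart   # examined by the next test
    return missed, covered


# 24-hour schedule, failover 10 minutes before the test is due.
last = datetime(2024, 1, 1, 5, 20)  # hypothetical timestamp
due = last + timedelta(hours=24)
daily = coverage_after_restart(last, due - timedelta(minutes=10), due)

# 10-minute schedule, failover 30 seconds before the test is due.
due_short = last + timedelta(minutes=10)
frequent = coverage_after_restart(
    last, due_short - timedelta(seconds=30), due_short
)

print("daily:    missed", daily[0], "covered", daily[1])
print("frequent: missed", frequent[0], "covered", frequent[1])
```

Under this model the missed window is bounded by the test interval, which is one argument for running the check more often than once a day even when the event is only expected at one time of day.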
Quote: If we are talking about "clustering" monitoring, we have such task in our "to do" list. ... If you need fail-over monitoring solution on very low price, I think its better to use 2 HostMonitors or HostMonitor+WatchDog and some simple custom made scripts. You may setup such system (with some minor disadvantages) right now.
That will be interesting to see, but it might be priced above the level that is profitable for me with my small-business customer monitoring. I think the fail-over will be close to perfect, and if the event log workings are tweaked a little to avoid missing events during a failure, then this will be good enough for me.
Cheers
Mark