Hostmonitor 5.12 is constantly crashing
Hi all,
we have a brand-new installation of HostMonitor 5.12 on a 2-processor/2GB RAM Windows 2003 Standard Server, so there are plenty of resources available. HostMonitor is running a bit more than 1300 tests on machines all over the world. Is 1300 too many for one system? The new "Estimate load..." feature says no. Nevertheless, the HostMonitor service is constantly crashing, with no indication of why in the syslog. The Application Event Log only once in a while shows the following error (not every time HostMonitor crashes):
Faulting application hostmon.exe, version 0.0.0.0, faulting module kernel32.dll, version 5.2.3790.0, fault address 0x000249d3.
Do you have any idea why this might be happening? Any known issues? Is there any logging I can turn up to get more information on this?
Thanks in advance for your help
sista71
I can't answer your main question, but I can tell you that we are running approx. 3,000 tests (19 tests/sec) on a similar 2-CPU, 2GB machine. Our OS is Win 2000 Server. We are also running HM 5.12.
On rare occasions (once every 2-4 weeks), we get "out of memory" dialog boxes popping up. (We are exploring this issue on another thread -- see below.) We are running HM as an application -- my understanding is that you don't see these dialog boxes when HM is running as a service.
Sounds like you may be seeing a similar problem. How frequently is this occurring?
You may also want to read this thread...
http://www.ks-soft.net/cgi-bin/phpBB/viewtopic.php?t=1964&highlight=
If it was only once every 2 weeks...
Hi timn,
first of all, thanks a lot for your reply. We get this failure about every 10-60 minutes. Yes, we are running HostMonitor as a service, since this gives me the advantage that the service can restart automatically whenever it fails. We cannot afford to be without this monitoring for longer than a few minutes.
Regarding the thread you mentioned: we are not running any reports at all, and no ODBC logging (yet). I wanted this to work OK before I turn on ODBC. We have HostMonitor send out mails whenever a test fails.
We do have a file virus scanner running on the system (TrendMicro ServerProtect 5.58). Should we exclude any directories from scanning?
We also have the newest version of Compaq Insight Manager running on the server, but it is not really doing anything yet. This is on hold because of the HostMonitor issues we are experiencing.
My answers to the resource questions:
GDI Objects=229, User Objects=160, Memory=14.664 K, Handles=500, Threads=10
It is hard to get a snapshot of the moment when it fails because it is happening out of the blue.
Hope that clears up some questions for Alex in advance. But, if I understand correctly, the issue in the other thread
http://www.ks-soft.net/cgi-bin/phpBB/vi ... highlight=
is not solved yet, is it?
Regards
sista71
Do you have an antivirus monitor installed, such as Norton Antivirus or McAfee?
The Norton Antivirus monitor often leads to problems like this: a crash without an error message.
An antivirus scanner, by contrast, does not cause any problems.
Quote: But, if I understand correctly, the issue in the other thread
http://www.ks-soft.net/cgi-bin/phpBB/vi ... highlight=
is not solved yet, is it?
It's not solved yet, but I was able to reproduce the resource leakage using James' settings. It looks like some test method works incorrectly under some circumstances... I hope we will find a solution soon.
Regards
Alex
Antivirus Software
Hi Alex,
no, we do not have an antivirus monitor installed, just regular antivirus file scanning. So I guess we can rule this out. Could a memory leak crash HostMonitor every few minutes?
Regards
Silke
Quote: Could a memory leak crash hostmonitor every few minutes?
I don't think so.
Are you using SNMP or Traffic Monitor test methods? Windows 2003 has a bug in mgmtapi.dll that often generates errors and may cause the application to crash. Please read this article: http://www.ks-soft.net/cgi-bin/phpBB/vi ... php?t=1301
Also, could you try to set up HostMonitor on a different system? Just for testing.
Regards
Alex
Hi Alex,
we are not using any SNMP or Traffic Monitor tests at all. We also do not use RMA agents. We are getting ready to put this on another system. What would be your suggestion as to which OS version would be the most stable? 2000 Server or even Workstation?
Here is an overview of the tests we are performing:
Service 644
UNC resources 382
Ping 74
TCP 103
SMTP 73
POP3 28
Count files 18
URL request 2
Total 1324
I've also noticed a lot of Win32 1722 and 1726 errors at times, but I cannot establish a connection between these errors and the crashes. I have extensively checked the network performance, and there are no errors or packet loss whatsoever. Is there a chance HostMonitor might be overloaded by making a lot of RPC calls at the same time?
Regards
Sista71
Yes, I would recommend Windows 2000 SP4 + security patches. The Server edition is probably better optimized for performance, but Workstation works well too.
Quote: I have extensively checked the network performance, and there are no errors or packet loss whatsoever. Is there a chance HostMonitor might be overloaded by making a lot of RPC calls at the same time?
HostMonitor just sends requests to Windows... How often does HM perform the UNC and Service tests? Perhaps you could increase the test intervals and decrease the "Do not start more than N tests per second" option?
Regards
Alex
tested w2k professional SP4
Hi Alex,
we just installed hostmonitor on another machine (w2K professional SP4). I am afraid it has exactly the same problems as the w2003 Server. The frequency of the tests is:
Services 1.1/sec
UNC 24.7/min
I cannot really increase the test interval, since we need it set like that. I've tried several different settings for the "Do not start more than N tests per second" option: I've tried lowering it to 16 and raising it to 60. It makes no difference. I am depressed. Is there any way to turn up logging in HostMonitor to see what the problem is?
Regards
Sista71
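As a rough cross-check of those rates, the average time between two runs of the same test can be estimated by dividing the number of tests of a type by the rate for that type. The sketch below assumes that the quoted rates (1.1 Service tests per second, 24.7 UNC tests per minute) are aggregate figures for all tests of that type, and that the counts from the earlier overview (644 Service, 382 UNC resources) still apply; neither assumption is confirmed in the thread.

```python
def average_interval_seconds(test_count: int, tests_per_second: float) -> float:
    """Average seconds between two runs of the same test.

    Assumes every test of this type runs equally often, so the
    aggregate rate is spread evenly over test_count tests.
    """
    return test_count / tests_per_second


# Counts from the test overview; rates from the post above (assumed aggregate).
service_interval = average_interval_seconds(644, 1.1)
unc_interval = average_interval_seconds(382, 24.7 / 60)  # per minute -> per second

print(f"Service: one run every {service_interval / 60:.1f} min on average")
print(f"UNC:     one run every {unc_interval / 60:.1f} min on average")
```

If these assumptions hold, each individual test runs only once every several minutes, so the per-test intervals themselves are not unusually aggressive.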
Ok, I think I have to give up on this one...
...but that brings up my next question: I have everything correctly configured in 5.12. Now I've installed 4.86 and want to migrate/downgrade all my tests to this version. I have copied all *.lst and *.ini files to the new server, along with the hml and ~hm files. Everything looks very good except for one (very important) thing: the action profiles are not accepted by the older version. I have even tried recreating the action profiles manually and then adding the other *.lst, ini and hml files again. It doesn't work. Please Alex, tell me there is a way that will save me from having to add the action profiles to every single test again?
Thanks
Sista71
Quote: Please Alex, tell me there is a way that will save me from having to add the action profiles to every single test again
If your action profiles do not contain "Record HM log" actions (implemented in version 5.10), you can change the 1st byte of the actions.lst file from 08 to 07; you will then be able to use this file with HostMonitor 4.86.
If your action profiles do contain "Record HM log" actions, you should remove all of these actions, save the profiles, and then modify the 1st byte of the file.
If you do not have a utility for editing binary files, send the actions.lst file to support@ks-soft.net.
When the action profiles are ready for the old version of HostMonitor, copy the HML file with tests from version 5.12, start version 4.86 and load the tests.
If you load the tests before modifying actions.lst, HostMonitor will not be able to keep the "test->actions" links.
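For anyone without a hex editor, the byte change described above could be done with a short script. This is only a sketch under the assumptions stated in this post (first byte 08 in the 5.12 file, 07 expected by 4.86); nothing else about the layout of actions.lst is known here, and the `.bak` backup name is a hypothetical choice. The script refuses to touch a file whose first byte is not 08.

```python
import shutil
from pathlib import Path

OLD_FIRST_BYTE = 0x08  # first byte of the 5.12 file, per the post above
NEW_FIRST_BYTE = 0x07  # value 4.86 is said to accept, per the post above


def downgrade_actions_file(path: str) -> Path:
    """Back up the file, then change its first byte from 0x08 to 0x07.

    Returns the path of the backup copy. Raises ValueError if the file is
    empty or its first byte is not 0x08, so an unexpected file is never
    modified.
    """
    target = Path(path)
    data = target.read_bytes()
    if not data or data[0] != OLD_FIRST_BYTE:
        raise ValueError(f"{target}: first byte is not 0x08, file left untouched")

    # Keep an unmodified copy next to the original (hypothetical naming).
    backup = target.with_name(target.name + ".bak")
    shutil.copy2(target, backup)

    # Rewrite the file with only the first byte changed.
    target.write_bytes(bytes([NEW_FIRST_BYTE]) + data[1:])
    return backup
```

Calling `downgrade_actions_file("actions.lst")` while HostMonitor is stopped would patch the file in place. Any "Record HM log" actions still have to be removed in HostMonitor 5.12 first, as described above.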
Quote: we just installed hostmonitor on another machine (w2K professional SP4). I am afraid it has exactly the same problems as the w2003 Server. I am depressed.
Me too.
Quote: Is there any way to turn up logging on hostmonitor to see what the problem is?
Windows closes HostMonitor without any error message, right? That means HostMonitor does not get any chance to report the problem.
So, no ODBC logging, no antivirus monitors, and it crashes on both Windows 2000 and Windows 2003.
Could you send your settings (all *.LST, *.INI and *.HML files)? HostMonitor cannot successfully perform your tests from our network, but maybe I will be lucky enough to reproduce this problem.
Regards
Alex
Found the culprit!!!
Hi Alex,
I know it's been a while, but I was quite busy. I finally found some time to do some more testing. I disabled one folder at a time and let HostMonitor run for a while, then enabled the folder again and went on to the next one. The crashing stopped when I disabled the "Count all files" and "Count old files" tests we have running on 8 machines. I understand that the "Count all files" tests might be quite a load, since they have to go through about 400 folders altogether containing about 300 files. Are there any time-outs associated with this test? Is there any config file that I can modify? The "Count old files" tests, on the other hand, only have to check one folder that usually has no files in it, and they still make HostMonitor crash all the time. Is there possibly a test setting I could have gotten wrong? I would be thankful for any suggestion, since we do need those tests.
Thanks
Sista71
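One way to check whether the folder traversal itself is slow, independent of HostMonitor, is to time a plain recursive file count against the same share. The following is a minimal sketch; it is not how HostMonitor implements its "Count files" test, and the function name and parameters are made up for illustration.

```python
import os
import time
from typing import Optional, Tuple


def count_files(root: str, older_than_days: Optional[float] = None) -> Tuple[int, float]:
    """Count files under root recursively; return (count, seconds taken).

    If older_than_days is given, only files whose modification time is
    older than that many days are counted.
    """
    cutoff = None if older_than_days is None else time.time() - older_than_days * 86400
    started = time.perf_counter()
    count = 0
    # os.walk skips directories it cannot list, so an unreachable
    # share simply yields a count of 0 instead of raising.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if cutoff is None:
                count += 1
                continue
            try:
                if os.path.getmtime(os.path.join(dirpath, name)) < cutoff:
                    count += 1
            except OSError:
                # File vanished or is unreadable; skip it rather than abort.
                pass
    return count, time.perf_counter() - started
```

Running `count_files(r"\\server\share\folder")` (a hypothetical UNC path) from the HostMonitor machine against each of the 8 systems would show whether any of them takes unusually long to answer, which might help narrow down a problem in the network client.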
H'm, "Count files" test causes error....
We have checked our code and I am 99.999% sure there is no mistake that could cause HostMonitor to crash. Perhaps some bug in the network client?
You are checking a remote system, right? Could you try installing RMA on the remote system and performing these tests using the agent?
Regards
Alex
Hostmonitor WAS crashing on us a lot...
We consistently had problems with HostMonitor hanging every few days with the last couple of releases and couldn't find the cause; however, since we upgraded to 5.38, all those issues have gone away. The app has been running well for two weeks straight with no crashes, memory leaks, etc. We are running Windows 2003 Server with ~2000 tests; load: 4 per second.