Hostmonitor 5.12 is constantly crashing

All questions related to installation, configuration and maintenance of Advanced Host Monitor (including additional tools such as RMA for Windows, RMA Manager, Web Service, RCC).
sista71
Posts: 11
Joined: Wed Aug 06, 2003 6:00 pm

Hostmonitor 5.12 is constantly crashing

Post by sista71 »

Hi all,
we have a brand-new installation of HostMonitor 5.12 on a 2-processor/2GB RAM Windows 2003 Standard Server, so there are plenty of resources available. HostMonitor is running a bit more than 1300 tests on machines all over the world. Is 1300 too many for one system? The new "Estimate load..." feature says no. Still, the HostMonitor service is constantly crashing, without any indication of the cause in the syslog. The Application Event Log only once in a while shows the following error (not every time HostMonitor crashes):
Faulting application hostmon.exe, version 0.0.0.0, faulting module kernel32.dll, version 5.2.3790.0, fault address 0x000249d3.

Do you have any idea why this might be happening? Any known issues? Is there any logging I can turn up to get more information on this?

Thanks in advance for your help
sista71
timn
Posts: 184
Joined: Thu Nov 20, 2003 9:57 am
Location: United States

Post by timn »

I can't answer your main question but I can tell you that we are running approx. 3,000 tests (19 tests/sec) on a similar 2-CPU, 2GB machine. Our OS is Win 2000 Server. We are also running HM 5.12.

On rare occasions (once every 2-4 weeks), we will get "out of memory" dialog boxes popping up. (We are exploring this issue on another thread -- see below.) We are running HM as an application -- my understanding is that you don't see these dialog boxes when HM is running as a service.

Sounds like you may be seeing a similar problem. How frequently is this occurring?

You may also want to read this thread...

http://www.ks-soft.net/cgi-bin/phpBB/viewtopic.php?t=1964&highlight=
sista71
Posts: 11
Joined: Wed Aug 06, 2003 6:00 pm

If it was only once every 2 weeks...

Post by sista71 »

Hi timn,
first of all, thanks a lot for your reply. We get this failure about every 10-60 minutes. Yes, we are running HostMonitor as a service, since this gives me the advantage that the service can restart automatically whenever it fails. We cannot afford to be without this monitoring for longer than a few minutes.
Regarding the thread you mentioned: we are not running any reports at all. Also no ODBC logging (yet). I wanted this to work OK before I turn on ODBC. We have HostMonitor send out mails whenever a test fails.
We do have a file virus scanner running on the system (TrendMicro ServerProtect 5.58). Should we exclude any directories from scanning?
We also have the newest Compaq Insight Manager version running on the server, but it is not really doing anything yet. This is on hold because of the HostMonitor issues we are experiencing.
My answers to the resource questions:
GDI Objects=229, User Objects=160, Memory=14.664 K, Handles=500, Threads=10
It is hard to get a snapshot of the moment when it fails because it is happening out of the blue.
Hope that clears up some questions for Alex in advance. But, if I understand correctly, the issue in the other thread
http://www.ks-soft.net/cgi-bin/phpBB/vi ... highlight=
is not solved yet, is it?

Regards
sista71
KS-Soft
Posts: 12821
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Do you have an antivirus monitor installed, such as Norton Antivirus or McAfee?
The Norton Antivirus monitor often leads to problems like this: a crash without an error message.
An antivirus scanner, on the other hand, does not cause any problems.
But, if I understand correctly, the issue in the other thread
http://www.ks-soft.net/cgi-bin/phpBB/vi ... highlight=
is not solved yet, is it?
It's not solved yet :( But I was able to reproduce the resource leak using James' settings. It looks like some test method works incorrectly under some circumstances... Hope we will find a solution soon.

Regards
Alex
sista71
Posts: 11
Joined: Wed Aug 06, 2003 6:00 pm

Antivirus Software

Post by sista71 »

Hi Alex,
no, we do not have an antivirus monitor installed, just regular antivirus file scanning. So, I guess we can rule this out. Could a memory leak crash HostMonitor every few minutes?

Regards
Silke
KS-Soft
Posts: 12821
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Could a memory leak crash hostmonitor every few minutes
I don't think so.
Are you using SNMP or Traffic Monitor test methods? Windows 2003 has a bug in mgmtapi.dll that often generates errors and may cause the application to crash. Please read this article: http://www.ks-soft.net/cgi-bin/phpBB/vi ... php?t=1301

Also, could you try to set up HostMonitor on a different system? Just for testing.

Regards
Alex
sista71
Posts: 11
Joined: Wed Aug 06, 2003 6:00 pm

Post by sista71 »

Hi Alex,
we are not using any SNMP or Traffic Monitor tests at all. We also do not use RMA agents. We are getting ready to put this on another system. What would be your suggestion as to which OS version would be the most stable? 2000 Server or even Workstation?
Here is an overview of the tests we are performing:
Service 644
UNC resources 382
Ping 74
TCP 103
SMTP 73
POP3 28
Count files 18
URL request 2
Total 1324

I've also noticed a lot of Win32 1722 and 1726 errors at times, but I cannot establish a clear connection between these errors and the crashes. I have checked the network performance extensively and there are no errors or packet loss whatsoever. Is there a chance HostMonitor might be overloaded by making a lot of RPC calls at the same time?

Regards
Sista71
KS-Soft
Posts: 12821
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Yes, I would recommend Windows 2000 SP4 + security patches. The Server edition is probably better optimized for performance, but Workstation works well too.
I have checked the network performance extensively and there are no errors or packet loss whatsoever. Is there a chance HostMonitor might be overloaded by making a lot of RPC calls at the same time?
HostMonitor just sends requests to Windows... How often does HM perform the UNC and Service tests? Perhaps you could increase the test intervals and decrease the "Do not start more than N tests per second" option?

Regards
Alex
sista71
Posts: 11
Joined: Wed Aug 06, 2003 6:00 pm

tested w2k professional SP4

Post by sista71 »

Hi Alex,
we just installed HostMonitor on another machine (W2K Professional SP4). I am afraid it has exactly the same problems as the W2003 Server. The frequency of the tests is:
Services 1.1/sec
UNC 24.7/min
I cannot really increase the test interval, since we need it set like that. I've tried several different settings for the "Do not start more than N tests per second" option: I've tried setting it down to 16 and also up to 60. It makes no difference. :-( I am depressed. Is there any way to turn up logging on HostMonitor to see what the problem is?

Regards
Sista71
sista71
Posts: 11
Joined: Wed Aug 06, 2003 6:00 pm

Ok, I think I have to give up on this one...

Post by sista71 »

...but that brings up my next question: I have everything correctly set up in 5.12. Now I've installed 4.86 and want to migrate/downgrade all my tests to this version. I have copied all *.lst and *.ini files to the new server and also the hml and ~hm files. Everything looks very good except for one (very important) thing: the action profiles are not accepted by the older version. I have even tried recreating the action profiles manually and then adding the other *.lst, ini and hml files again. It doesn't work :-( Please Alex, tell me there is a way that will save me from having to add the action profiles to every single test again?

Thanks
Sista71
KS-Soft
Posts: 12821
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Please Alex, tell me there is a way that will save me from having to add the action profiles to every single test again
If your action profiles do not contain "Record HM log" actions (which were implemented in version 5.10), you may change the 1st byte of the actions.lst file from 08 to 07; you will then be able to use this file with HostMonitor 4.86.

If your action profiles contain "Record HM log" actions, you should remove all these actions, save the profiles and then modify the 1st byte of the file.

If you do not have a utility to edit binary files, send the actions.lst file to support@ks-soft.net.

When the action profiles are ready for the old version of HostMonitor, copy the HML file with tests from version 5.12, start version 4.86 and load the tests.
If you load the tests before modifying actions.lst, HostMonitor will not be able to keep the "test->actions" links.
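
For readers without a hex editor, the byte change described above can be sketched in a few lines of Python. This is only an illustration, not a KS-Soft tool: the function name and the ".bak" backup file are assumptions added here, and it covers only the first-byte change (08 to 07), not the removal of "Record HM log" actions.

```python
import shutil
from pathlib import Path

OLD_VERSION_BYTE = 0x08  # first byte of actions.lst as saved by 5.12, per the post
NEW_VERSION_BYTE = 0x07  # first byte expected by 4.86, per the post


def downgrade_actions_file(path):
    """Change the first byte of the file from 08 to 07, keeping a backup copy.

    Returns True if the byte was changed and False if the file already
    starts with 07. Raises ValueError for an empty file or any other byte.
    """
    path = Path(path)
    data = path.read_bytes()
    if not data:
        raise ValueError(f"{path} is empty")
    if data[0] == NEW_VERSION_BYTE:
        return False  # nothing to do
    if data[0] != OLD_VERSION_BYTE:
        raise ValueError(f"unexpected first byte 0x{data[0]:02x} in {path}")
    # Keep an untouched copy next to the original (the .bak name is our choice).
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_bytes(bytes([NEW_VERSION_BYTE]) + data[1:])
    return True
```

A call such as `downgrade_actions_file("actions.lst")` would presumably be run on a copy of the file while HostMonitor is not running; the sketch refuses to touch a file whose first byte is neither 08 nor 07.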
we just installed hostmonitor on another machine (w2K professional SP4). I am afraid it has exactly the same problems as the w2003 Server. I am depressed.
Me too :(
Is there any way to turn up logging on hostmonitor to see what the problem is?
Windows closes HostMonitor without any error message, right? That means HostMonitor does not get any chance to report the problem :(

So, no ODBC logging, no antivirus monitors, and it crashes on Windows 2000 and Windows 2003.
Could you send your settings (all *.LST, *.INI and *.HML files)? HostMonitor cannot successfully perform your tests from our network, but maybe I will be lucky enough to reproduce this problem.

Regards
Alex
sista71
Posts: 11
Joined: Wed Aug 06, 2003 6:00 pm

Found the culprit!!!

Post by sista71 »

Hi Alex,
I know it's been a while, but I was quite busy. I finally found some time to do some more testing. I disabled one folder at a time and let HostMonitor run for a while, then enabled the folder again and went on to the next one. The crashing stopped when I disabled the "Count all files" and "Count old files" tests we have running on 8 machines. I understand that the "Count all files" tests might be quite a load since they have to go through about 400 folders altogether containing about 300 files. Are there any time-outs associated with this test? Is there any config file that I can modify? The "Count old files" tests, on the other hand, only have to check one folder with usually no files in it, and they still make HostMonitor crash all the time. Is there possibly a test setting I could have gotten wrong? I would be thankful for any suggestion since we do need those tests.
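
To get a rough idea of how long such a scan takes against a remote share, independently of HostMonitor, one could time a recursive file count by hand. The following is a minimal sketch under that assumption; it is not how HostMonitor implements its "Count files" test, and the function name and parameters are made up for illustration.

```python
import os
import time


def count_files(root, older_than_days=None, now=None):
    """Walk root recursively and return (file_count, elapsed_seconds).

    If older_than_days is given, only files whose modification time lies
    more than that many days before `now` (default: current time) count.
    """
    now = time.time() if now is None else now
    cutoff = None if older_than_days is None else now - older_than_days * 86400
    started = time.perf_counter()
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if cutoff is not None:
                try:
                    mtime = os.path.getmtime(os.path.join(dirpath, name))
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                if mtime >= cutoff:
                    continue  # not old enough
            count += 1
    return count, time.perf_counter() - started
```

Running it against a UNC path (for example a hypothetical `\\server\share\folder`) from the monitoring machine shows whether a single pass takes milliseconds or many seconds, which may help when choosing test intervals.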

Thanks
Sista71
KS-Soft
Posts: 12821
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

H'm, so the "Count files" test causes the error....
We have checked our code and I am 99.999% sure there is no mistake that could cause HostMonitor to crash. Perhaps some bug in the network client???
You are checking a remote system, right? Could you try to install RMA on the remote system and perform these tests using the agent?

Regards
Alex
mpriess
Posts: 112
Joined: Tue Jul 02, 2002 6:00 pm
Location: Arizona, USA

Hostmonitor WAS crashing on us a lot...

Post by mpriess »

We consistently had problems with HostMonitor hanging every few days with the last couple of releases and we couldn't find the cause; however, when we upgraded to 5.38 all those issues went away. The app has been running well for two weeks straight with no crash\memory leak\etc. We are running Windows 2003 Server with ~2000 tests; load: 4 per second.

KS-Soft
Posts: 12821
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Good news :)
But.. we did not fix any bugs in version 5.38 :roll: We fixed some possible problems in version 5.34. Did you have problems with version 5.34?

Regards
Alex