KS-Soft. Network Management Solutions

Hostmonitor Crashes after importing 500 tests and log events

 
zendesigner



Joined: 03 Mar 2005
Posts: 40

Posted: Mon May 09, 2005 7:26 am    Post subject: Hostmonitor Crashes after importing 500 tests and log events

Hello Alex,

We're currently testing our HostMonitor setup.

We import many tests from a database using scripting, about 500 of them at once. 100 of them immediately return a bad status, which triggers an event in the event viewer. The rest depend on the first 100.

Just after writing the events to the event log, the HostMonitor service crashes.

I tried versions 5.12 and 5.22 and the behaviour is the same.

Any idea what to do?
KS-Soft



Joined: 03 Apr 2002
Posts: 12794
Location: USA

Posted: Mon May 09, 2005 10:36 am

We cannot reproduce this problem on our systems. Which Windows version do you use? Which service pack? Is an antivirus monitor running?

Regards
Alex
zendesigner



Joined: 03 Mar 2005
Posts: 40

Posted: Tue May 10, 2005 1:34 am

Windows Server 2003 Standard Edition, build 5.2.3790. The problem also occurs with VirusScan (McAfee Enterprise) turned off.

Maybe it is related to the other access violation problem I saw further down in the forum.

I managed to extract an event log entry from the crash:

Event Type: Error
Event Source: hostmon.exe
Event Category: None
Event ID: 0
Date: 10/05/2005
Time: 09:03:09
User: N/A
Computer: SNFW0006
Description:
The description for Event ID ( 0 ) in Source ( hostmon.exe ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: Access violation at address 00401F87 in module 'hostmon.exe'. Read of address 00000002.


Is this the same problem as in this thread?

http://www.ks-soft.net/cgi-bin/phpBB/viewtopic.php?t=1283

If so, I have a problem, as I'm not allowed to copy DLLs over at will (bank network).

Thanks

Bart
zendesigner



Joined: 03 Mar 2005
Posts: 40

Posted: Tue May 10, 2005 2:37 am

I'm testing with the mgmtapi.dll hotfix from Microsoft now.

I'll let you know if it fixes this.
zendesigner



Joined: 03 Mar 2005
Posts: 40

Posted: Tue May 10, 2005 5:32 am

The problem isn't related to the Microsoft hotfix.

It is related to the number of bad events that have to be logged to the event viewer at the same instant. When the tests are imported, they generate a bad alert in the same second they start, because the servers don't exist at that moment. HostMonitor then immediately writes to the event log for each failing test, generating an error in the event viewer with ID 2001 and a certain description string.

This works as long as the number of tests failing together (and thus events written) doesn't exceed about 100. Above that, the service crashes. It doesn't matter whether different tests use different action profiles that log different event errors. It's somehow related to the number of events written to the event log at the same time.

As this puts a rather dark cloud over the complete implementation, we need this resolved. We will be testing about 2500 servers with 7 tests per server, split over 16 HostMonitor machines.

Thanks

Bart
KS-Soft



Joined: 03 Apr 2002
Posts: 12794
Location: USA

Posted: Tue May 10, 2005 8:20 am

Maybe it's not an "Event Log" problem... Do you use ODBC logging as well? Which ODBC driver do you use?
BTW, if you create test items for a non-existent server, maybe you should create a "master" test that checks server availability?

Regards
Alex
KS-Soft



Joined: 03 Apr 2002
Posts: 12794
Location: USA

Posted: Tue May 10, 2005 12:50 pm

After a while I was able to reproduce this problem. It looks like the problem really is related to the "Event Log" action. I am checking our code but it looks fine... I will check the Microsoft Knowledge Base...

Regards
Alex
zendesigner



Joined: 03 Mar 2005
Posts: 40

Posted: Tue May 10, 2005 11:28 pm

Thanks Alex,

It's only related to the event logging feature. I tried having a batch file run for those alerts instead, and it works without problems.

I'm actually only testing our automatic import system from an Oracle database. This database has info on the production and QA systems but not on the servers in my test network. That's why all tests fail: since my import routine already makes them dependent on a master test (wait for master), they all start at once and thus generate all those event log entries.

I can't overlook a problem like this. If, for example, a subnet with 200 servers should fail, that would need to be logged to the event viewer as well. From there, Tivoli picks up the event logs and processes service tickets through them. So if HostMonitor failed at that point, the backup server would take over but would also fail immediately, as it would try to generate the same events in its own event log.

I'm actually using no other logging or alerting at all. HostMonitor will only serve as a heartbeat test function for Tivoli here.

Thanks, I'll search Microsoft's site as well. Maybe it's security related, for example a measure that prevents the event viewer from being flooded with bogus events to erase the previous logging, or something like that.
Marcus



Joined: 18 Nov 2002
Posts: 367

Posted: Wed May 11, 2005 8:16 am

Quote:
From there Tivoli picks up the event logs and processes service tickets through them
If possible / permitted, you could send an SNMP trap (as a workaround).

We send SNMP traps to our management server (HP OpenView) for every status change HostMonitor detects.
KS-Soft



Joined: 03 Apr 2002
Posts: 12794
Location: USA

Posted: Wed May 11, 2005 12:57 pm

I tested HostMonitor all night. It looks like Windows cannot handle many requests to the Event Log at the same time.
- If HostMonitor sends about 30 (or more) requests at the same time (from different threads), the application crashes. It looks like the system DLL advapi32.dll causes the crash.
- I changed HostMonitor's code and tried to send requests one by one but without delay. Again, if HM sends about 30 requests per second, the DLL crashes.
- Then I tried to send only 8 requests per second. That works; HM sent more than 200,000 requests without problems.

We can add a delay procedure to HostMonitor's code for this action method. But if you have thousands of tests and want to record all events in the NT Event Log, you will run into another problem: performance. It looks like Windows cannot handle that many requests.
Are you sure you need the NT Event Log?
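
To put the performance concern in numbers, here is a rough back-of-the-envelope sketch in Python. It assumes that every test on a machine fails in the same second and that events are then written at a fixed, throttled rate. The function name `drain_seconds` is made up for this illustration; the 2500 servers, 7 tests per server and 16 machines are the figures given earlier in the thread, and 8 per second is the rate that ran cleanly in the test above.

```python
import math


def drain_seconds(pending_events: int, events_per_second: float) -> float:
    """Time needed to write a backlog of events at a fixed, throttled rate."""
    if events_per_second <= 0:
        raise ValueError("events_per_second must be positive")
    return pending_events / events_per_second


# Figures from the thread: 2500 servers x 7 tests, spread over 16 machines.
tests_per_machine = math.ceil(2500 * 7 / 16)            # 1094

# Worst case (assumption): every test on one machine fails at once.
at_8_per_second = drain_seconds(tests_per_machine, 8)   # 136.75 seconds
burst_of_200 = drain_seconds(200, 8)                    # 25.0 seconds

print(tests_per_machine, at_8_per_second, burst_of_200)
```

Under these assumptions a full-machine failure would take a little over two minutes to reach the event log at 8 events per second, and a 200-server burst about 25 seconds.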

Regards
Alex
zendesigner



Joined: 03 Mar 2005
Posts: 40

Posted: Thu May 12, 2005 12:01 am

Hi Alex,

Thanks for your testing. I really need the event log: due to the procedures here at the bank, it's a fixed way of doing things and I can't get around it.

I have now implemented a workaround which is much the same as the delay you propose. We have written a VB script that logs the variables from the description to the event viewer as an external program. It certainly increases the load on the server, because a separate instance is opened for each logged event. But those instances are opened one after another, which solves the problem. We use the "external program" action in the action profile to start this script.
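
The one-after-another behaviour of this workaround can be modelled with a short Python sketch. This is not the VB script from the post, which the thread does not show. The script name `write_event.vbs` and both function names are hypothetical, and in the real setup HostMonitor itself launches the script through the "external program" action; the loop only illustrates why serial launches never overlap.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence

# Hypothetical helper script; the thread does not show the actual script.
HELPER_COMMAND = ["cscript", "//Nologo", "write_event.vbs"]


def build_command(description: str) -> List[str]:
    """Command line for one event: helper script plus the event description."""
    return HELPER_COMMAND + [description]


def log_sequentially(
    descriptions: Iterable[str],
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> List[int]:
    """Start one helper process per event, waiting for each to exit.

    subprocess.call blocks until the child process has finished, so at most
    one helper process writes to the event log at any moment.
    """
    return [run(build_command(text)) for text in descriptions]
```

The `run` parameter exists only so the loop can be exercised without Windows, for example by passing a function that records the command and returns 0.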

As I'm pressed hard for time to start kitting HostMonitor for installation and testing, this is an acceptable workaround for me for now. The bank puts less emphasis on cheap hardware than on applications being secure and stable, so we'll just add some extra capacity to the servers.

Whether to include the adjustment in a later version is for you to decide. I'll start kitting as is now; maybe we can change it later if we move to a new version of the kit in a second phase, in a year or so.

Thanks a lot for the help. With this last hurdle out of the way we can start rolling out HostMonitor, and you should receive a large order in a few weeks.

Bart
KS-Soft



Joined: 03 Apr 2002
Posts: 12794
Location: USA

Posted: Thu May 12, 2005 2:16 pm

OK, an update is available at www.ks-soft.net/download/hm529b.zip
HostMonitor now sends RecordEvent requests one by one, with a 50 ms delay between actions. It looks like Windows can handle that rate of requests (at least it works fine on our systems).
I recommend installing version 5.28 Beta before applying this update.
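
The "one by one with a delay" idea can be sketched in Python as a single worker thread fed by a queue. This is an illustration of the general technique, not HostMonitor's actual code, which the thread does not show. The class name `SerializedEventWriter` and the `write` callback are assumptions; the callback stands in for whatever call really records the event.

```python
import queue
import threading
import time
from typing import Callable


class SerializedEventWriter:
    """Funnels event writes from many threads through one worker thread."""

    def __init__(self, write: Callable[[str], None], delay_seconds: float = 0.05):
        self._write = write            # stand-in for the real event log call
        self._delay = delay_seconds    # 50 ms, the pause mentioned in the post
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, message: str) -> None:
        """Called from any test thread; returns immediately."""
        self._queue.put(message)

    def close(self) -> None:
        """Write everything still queued, then stop the worker."""
        self._queue.put(None)          # sentinel: no more messages
        self._worker.join()

    def _run(self) -> None:
        while True:
            message = self._queue.get()
            if message is None:
                break
            self._write(message)
            time.sleep(self._delay)    # pause between consecutive writes


# Usage: a plain list collects the messages in place of the event log.
written = []
writer = SerializedEventWriter(written.append)
for i in range(5):
    writer.submit(f"test {i} failed")
writer.close()
print(written)
```

A 50 ms pause caps the rate at 20 writes per second, which sits between the 8 per second that ran cleanly and the roughly 30 per second that crashed in the tests reported earlier in the thread.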

Regards
Alex
All times are GMT - 6 Hours