HostMonitor freezing

All questions related to installation, configuration and maintenance of Advanced Host Monitor (including additional tools such as RMA for Windows, RMA Manager, Web Service, RCC).
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

PS Just to clarify: are you working in the same team (losisoft and hsq) with the same instance of HostMonitor? Or are we perhaps looking at two different problems on different systems?

Regards
Alex
greyhat64
Posts: 246
Joined: Fri Mar 14, 2008 9:10 am
Location: USA

Post by greyhat64 »

Well, mos-eisley gave his two cents, I'm gonna toss in mine.

If this were my issue I'd start a divide and conquer approach. Using another box I'd install a fresh 7.5 load, export sections of my testing to that box, and validate, adding additional tests until it breaks. If the test box won't handle the full load, swap exports in/out as needed.
If it never breaks then the focus should be your 'new' environment.
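
The divide-and-conquer idea above can be sketched roughly as follows. This is only an illustration, not a HostMonitor feature: `freezes` is a hypothetical stand-in for "import this subset of tests on the test box and watch whether it hangs", and the sketch assumes a single test triggers the freeze reliably, which may not hold for a random hang.

```python
from typing import Callable, Optional, Sequence


def find_culprit(
    tests: Sequence[str],
    freezes: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Bisect `tests` down to the one item for which `freezes` returns True.

    Assumption: exactly one test causes the freeze, and it does so every time.
    """
    if not tests or not freezes(tests):
        # The full set never breaks: focus on the environment instead.
        return None
    candidates = list(tests)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first = candidates[:half]
        # Keep whichever half still reproduces the freeze.
        candidates = first if freezes(first) else candidates[half:]
    return candidates[0]
```

With an intermittent freeze, each `freezes` check would in practice mean running the subset long enough to be reasonably sure it is clean.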

Meanwhile, depending on your test list, you could start playing with some of the 'performance' options. For instance, the [Options]\Behavior\ selection for "Don't start more than {xx} tests per second" as well as other specific limits that can be applied, such as the SNMP traps and Performance Counter tests limits on the \Miscellaneous\ tab. If they don't fix the issue they may reveal something if the frequency of freezing changes.

Other thoughts:
I know you've checked and rechecked, but ODBC is still a common suspect.
Could the 'freeze' be associated with an action taken instead of a test performed? The reason for the randomness may be due to an automated action kicking in. Off the top of my head, I'm not sure how you would debug that, but it's something to think about. Maybe check your logs to see if a failure time syncs with the freeze.
Are you moving from a single core to a multi-core processor? Could it be an issue with affinity? You could try forcing processor affinity.
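
For the log check mentioned above, a rough sketch of lining up action times against freeze times. The data layout is an assumption for illustration: `actions` is a list of `(timestamp, action name)` pairs you would have to extract from your own logs, and `freezes` is a list of times at which you noticed a hang. It does not parse any real HostMonitor log format.

```python
from datetime import datetime, timedelta


def actions_near_freezes(actions, freezes, window=timedelta(minutes=2)):
    """Map each freeze time to the actions that ran within `window` before it."""
    result = {}
    for freeze in freezes:
        result[freeze] = [
            name
            for when, name in actions
            # Only actions at or before the freeze, and no older than `window`.
            if timedelta(0) <= freeze - when <= window
        ]
    return result
```

If the same action name keeps showing up just before a freeze, that action is worth disabling as a test.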

Alex,
This is likely a moot point and I'm sure this has been tested in multi-core environments, but to satisfy my own curiosity, do you programmatically select a core (a crap shoot I know :( )? And what about multi-threaded tasks, scripts and external calls - are they 'tied' to the application's core or allowed to run independently and auto-select?
Also, could there be a 'bug' associated with processor make (Intel, AMD) or model? (In case they've changed processor platforms)
I only ask these things because, from the sound of it, the only new thing is the hardware involved.
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

greyhat64 wrote: This is likely a moot point and I'm sure this has been tested in multi-core environments
Yes, we test HostMonitor on such systems.
greyhat64 wrote: do you programmatically select a core
No, HostMonitor does not use any special procedures to manage threads. Windows does this.
greyhat64 wrote: what about multi-threaded tasks, scripts and external calls - are they 'tied' to the application's core or allowed to run independently and auto-select
HostMonitor decides which test should be performed and starts a thread. Each test is performed by a separate thread that returns its results to the main thread.
greyhat64 wrote: Also, could there be a 'bug' associated with processor make (Intel, AMD) or model? (In case they've changed processor platforms)
Everything is possible in this world :roll: About 4-5 years ago several customers experienced a problem with one model of Intel CPU. Simple arithmetic operations on this CPU could return a different result than on other similar 32-bit Intel CPUs :o
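
To illustrate the general pattern described above (one thread per test, results handed back to the main thread), here is a minimal Python sketch. It is not HostMonitor's actual code; `run_tests` and the test callables are made up for the example.

```python
import queue
import threading


def run_tests(tests):
    """Run each test callable in its own thread and collect results in the caller.

    `tests` maps a test name to a zero-argument callable (hypothetical checks).
    """
    results = queue.Queue()

    def worker(name, check):
        try:
            results.put((name, check()))
        except Exception as exc:
            # A failing test reports an error instead of killing its thread silently.
            results.put((name, f"error: {exc}"))

    threads = [
        threading.Thread(target=worker, args=(name, check))
        for name, check in tests.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # the "main thread" waits for every test thread

    collected = {}
    while not results.empty():
        name, outcome = results.get()
        collected[name] = outcome
    return collected
```

In this sketch a test that never returns (for example a call stuck inside a driver) would keep `join()` waiting forever, so a real scheduler would need some kind of timeout.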

Regards
Alex
losisoft
Posts: 43
Joined: Fri Mar 21, 2008 4:02 am

Post by losisoft »

KS-Soft wrote: PS Just to clarify: are you working in the same team (losisoft and hsq) with the same instance of HostMonitor?
Yes, we are colleagues. :)

KS-Soft wrote: :-?
So, we are back to the beginning? :(
What about ODBC and antivirus monitor?
We are currently moving all the ODBC queries that were running from HostMonitor to agents. That should tell us whether it's ODBC related or not.
hsq
Posts: 15
Joined: Thu Jun 28, 2007 8:23 am

Post by hsq »

Hi Alex,

I much appreciate all your efforts.

To narrow down the problem, I am trying to push all our ODBC tests down to the passive agents so that HostMonitor itself does not execute any of them.

If the ODBC driver has a problem/bug, does this step help?

By my logic it can serve as a "workaround", as the process will not use the local Oracle client. In the worst case it ends in an RMA freeze but not in a stuck HostMonitor process.

All the local antivirus components have been stopped for 3 days, so they are not influencing this at all.

Regards,
Gabor
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

hsq wrote: To narrow down the problem, I am trying to push all our ODBC tests down to the passive agents so that HostMonitor itself does not execute any of them. If the ODBC driver has a problem/bug, does this step help?
If you are still using ODBC logging, then this will not help.
If you cannot disable ODBC logging, could you start another "test" instance of HostMonitor without ODBC logging?
E.g. could you do the following:
- copy the entire HostMonitor folder (e.g. "c:\program files\hostmonitor" -> "c:\test\hostmonitor")
- start HostMonitor (the copy) using the "hostmon.exe /stop" command line parameter
- disable logging (Options dialog)
- disable actions (menu Monitoring -> Disable)
- load the copied HML file (e.g. c:\test\hostmonitor\main.hml)
- start monitoring
As a result you will have 2 instances of HostMonitor: a production copy and a testing copy that can be modified at any time without problems.
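
The first two steps above could be scripted roughly like this. It is only a sketch: `prepare_test_copy` is a made-up helper, the paths are whatever you pass in, and it merely builds the "hostmon.exe /stop" command line without launching anything. Disabling logging and actions and loading the HML file are still done by hand in the copy's GUI.

```python
import shutil
from pathlib import Path


def prepare_test_copy(production_dir, test_dir):
    """Copy the HostMonitor folder and return the command that starts the copy stopped."""
    src, dst = Path(production_dir), Path(test_dir)
    # copytree raises an error if dst already exists, so an earlier copy is never overwritten.
    shutil.copytree(src, dst)
    # "/stop" is the command line parameter mentioned in the steps above.
    return [str(dst / "hostmon.exe"), "/stop"]
```

The returned list can be passed to `subprocess.run` on the monitoring machine once you have checked the copy.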
hsq wrote: By my logic it can serve as a "workaround", as the process will not use the local Oracle client. In the worst case it ends in an RMA freeze but not in a stuck HostMonitor process.
Yes, RMA is very helpful for narrowing down a problem when it is caused by some test method. It's pretty easy to change the "test by" property for a set of tests.

Regards
Alex
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Do you have Microsoft Windows Search 4.0 installed?
Maybe it leads to some problems... we are not sure yet.

Regards
Alex
losisoft
Posts: 43
Joined: Fri Mar 21, 2008 4:02 am

Post by losisoft »

KS-Soft wrote: Do you have Microsoft Windows Search 4.0 installed?
Maybe it leads to some problems... we are not sure yet.

Regards
Alex
Hi Alex,

No, we don't use Windows search.

We have changed how we log into MS-SQL, and that stabilized the system more or less. We still see some hangs, but only about 1-2 times a day.

Regards,
Jozsef
losisoft
Posts: 43
Joined: Fri Mar 21, 2008 4:02 am

Post by losisoft »

We have narrowed down the problem to the Oracle 10g client. Even with the latest release, we still had the problem.
We downgraded the client to Oracle 9, and the problem has disappeared. :D
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

The ODBC driver was changed as well, right?
ODBC drivers often lead to problems :(
We plan to implement a module for direct connection (HostMonitor -> SQL client -> SQL Server) in version 8. It will probably support Oracle, MS SQL and MySQL servers.

Regards
Alex
greyhat64
Posts: 246
Joined: Fri Mar 14, 2008 9:10 am
Location: USA

Post by greyhat64 »

DIRECT CONNECTION!
Alex, you just succeeded in making me drool all over myself :lol:
What's the expected release date for v8?
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

I hope we will release version 7.70 Beta this month. Then several weeks until the release, then version 7.80 and maybe 7.90.
I think we will start version 8 development in January and release it no earlier than March 2009.

Regards
Alex
losisoft
Posts: 43
Joined: Fri Mar 21, 2008 4:02 am

Post by losisoft »

KS-Soft wrote: ODBC driver was changed as well, right?
ODBC drivers often lead to problems :(
Yes Alex, replacing the Oracle client replaced the ODBC driver as well.