Service Tests Failing in HostMonitor 8.86

jaustin
Posts: 10
Joined: Thu Aug 25, 2011 2:21 pm

Post by jaustin »

We have several service monitors set up to watch/restart services on Exchange/Lync servers. The services themselves are fine and almost never go down. However, HostMon often reports outage errors with the status of "Unknown" on these servers and alerts us. When we check the services, they're running fine.

If I open the service alert in HostMon and refresh the test, it stays in an Unknown status. If I change the domain user account to another domain user account, and run the test again, it succeeds. If I change it back to the original account, it succeeds again, but will fail again as "unknown" later after some time has passed.

Sometimes the reply column shows "Win32 error #5", which I believe indicates an access problem, but the same account can access the service just fine once I change the account and then change it back. Sometimes the reply column doesn't show that error and merely says "Unknown" (I believe).

Does anyone have any thoughts on this?
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

Windows error code #5 means "Access is denied".
That means HostMonitor is able to connect to the remote system;
however, it cannot retrieve the necessary information from the target system.

Please check the Windows Event Log on the remote system for failed logon events at the time the problem occurs.

Also, some services are unstable (sometimes they respond to commands, sometimes they do not). You may set up HostMonitor to start actions only after the 2nd or 3rd consecutive "bad" test result. Use the "Start when 2 consecutive Bad results occur" option:
http://www.ks-soft.net/hostmon.eng/mfra ... #StartWhen

When the "Start when N consecutive Bad results occur" option is enabled, we recommend using the "Action depends on Bad one" option for the "good" action assigned to the same profile:
http://www.ks-soft.net/hostmon.eng/mfra ... pendsOnBad
Quote from the manual
==========================
Action depends on "bad" one
This optional parameter is available for "Good" actions only. You can set "Good" action dependable on a "Bad" action. Why do you need it? For example you defined "Bad" action to send an e-mail notification to the network administrator when test fails 3 times consecutively (start when 3 consecutive "Bad" results occur), also you defined "Good" action to send a notification when the test status changes to "Good". What will happen if test fails 1 or 2 times and after this it restores "Good" status? HostMonitor will not send a notification about failure (because test did not fail 3 times) but the program will send notification about restoring "Good" status. To avoid unnecessary "Good" action execution you can mark "Action depends on "bad" one" option and select "Bad" action. In this case HostMonitor will start "Good" action only if corresponding "Bad" action was executed.
===========================
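
For readers following along, the interplay of the two options quoted above can be modeled in a few lines of Python. This is only a rough sketch and not HostMonitor's code: the `AlertProfile` class, its fields, and the `run` helper are hypothetical names, and the model assumes every test result is simply "Good" or "Bad".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AlertProfile:
    """Hypothetical model of one action profile (not HostMonitor's real code)."""
    bad_threshold: int = 3            # "Start when N consecutive Bad results occur"
    good_depends_on_bad: bool = True  # "Action depends on "bad" one"
    consecutive_bad: int = 0
    bad_action_fired: bool = False
    log: List[str] = field(default_factory=list)

    def record(self, status: str) -> None:
        """Feed one test result ("Good" or "Bad") and log any triggered action."""
        if status == "Bad":
            self.consecutive_bad += 1
            # Fire the "Bad" action once, on the Nth consecutive failure.
            if self.consecutive_bad == self.bad_threshold:
                self.bad_action_fired = True
                self.log.append("bad-action")
        else:
            recovered = self.consecutive_bad > 0
            # Without the dependency, any Bad -> Good transition notifies.
            # With it, notify only if the "Bad" action actually ran.
            if recovered and (self.bad_action_fired or not self.good_depends_on_bad):
                self.log.append("good-action")
            self.consecutive_bad = 0
            self.bad_action_fired = False


def run(results, **options):
    """Replay a sequence of results and return the actions that were triggered."""
    profile = AlertProfile(**options)
    for status in results:
        profile.record(status)
    return profile.log
```

In this model, replaying `["Bad", "Bad", "Good"]` with a threshold of 3 triggers nothing when the dependency is enabled, but triggers a lone "good-action" when it is disabled, which is the unnecessary notification the manual describes.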
jaustin
Posts: 10
Joined: Thu Aug 25, 2011 2:21 pm

Post by jaustin »

Thank you for the response.

There are no logon/authentication errors logged on the target system when this happens. This does not seem to be a problem with the target system; it is just an inconsistency on the part of HostMonitor, as far as we can tell.

Setting up the alerts to only start when N consecutive bad results occur sounds like a possible workaround, but it does not solve the problem. We have actually tried this approach. HostMon will continue to show an "Unknown" status (and thus generate an alert even after several checks) until we manually change the "Connect As" account and then change it back. I hope that makes sense.
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

Maybe not all of the necessary audit policies are enabled?
Could you check whether the "Audit account logon events" and "Audit logon events" policies are enabled and set to "Success and Failure"?

Also, it could be a firewall issue.
Is there any firewall or antivirus monitor installed between HostMonitor and the remote server?
Could you try disabling it?

If the problem appears on a single system, you may install an RMA agent as a workaround for any Windows-related problems:
http://www.ks-soft.net/hostmon.eng/rma-win/index.htm
jaustin
Posts: 10
Joined: Thu Aug 25, 2011 2:21 pm

Post by jaustin »

KS-Soft Europe wrote: Maybe not all of the necessary audit policies are enabled?
Could you check whether the "Audit account logon events" and "Audit logon events" policies are enabled and set to "Success and Failure"?

"Audit account logon events" is set to Failure.
"Audit logon events" is set to Success, Failure.

KS-Soft Europe wrote: Also, it could be a firewall issue.
Is there any firewall or antivirus monitor installed between HostMonitor and the remote server?
Could you try disabling it?

There is no firewall between the systems. All systems are behind our corporate firewall.

KS-Soft Europe wrote: If the problem appears on a single system, you may install an RMA agent as a workaround for any Windows-related problems:
http://www.ks-soft.net/hostmon.eng/rma-win/index.htm

The problem has occurred with multiple services across multiple systems, some more frequently than others. Unless I'm mistaken, it appears that RMA is another product I would have to purchase, and I'm not prepared to do that as HostMon should work as-is.
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

jaustin wrote: The problem has occurred with multiple services across multiple systems, some more frequently than others. Unless I'm mistaken, it appears that RMA is another product I would have to purchase, and I'm not prepared to do that as HostMon should work as-is.

RMA is part of the Advanced Host Monitor package; it is included in the Professional and Enterprise licenses:
http://www.ks-soft.net/hostmon.eng/regmon.htm#newprice

jaustin wrote: The problem has occurred with multiple services across multiple systems, some more frequently than others.

In that case a single agent will not help you. Anyway, it's better to find the cause of the problem, but this can be hard...
I don't think this is HostMonitor's fault. We and some customers have seen such "Access denied" errors for years, always caused by Windows or some 3rd-party software.

Strange that you do not see any messages in the event log. :roll: You have a domain, not a workgroup? Have you checked the domain controller's event log?
Which Windows versions are installed on the HostMonitor system and the remote systems? Which Service Pack? Any antivirus monitors?

Regards
Alex
jaustin
Posts: 10
Joined: Thu Aug 25, 2011 2:21 pm

Post by jaustin »

I'm not sure I've been clear. When this error/alert happens, the service is fine. It can be reached from the HostMon server fine. There are no access issues. However, I can refresh the alert over and over and it continues to return as "Unknown".

It stays in the "Unknown" state until I change the user account in HostMon and refresh the test. Then it returns to "Ok". I change the user back to what it was before and refresh again, and it still says "Ok", even though that same account was returning "Unknown" over and over again previously.
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

jaustin wrote: When this error/alert happens, the service is fine. It can be reached from the HostMon server fine. There are no access issues. However, I can refresh the alert over and over and it continues to return as "Unknown"

That's because HostMonitor cannot reach the target system.

jaustin wrote: It stays in the "Unknown" state until I change the user account in HostMon and refresh the test. Then it returns to "Ok". I change the user back to what it was before and refresh again, and it still says "Ok", even though that same account was returning "Unknown" over and over again previously.

That's because the connection is already established. Once a connection is established, you may use any account, or no account at all, and the test will work.

The question is why the 1st account does not work, and why you do not see any records in the event log.
Also, if the 2nd account works fine, why don't you use it?
You have a domain, not a workgroup? Have you checked the domain controller's event log?
Which Windows versions are installed on the HostMonitor system and the remote systems? Which Service Pack? Any antivirus monitors?
Which option exactly do you use to specify the account: the "Connect as" test property or Connection Manager?

Regards
Alex
jaustin
Posts: 10
Joined: Thu Aug 25, 2011 2:21 pm

Post by jaustin »

KS-Soft wrote: That's because HostMonitor cannot reach the target system.

That's incorrect; it can reach the target system.

KS-Soft wrote: That's because the connection is already established. Once a connection is established, you may use any account, or no account at all, and the test will work.

No, if I change back to the account that wasn't previously working, it says "Ok" and continues to say "Ok" no matter how many times I refresh it.

KS-Soft wrote: The question is why the 1st account does not work, and why you do not see any records in the event log.

Because it does not seem to be a problem with Windows, and therefore would not be reflected in the target machine's log.

KS-Soft wrote: Also, if the 2nd account works fine, why don't you use it?

Because it's not a problem with the account. Whichever account I use, it eventually reaches this state. The actual account being used seems to be largely irrelevant. It is only fixed by switching to any second working account. Then I can switch to any other account, or back to the original account, and it continues to work.

KS-Soft wrote: You have a domain, not a workgroup? Have you checked the domain controller's event log?

We use a domain. There are no errors for this on the DC event log either.

KS-Soft wrote: Which Windows versions are installed on the HostMonitor system and the remote systems? Which Service Pack?

The HostMonitor system is on Windows Server 2003 Standard SP2.
The target system in this case is running Windows Server 2008 R2 Enterprise.

KS-Soft wrote: Any antivirus monitors?

VIPRE Antivirus

KS-Soft wrote: Which option exactly do you use to specify the account: the "Connect as" test property or Connection Manager?

I have tried both options.

Note that I seem to be able to reproduce this at will by restarting the HostMonitor server. At first all tests show fine. But if I then refresh the test, that one fails again until I change the account, refresh, and change the account back, and refresh again.
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

KS-Soft wrote: That's because HostMonitor cannot reach the target system.
jaustin wrote: That's incorrect; it can reach the target system.

No, it cannot reach the target system, because the Windows function OpenSCManager fails with error code #5, which means "Access denied".
If HostMonitor could connect to the Service Manager, it would set the status to Ok or Bad (depending on the reply from the monitored service).
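
To make that distinction concrete, here is a minimal Python sketch of the decision as described in this post. It is an illustration only, not the actual HostMonitor code: `service_test_status` and its parameters are hypothetical names, and the only fact taken from Windows itself is that error code 5 is `ERROR_ACCESS_DENIED`.

```python
ERROR_SUCCESS = 0
ERROR_ACCESS_DENIED = 5  # Win32 error #5, "Access is denied"


def service_test_status(open_scm_error: int, service_running: bool = False) -> str:
    """Hypothetical mapping from the connection outcome to a test status.

    open_scm_error: the error code observed after trying to open the
    Service Manager (0 means the call succeeded).
    service_running: the state reported by the service, only meaningful
    when the Service Manager could be opened.
    """
    if open_scm_error != ERROR_SUCCESS:
        # The service state was never read, so neither Ok nor Bad can be claimed.
        return f"Unknown (Win32 error #{open_scm_error})"
    return "Ok" if service_running else "Bad"
```

Under this reading, `service_test_status(ERROR_ACCESS_DENIED)` yields an "Unknown" status even if the service itself is running, because the check never got far enough to ask.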
KS-Soft wrote: That's because the connection is already established. Once a connection is established, you may use any account, or no account at all, and the test will work.
jaustin wrote: No, if I change back to the account that wasn't previously working, it says "Ok" and continues to say "Ok" no matter how many times I refresh it.

Correct. That's because the connection is already established. Once a connection is established, you may use any account, or no account at all, and the test will work.

jaustin wrote: Because it does not seem to be a problem with Windows, and therefore would not be reflected in the target machine's log.

If it's not a Windows problem and it's not a HostMonitor problem, what is the cause of this problem?
The HostMonitor code for the Service test is very simple; it has not been changed for years and works fine everywhere. I am sure there is no mistake in this code; however, we will re-check it today.

If Windows is out of resources, this may lead to various strange effects. Could you check the resource usage of each process? You may use the standard Windows Task Manager to check Handles, GDI and USER objects. What is the total resource usage on the system? How many handles/threads/GDI objects are used by the hostmon.exe process?

Also, could you try disabling your antivirus? We are not familiar with VIPRE Antivirus, but Symantec/McAfee/NOD32 antiviruses cause various problems fairly often.

KS-Soft wrote: Which option exactly do you use to specify the account: the "Connect as" test property or Connection Manager?
jaustin wrote: I have tried both options.

HostMonitor uses the same Windows API in both cases, but Connection Manager offers 2 extra options:
- Log failed attempts
- Reconnect if necessary
Could you enable logging and check the log?
Could you set the "Reconnect if necessary" option?

Regards
Alex
MikaelK
Posts: 6
Joined: Wed Jan 02, 2008 9:21 am

Post by MikaelK »

I would like to jump in here...

We also experience the same periodic "Unknown" status on service tests against remote Windows servers (in remote ADs, with accounts configured in Connection Manager).

As jaustin has experienced, the status suddenly changes from OK to Unknown with "Access denied"...
Sometimes the status changes back to OK by itself after several minutes/tests. Sometimes we have to reboot the server where HostMon resides... (does that reset the login mechanism in HostMon??)

Moreover, we are having more and more difficulty getting new service tests to work correctly, especially against Win2008R2: we continuously get error #5 while seeing successful logons in the security log on the target servers...?!

Just like jaustin, we tend to see this as an issue with HostMon rather than an issue with the credentials used...

Regards
Mikael K.
jaustin
Posts: 10
Joined: Thu Aug 25, 2011 2:21 pm

Post by jaustin »

KS-Soft wrote: HostMonitor uses the same Windows API in both cases but Connection Manager offers 2 extra options:
- Log failed attempts
- Reconnect if necessary
Could you enable logging and check the log?
Could you set "Reconnect if necessary" option?

Regards
Alex

I will try setting this up and monitor it.
jaustin
Posts: 10
Joined: Thu Aug 25, 2011 2:21 pm

Post by jaustin »

Unfortunately that didn't accomplish anything. I restart the server, and the test shows up fine. I don't change anything, refresh the test, and it changes to Unknown. No log file is created (I assume because it didn't technically "fail").
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

MikaelK wrote: Just like jaustin, we tend to see this as an issue with HostMon rather than an issue with the credentials used...

We checked our code once more; it looks fine.
Also, I do not think this is a credentials/account-related problem.
We think there is some bug outside of HostMonitor. We are trying to reproduce the problem using Windows 2003 and Windows 2008 systems...

Regards
Alex
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

We are testing several Windows 2003 and Windows 2008 systems, and so far we cannot reproduce this problem. BTW: we always test new versions of HostMonitor on these systems, plus Windows XP, Windows Vista and Windows 7 systems.
Do you have UAC enabled on the Windows 2008 system?

Regards
Alex