Service Tests Failing in HostMonitor 8.86
We have several service monitors set up to watch/restart services on our Exchange/Lync servers. The services themselves are fine and almost never go down. However, HostMon often reports outage errors with the status of "Unknown" on these servers and alerts us. When we check the services, they're running fine.
If I open the service alert in HostMon and refresh the test, it stays in an Unknown status. If I change the domain user account to another domain user account, and run the test again, it succeeds. If I change it back to the original account, it succeeds again, but will fail again as "unknown" later after some time has passed.
Sometimes the reply column shows "Win32 error #5", which I believe indicates an access problem, but the same account can access the service just fine once I change the account and then change it back. Sometimes the reply column doesn't show that error and merely says "Unknown" (I believe).
Does anyone have any thoughts on this?
Windows error code #5 means "Access is denied".
That means HostMonitor is able to connect to the remote system;
however, it cannot retrieve the necessary information from the target system.
Please check the Windows Event Log on the remote system for logon failure events at the time the problem occurs.
Also, some services are unstable (sometimes they respond to commands, sometimes they do not), so you may set up HostMonitor to start actions only after the 2nd or 3rd consecutive "bad" test result. Use the "Start when 2 consecutive Bad results occur" option:
http://www.ks-soft.net/hostmon.eng/mfra ... #StartWhen
When the "Start when N consecutive Bad results occur" option is enabled, we recommend using the "Action depends on Bad one" option for the "good" action assigned to the same profile:
http://www.ks-soft.net/hostmon.eng/mfra ... pendsOnBad
Quote from the manual
==========================
Action depends on "bad" one
This optional parameter is available for "Good" actions only. You can set "Good" action dependable on a "Bad" action. Why do you need it? For example you defined "Bad" action to send an e-mail notification to the network administrator when test fails 3 times consecutively (start when 3 consecutive "Bad" results occur), also you defined «Good» action to send a notification when the test status changes to "Good". What will happen if test fails 1 or 2 times and after this it restores "Good" status? HostMonitor will not send a notification about failure (because test did not fail 3 times) but the program will send notification about restoring "Good" status. To avoid unnecessary "Good" action execution you can mark "Action depends on "bad" one" option and select "Bad" action. In this case HostMonitor will start "Good" action only if corresponding "Bad" action was executed.
===========================
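The interaction between the two options quoted above can be sketched as a small simulation. This is only an illustrative model, not HostMonitor's actual implementation: the class and field names are hypothetical, and it assumes that any status other than "Ok" (including "Unknown") counts as a "bad" result.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AlertProfile:
    """Toy model of one test's alert profile (all names are hypothetical)."""
    bad_threshold: int = 3            # "Start when N consecutive Bad results occur"
    good_depends_on_bad: bool = True  # "Action depends on 'bad' one"
    consecutive_bad: int = 0
    bad_action_fired: bool = False
    log: List[str] = field(default_factory=list)

    def record(self, status: str) -> None:
        """Feed one test result; anything other than "Ok" is assumed to be bad."""
        if status == "Ok":
            was_failing = self.consecutive_bad > 0
            self.consecutive_bad = 0
            # Run the "good" action only on a bad -> good transition and, when
            # the dependency is enabled, only if the "bad" action ran before.
            if was_failing and (self.bad_action_fired or not self.good_depends_on_bad):
                self.log.append("good action")
            self.bad_action_fired = False
            return
        self.consecutive_bad += 1
        if self.consecutive_bad == self.bad_threshold:
            self.bad_action_fired = True
            self.log.append("bad action")


def run(results, **options):
    """Replay a sequence of statuses and return the actions that were started."""
    profile = AlertProfile(**options)
    for status in results:
        profile.record(status)
    return profile.log


if __name__ == "__main__":
    # Two failures, then recovery: with the dependency enabled nothing is sent.
    print(run(["Ok", "Unknown", "Unknown", "Ok"]))
    # Without the dependency, the "good" notification is sent on its own.
    print(run(["Ok", "Unknown", "Unknown", "Ok"], good_depends_on_bad=False))
```

In this model a test that stays "Unknown" indefinitely still reaches the threshold and triggers the "bad" action, so the option only delays the alert for a persistent failure.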
Thank you for the response.
There are no logon/authentication errors logged on the target system when this happens. This does not seem to be a problem with the target system; it is just an inconsistency on the part of HostMonitor, as far as we can tell.
Setting up the alerts to only start when N consecutive bad results occur sounds like a possible workaround, but it does not solve the problem. We have actually tried this approach. HostMon will continue to show an "Unknown" status (and thus generate an alert even after several checks) until we manually change the "Connect As" account and then change it back. I hope that makes sense.
Maybe not all the necessary audit policies are enabled?
Could you check whether the "Audit account logon events" and "Audit logon events" policies are enabled and set to "Success and Failure"?
Also, it could be a firewall issue.
Is there any firewall or antivirus monitor installed between HostMonitor and the remote server?
Could you try disabling it?
If the problem appears on a single system, you may install an RMA agent as a workaround for any Windows-related problems:
http://www.ks-soft.net/hostmon.eng/rma-win/index.htm
KS-Soft Europe wrote: May be not all necessary audit policies are enabled?
Could you check if "Audit account logon events" and "Audit login events" policies are enabled and set to "Success and Failure" ?

Audit account logon events is set to Failure
Audit logon events is set to Success, Failure

KS-Soft Europe wrote: Also, it can be a firewall issue.
Is there any firewall or antivirus monitor installed between HostMonitor and remote server ?
Could you try to disable it ?

There is no firewall between the systems. All systems are behind our corporate firewall.

KS-Soft Europe wrote: If such problem appears on single system, you may install RMA agent as workaround for any Windows related problems:
http://www.ks-soft.net/hostmon.eng/rma-win/index.htm

The problem has occurred with multiple services across multiple systems, some more frequently than others. Unless I'm mistaken, it appears that RMA is another product I would have to purchase, and I'm not prepared to do that as HostMon should work as-is.
Quote: The problem has occurred with multiple services across multiple systems, some more frequently than others. Unless I'm mistaken, it appears that RMA is another product I would have to purchase, and I'm not prepared to do that as HostMon should work as-is.

RMA is part of the Advanced Host Monitor package, which is included in the Professional and Enterprise licenses:
http://www.ks-soft.net/hostmon.eng/regmon.htm#newprice

Quote: The problem has occurred with multiple services across multiple systems, some more frequently than others.

In that case a single agent will not help you. In any case, it is better to find the cause of the problem, but this can be hard...
I don't think this is HostMonitor's fault; we and some customers have seen such "Access denied" errors for years, always caused by Windows or some 3rd-party software.
It is strange that you do not see any messages in the event log.

Which Windows versions are installed on the HostMonitor system and the remote systems? Which Service Pack? Any antivirus monitors?
Regards
Alex
I'm not sure I've been clear. When this error/alert happens, the service is fine. It can be reached from the HostMon server fine. There are no access issues. However, I can refresh the alert over and over and it continues to return as "Unknown".
It stays in the "Unknown" state until I change the user account in HostMon and refresh the test. Then it returns to "Ok". I change the user back to what it was before and refresh again, and it still says "Ok", even though that same account was returning "Unknown" over and over again previously.
Quote: When this error/alert happens, the service is fine. It can be reached from the HostMon server fine. There are no access issues. However, I can refresh the alert over and over and it continues to return as "Unknown"

That's because HostMonitor cannot reach the target system.

Quote: It stays in the "Unknown" state until I change the user account in HostMon and refresh the test. Then it returns to "Ok". I change the user back to what it was before and refresh again, and it still says "Ok", even though that same account was returning "Unknown" over and over again previously.

That's because the connection is already established. Once a connection is established, you may use any account or no account at all, and the test will work.
The question is why the 1st account does not work and why you do not see any records in the event log.
Also, if the 2nd account works fine, why don't you use it?
You have a domain, not a workgroup? Have you checked the domain controller's event log?
Which Windows versions are installed on the HostMonitor system and the remote systems? Which Service Pack? Any antivirus monitors?
Which option exactly do you use to specify the account: the "Connect as" test property or Connection Manager?
Regards
Alex
KS-Soft wrote: That's because HostMonitor cannot reach target system.

That's incorrect; it can reach the target system.

KS-Soft wrote: That's because connection already established. When connection established, you may use any account or no account at all, test will work.

No, if I change back to the account that wasn't previously working, it says "Ok" and continues to say "Ok" no matter how many times I refresh it.

KS-Soft wrote: Question is why 1st account does not work and why you do not see any records in event log?

Because it does not seem to be a problem with Windows, and therefore would not be reflected in the target machine's log.

KS-Soft wrote: Also, if 2nd account works fine, why you do not use it?

Because it's not a problem with the account. Whichever account I use, it eventually reaches this state. The actual account being used seems to be largely irrelevant. It is only fixed by switching to any second working account. Then I can switch to any other account, or back to the original account, and it continues to work.

KS-Soft wrote: You have domain, not workgroup? Have you checked domain controller event log?

We use a domain. There are no errors for this on the DC event log either.

KS-Soft wrote: What Windows do you have installed on HostMonitor system and remote systems? Service Pack?

The HostMonitor system is on Windows Server 2003 Standard SP2.
The target system in this case is running Windows Server 2008 R2 Enterprise.

KS-Soft wrote: Antivirus monitors?

VIPRE Antivirus

KS-Soft wrote: What exactly option do you use to specify account? "Connect as" test property or Connection Manager?

I have tried both options.
Note that I seem to be able to reproduce this at will by restarting the HostMonitor server. At first all tests show fine. But if I then refresh the test, that one fails again until I change the account, refresh, and change the account back, and refresh again.
Quote: That's because HostMonitor cannot reach target system
...
That's incorrect; it can reach the target system

No, it cannot reach the target system, because the Windows function OpenSCManager fails with error code #5, which means "Access denied".
If HostMonitor could connect to the Service Manager, it would set the status to Ok or Bad (depending on the reply from the monitored service).
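To check this outside of HostMonitor, the same Windows call can be made directly from the monitoring server under the account in question. The following is a rough sketch only. The thread confirms that the Service test calls OpenSCManager; the follow-up calls (OpenServiceW, QueryServiceStatus), the mapping of "running" to Ok, and the machine and service names are assumptions made for illustration.

```python
import sys

ERROR_ACCESS_DENIED = 5       # Win32 error #5, "Access is denied"
SC_MANAGER_CONNECT = 0x0001   # minimal access right for OpenSCManager
SERVICE_QUERY_STATUS = 0x0004
SERVICE_RUNNING = 0x00000004


def classify(win32_error, service_state):
    """Assumed mapping: no answer from the SCM -> Unknown, otherwise Ok/Bad."""
    if win32_error != 0 or service_state is None:
        return "Unknown"
    return "Ok" if service_state == SERVICE_RUNNING else "Bad"


def query_service(machine, service):
    """Return (win32_error, dwCurrentState or None). Runs on Windows only."""
    import ctypes
    from ctypes import wintypes

    class SERVICE_STATUS(ctypes.Structure):
        _fields_ = [
            ("dwServiceType", wintypes.DWORD),
            ("dwCurrentState", wintypes.DWORD),
            ("dwControlsAccepted", wintypes.DWORD),
            ("dwWin32ExitCode", wintypes.DWORD),
            ("dwServiceSpecificExitCode", wintypes.DWORD),
            ("dwCheckPoint", wintypes.DWORD),
            ("dwWaitHint", wintypes.DWORD),
        ]

    api = ctypes.WinDLL("advapi32", use_last_error=True)
    api.OpenSCManagerW.restype = wintypes.HANDLE
    api.OpenSCManagerW.argtypes = [wintypes.LPCWSTR, wintypes.LPCWSTR, wintypes.DWORD]
    api.OpenServiceW.restype = wintypes.HANDLE
    api.OpenServiceW.argtypes = [wintypes.HANDLE, wintypes.LPCWSTR, wintypes.DWORD]
    api.QueryServiceStatus.restype = wintypes.BOOL
    api.QueryServiceStatus.argtypes = [wintypes.HANDLE, ctypes.POINTER(SERVICE_STATUS)]
    api.CloseServiceHandle.restype = wintypes.BOOL
    api.CloseServiceHandle.argtypes = [wintypes.HANDLE]

    scm = api.OpenSCManagerW(machine, None, SC_MANAGER_CONNECT)
    if not scm:
        return ctypes.get_last_error(), None   # e.g. 5 when access is denied
    try:
        svc = api.OpenServiceW(scm, service, SERVICE_QUERY_STATUS)
        if not svc:
            return ctypes.get_last_error(), None
        try:
            status = SERVICE_STATUS()
            if not api.QueryServiceStatus(svc, ctypes.byref(status)):
                return ctypes.get_last_error(), None
            return 0, status.dwCurrentState
        finally:
            api.CloseServiceHandle(svc)
    finally:
        api.CloseServiceHandle(scm)


if __name__ == "__main__" and sys.platform == "win32":
    # Placeholder machine and service names; substitute your own.
    error, state = query_service(r"\\EXCHANGE01", "MSExchangeIS")
    print(error, state, classify(error, state))
```

If this script also returns error 5 while HostMonitor shows "Unknown", the denial is coming from Windows rather than from HostMonitor's own logic. If it succeeds at the same moment, that points back at the monitoring process or its cached connection.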
Quote: That's because connection already established. When connection established, you may use any account or no account at all, test will work.
..
No, if I change back to the account that wasn't previously working, it says "Ok" and continues to say "Ok" no matter how many times I refresh it

Correct. That's because the connection is already established. Once a connection is established, you may use any account or no account at all, and the test will work.
Quote: Because it does not seem to be a problem with Windows, and therefore would not be reflected in the target machine's log

If it's not a Windows problem and it's not a HostMonitor problem, what is the cause of this problem?
The HostMonitor code for the Service test is very simple; it has not been changed for years and works fine everywhere. I am sure there is no mistake in this code; however, we will re-check it today.
If Windows is out of resources, this may lead to various strange effects. Could you check resource usage for each process? You may use the standard Windows Task Manager to check Handles, GDI and USER objects. What is the total resource usage on the system? How many handles/threads/GDI objects are used by the hostmon.exe process?
Also, could you try disabling your antivirus? We are not familiar with VIPRE Antivirus, but Symantec/McAfee/NOD32 antiviruses quite often cause various problems.
Quote: What exactly option do you use to specify account? "Connect as" test property or Connection Manager?
...
I have tried both options

HostMonitor uses the same Windows API in both cases, but Connection Manager offers 2 extra options:
- Log failed attempts
- Reconnect if necessary
Could you enable logging and check the log?
Could you set the "Reconnect if necessary" option?
Regards
Alex
I would like to jump in here...
We also experience the same periodic "Unknown" status on service tests against remote Windows servers (in remote ADs, with accounts configured in Connection Manager).
As jaustin has experienced, the status just suddenly changes from OK to Unknown with "Access denied"...
Sometimes the status changes back to OK by itself after several minutes/tests. Sometimes we have to reboot the server where HostMon resides... (does that reset the login mechanism in HostMon?)
Moreover, we are having more and more difficulty getting new service tests to work correctly, especially against Win2008 R2: we continually get error #5 while seeing a successful login in the security log on the target servers...?!
Just like jaustin, we tend to see this as an issue/challenge with HostMon rather than an issue with the credentials used...
Regards
Mikael K.
KS-Soft wrote: HostMonitor uses the same Windows API in both cases but Connection Manager offers 2 extra options:
- Log failed attempts
- Reconnect if necessary
Could you enable logging and check the log?
Could you set "Reconnect if necessary" option?
Regards
Alex

I will try setting this up and monitor.
Quote: Just as jaustin, we tend to see this as an issue/challenge with hostmon, rather than issue with the used credentials...

We checked our code once more; it looks fine.
Also, I do not think this is a credentials/account-related problem.
We think there is some bug outside of HostMonitor. We are trying to reproduce the problem using Windows 2003 and Windows 2008 systems...
Regards
Alex