Bogus Error Messages?
I have two machines running HM 5.38. One runs Server 2000 SP4 with two 450 MHz CPUs and 512 MB of physical memory, and runs about 2000 tests. The other runs XP SP2 Professional with dual 2.9 GHz CPUs and 1 GB of physical memory, and runs 35 tests.
The one running the 35 tests had no problems with any of the tests all night. The one running the 2000 tests, which include the exact same 35 tests from the other machine, reported that these tests fail every few minutes.
The Last Reply is showing "An unexpected network error occurred" or "Timed out". The reply time on the faster machine is around 1000ms; the reply time on the slower, more loaded machine is around 3000000ms. These are UNC tests on both machines.
I put a sniffer on the slower machine to try to catch these error messages. None of them showed up in my capture files.
The slower machine is scheduled to be upgraded to faster hardware soon, but my question is: are these messages generated by the HM application, and why do they show up on only a few tests?
Quote: but my question is are these messages generated from the HM application
How it works:
1) HostMonitor sends a request to the Windows API: retrieve information about the UNC resource.
2) Windows sends the request to the network client.
3) The network client tries to establish communication with the remote host and returns either information about the resource or an error CODE (e.g. a timeout error, which is the interesting case here).
4) Windows returns the error CODE to HostMonitor.
5) HostMonitor asks Windows for the description of that error code, then displays the test result and the error description.
This means a sniffer can never show you the "Timeout" error text: the error happens when NO packets are received.
For some other errors (e.g. an authentication error) the sniffer will show some packets, but usually you will not see any text in them (it depends on the network protocol).
Regards
Alex
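Steps 4 and 5 above can be illustrated with a minimal Python sketch. It is not HostMonitor code; it only shows that turning an error code into text is a local lookup, so the text never travels over the wire. The function name describe_error and the fallback table are made up for this illustration. The three codes are the standard Win32 values from winerror.h, and the fallback texts are the wording quoted in this thread; newer Windows versions may word them differently.

```python
import ctypes

# Fallback table for machines without the Windows message lookup.
# Codes are standard Win32 values; texts are the wording seen in this thread.
FALLBACK_MESSAGES = {
    59: "An unexpected network error occurred.",
    64: "The specified network name is no longer available.",
    1326: "Logon failure: unknown user name or bad password.",
}


def describe_error(code, use_system=True):
    """Turn a numeric error code into text entirely on the local machine."""
    # ctypes.FormatError exists only on Windows; it asks the OS for the text.
    if use_system and hasattr(ctypes, "FormatError"):
        return ctypes.FormatError(code).strip()
    return FALLBACK_MESSAGES.get(code, f"Unknown error {code}")


if __name__ == "__main__":
    for code in (59, 64, 1326):
        print(code, describe_error(code))
```

Only the number comes back from the network client; the sentence shown in the Reply field is produced afterwards, which is why a capture file contains no matching string.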
Quote: And where does the message: "The specified network name is no longer available." come from?
I think Windows reads this message from some resource file. If you are asking about the error code, I think it is returned by the remote system.
Quote: Another question. I can complete a UNC mount from the cmd prompt in a few seconds, but the HM application takes almost 10 minutes (1010610ms), why is this?
Maybe HostMonitor performs many requests at the same time and the network client cannot process that many requests correctly...
Try switching "UNC test mode" to "OnePerServer" or "OneByOne". The option is located on the Misc page in the Options dialog.
Regards
Alex
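To make the idea behind a "one per server" mode concrete, here is a rough Python sketch. It is an assumption about what such a mode means in general (checks against different servers run in parallel, checks against the same server never overlap), not a description of how HostMonitor implements it. The names server_of and run_one_per_server, and the check callback, are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def server_of(unc_path):
    """Extract the server part of a UNC path: \\\\server\\share -> 'server'."""
    return unc_path.lstrip("\\").split("\\", 1)[0].lower()


def run_one_per_server(unc_paths, check, max_workers=8):
    """Run check(path) for every path, at most one at a time per server."""
    # One lock per server, created up front so worker threads only read the dict.
    locks = {server_of(p): threading.Lock() for p in unc_paths}

    def guarded(path):
        # Checks for the same server queue up here; other servers proceed.
        with locks[server_of(path)]:
            return path, check(path)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(guarded, unc_paths))
```

With this shape, two tests against the same filer are serialized while tests against other hosts still run concurrently, which is the trade-off between running everything at once and running strictly one test at a time.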
I upgraded my machine to a Dell PowerEdge 1850 with four 3.0 GHz CPUs and 3 GB of memory, running Server 2003 and HM ver 5.38. Fairly frequently I receive one of two error messages from a UNC test: "The specified network name is no longer available" or "Logon failure: unknown user name or bad password". The device that HM is checking is a filer running Data Ontap 6.4.4P7. This filer has a share set up on it called hostmon$. In my UNC test I have the UNC box set to \\165.168.25.217\hostmon$, the "Connect as" box checked with my domainname\username, and a password in the password box. Again I have a sniffer set up on the LAN and I don't see either error message coming back from the filer.
I also have the UNC tests under Misc set to Normal. I'm running over 1500 tests, and running OnePerServer or OneByOne did not cut it. The UNC test retries setting is 1.
For some strange reason, out of the 1500 tests I only receive these error messages for four tests, and they just started about a month ago. Any ideas?
One other thing: I set up a batch job that runs once a minute. Its command line is:
net use \\165.168.25.217\hostmon$ /user:domainname\username password
sleep 2
net use \\165.168.25.217\hostmon$ /del
The results are sent to a log file. These tests never fail, even when the HM test fails.
Is this the same command that HM uses for its UNC test?
Quote: Again I have a sniffer setup on the LAN and I don't see either error message coming back from the filer.
Please read my previous posts.
AK>>It means sniffer never ever can show you "Timeout" text error. Error happens when there are NO packets received.
AK>>In case of some other errors (e.g. authentication error) sniffer will show some packets but usually you will not see any TEXT there because network client returns error CODE, not a text.
Quote: I also have the UNC tests under Misc set to Normal. I'm running over 1500 tests and running OnePerServer or OneByOne did not cut it, also the UNC test retries is set to 1.
Yes, OneByOne cannot be useful in your case. But OnePerServer is probably the appropriate setting, unless you have 100 UNC tests for one server.
Quote: net use \\165.168.25.217\hostmon$ /del
Quote: The results are sent to a log file. These tests never fail, even when the HM test fails. Is this the same command that HM does on its UNC test?
HostMonitor does not use command line utilities; it sends requests to the network client using the Windows API.
Try to use the OnePerServer option.
As I understand it, you experience the problem with a single device. In this case there is another solution:
- move the tests that check the problem device into a separate folder
- select this folder, click the "Properties" button and mark "Non-simultaneously test execution" on the "Specials" page.
Regards
Alex
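For comparing an API-level check with the net use batch job, a small probe that logs the reply time together with the numeric error code can be sketched in Python. The thread does not say which Windows API call HostMonitor issues, so shutil.disk_usage is only a stand-in here, and the function name probe is hypothetical. On Windows the raised OSError carries the Win32 code in its winerror attribute; on other systems the sketch falls back to errno.

```python
import shutil
import time


def probe(path):
    """Return (reply_ms, error_code, error_text) for one access attempt."""
    start = time.perf_counter()
    try:
        # Stand-in for "retrieve information about the resource".
        shutil.disk_usage(path)
        code, text = 0, "OK"
    except OSError as exc:
        # Windows: winerror is the Win32 code. Elsewhere: use errno instead.
        code = getattr(exc, "winerror", None) or exc.errno
        text = exc.strerror
    reply_ms = (time.perf_counter() - start) * 1000.0
    return reply_ms, code, text


if __name__ == "__main__":
    # Example target taken from the thread; replace with a reachable path.
    print(probe(r"\\165.168.25.217\hostmon$"))
```

Logging the code next to the elapsed time makes it easier to tell a slow success from a timeout, which the text in the Reply field alone does not show.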
I changed the UNC setting to OnePerServer and created a separate folder with the problem tests (8 total, 2 for each server). I also set "Non-simultaneously test execution" on the "Specials" page. I still see the failures on my slower server, but do not see them on my new faster server. I will let it run overnight.