Hello KS-Soft,
Two days ago HostMonitor raised alerts for all of our server tests (Ping checks), reporting that the servers were not reachable (Status: Unknown; Reply: Timed out). That was wrong: all servers were fine, but HostMonitor was not able to perform the checks correctly. Our admin group received 200 emails and was very confused! In future we don't want alerts for the "Unknown" status. How can I configure the tests and/or the Alert Profile to achieve that?
Thanks and regards.
Wrong Alerting - Status "unknown" and Result "
-
- Posts: 25
- Joined: Fri Apr 18, 2008 5:17 am
- Contact:
-
- Posts: 2832
- Joined: Tue May 16, 2006 4:41 am
- Contact:
You should just uncheck the "Treat Unknown status as Bad" option in the "Test Properties" window for each test. Please note: if you need to change this option for a group of tests, you may select all the necessary tests, click the "Edit" button on the toolbar, and clear the "Treat Unknown status as Bad" option.
But anyway, "Ping" tests with "Unknown" status and a "Timed out" error are a pretty uncommon situation.
Could you provide more information, please?
- What version of HostMonitor do you use?
- What Windows do you use? Service Pack?
- Is HostMonitor started as a service or as an application?
- Do you have installed antivirus monitor? Personal firewall? Content monitoring software? Non-standard winsock components? Network packet analyzer?
- Do you see any error messages in Event Viewer (Start > Settings > Control Panel > Administrative Tools > Event Viewer applet)?
- Do you see any error messages in System Log (file is specified in menu "Options" -> "System Log"). You may access the System Log using menu "View" - > "System Log".
- How many "Ping" tests do you use? What is time interval for these tests?
Regards,
Max
-
- Posts: 25
- Joined: Fri Apr 18, 2008 5:17 am
- Contact:
Thanks for the hint. I've discovered that I can expand the "optional status processing".
How can I mass-change all of my Ping tests?
Here are the answers to your questions:
- What version of HostMonitor do you use?
8.00
- What Windows do you use? Service Pack?
Windows Server 2003 SP2
- Is HostMonitor started as a service or as an application?
Service
- Do you have installed antivirus monitor? Personal firewall? Content monitoring software? Non-standard winsock components? Network packet analyzer?
Antivirus: Microsoft Forefront; none of the other tools mentioned.
- Do you see any error messages in Event Viewer (Start > Settings > Control Panel > Administrative Tools > Event Viewer applet)?
The last entry is from 29.5.; the mass false alert occurred on 31.5.09 at 6:00 pm.
- Do you see any error messages in System Log (file is specified in menu "Options" -> "System Log"). You may access the System Log using menu "View" - > "System Log".
Yes indeed! A great many of these messages:
[31.05.2009 18:10:58] Sys Error: Cannot send e-mail to systemstoerung.hoch@eha.net. 500 ESocketException: No buffer space available (#10055 in WSAAsyncSelect)
Trying to use backup SMTP server..
- How many "Ping" tests do you use? What is time interval for these tests?
Approx. 200; the interval is 10 minutes.
Thx and regards!

-
- Posts: 2832
- Joined: Tue May 16, 2006 4:41 am
- Contact:
Wolfgang Bach wrote:
Yes indeed! A great many of these messages:
[31.05.2009 18:10:58] Sys Error: Cannot send e-mail to systemstoerung.hoch@eha.net. 500 ESocketException: No buffer space available (#10055 in WSAAsyncSelect)
Trying to use backup SMTP server..
Hm. This could be the cause of the "Ping" test problem.
Error 10055 typically occurs when all available ports are used up. The usual cause is that the TIME_WAIT phase of a connection lasts too long: after a connection is finished, its port sits in this state for a while, and if a new port is needed in the meantime, that port is unavailable.
Below are several settings you may change on your system. TcpTimedWaitDelay is the one that controls the TIME_WAIT timeout.
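To check whether ports really are piling up in TIME_WAIT, one option is to count connection states in the output of `netstat -an`. The following is only a minimal sketch: the function name and the sample text are made up for illustration, and it assumes the Windows column layout (Proto, Local Address, Foreign Address, State).

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connection states in text printed by `netstat -an`.

    Assumes the Windows layout: Proto, Local Address, Foreign Address, State.
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only TCP rows carry a state column; UDP rows and headers are skipped.
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


# Hypothetical sample output, not taken from the affected server.
sample = """\
  Proto  Local Address    Foreign Address  State
  TCP    10.0.0.5:1031    10.0.0.9:25      TIME_WAIT
  TCP    10.0.0.5:1032    10.0.0.9:25      TIME_WAIT
  TCP    10.0.0.5:1033    10.0.0.9:25      ESTABLISHED
  UDP    0.0.0.0:161      *:*
"""

print(count_tcp_states(sample))
```

A TIME_WAIT count that approaches the size of the dynamic port range at the time of the errors would support the port exhaustion explanation.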
=====================================
There is a parameter that limits the maximum number of connections that TCP may have open simultaneously.
[HKEY_LOCAL_MACHINE \System \CurrentControlSet \Services \Tcpip \Parameters]
TcpNumConnections = 0x00fffffe (Default = 16,777,214)
Note: a limit of 16 million connections sounds very promising, but other parameters (see below) keep us from ever reaching it.
When a client calls connect() to open a connection to a server, the socket is implicitly bound to a local dynamic (anonymous, ephemeral, short-lived) port number. The default range for dynamic ports in Windows is 1024 to 5000, which allows 3977 concurrent outbound connections per IP address. The upper limit can be changed with this DWORD registry value:
[HKEY_LOCAL_MACHINE \System \CurrentControlSet \Services \Tcpip \Parameters]
MaxUserPort = 5000 (Default = 5000, Max = 65534)
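The figure of 3977 follows from counting both ends of the range. A small sketch of that arithmetic (the function name is made up for illustration; the range bounds are the ones quoted above):

```python
def dynamic_port_count(max_user_port: int = 5000, first_port: int = 1024) -> int:
    """Number of dynamic ports in the inclusive range first_port..max_user_port."""
    return max_user_port - first_port + 1


print(dynamic_port_count())       # default MaxUserPort = 5000 -> 3977
print(dynamic_port_count(65534))  # maximum MaxUserPort = 65534 -> 64511
```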
Even with fewer than 3977 concurrent connections per IP address, it is still possible to run out of available port numbers or TCBs. This can happen when connections are opened and closed quickly, because after a connection is "closed" it enters the TIME_WAIT state and continues to occupy the port number for 4 minutes (2 * Maximum Segment Lifetime, MSL) before it is actually removed. This behavior is specified in RFC 793 and prevents attempts to reconnect to the same party before the old socket is recognized as closed on both sides. It is possible to change how long a socket stays in the TIME_WAIT state before it can be reused freely:
[HKEY_LOCAL_MACHINE \System \CurrentControlSet \services \Tcpip \Parameters]
TcpTimedWaitDelay = 120 (Default = 240 secs, Range = 30-300)
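As a rough illustration of why quickly opening and closing connections exhausts the ports: if every closed connection holds its port for the full TIME_WAIT delay, the sustainable rate of new outbound connections is about the number of dynamic ports divided by that delay. This is a simplified model, not a measurement; the function name is made up, and 3977 is the default port count mentioned above.

```python
def max_sustained_rate(ports: int, time_wait_secs: int) -> float:
    """Rough upper bound on new outbound connections per second.

    Simplified model: every connection occupies one local port for the
    whole TIME_WAIT period after it is closed.
    """
    return ports / time_wait_secs


for delay in (240, 120, 30):
    rate = max_sustained_rate(3977, delay)
    print(f"TcpTimedWaitDelay = {delay:3d} s -> about {rate:.1f} connections/s")
```

Under this model, a burst of several hundred alert e-mails and tests within a few seconds could be enough to use up the default range.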
Note: Win2k changed how sockets are reused. Once more than 1000 connections are in the TIME_WAIT state, it starts to mark sockets that have been in TIME_WAIT for more than 60 secs as free. This limit is configurable:
[HKEY_LOCAL_MACHINE \System \CurrentControlSet \services \Tcpip \Parameters]
MaxFreeTWTcbs = 1000 (Default = 1000 sockets)
Note: Win2k3 SP1 changed socket reuse again. When it has to reuse a socket in the TIME_WAIT state, it checks whether the other party differs from that of the old socket, which eliminates the need to fiddle with TcpTimedWaitDelay and MaxFreeTWTcbs.
If an application protocol does not implement its own timeout checking but relies on TCP/IP timeout checking, without specifying how often it should be done, connections may "never" close when the remote host disconnects without closing the connection properly. By default, TCP/IP checks connections every 2 hours by sending a keep-alive packet. It is possible to change how often TCP/IP checks the connections (this affects all TCP/IP connections):
[HKEY_LOCAL_MACHINE \System \CurrentControlSet \services \Tcpip \Parameters]
KeepAliveTime = 1800000 (Default = 7,200,000 millisecs)
For each connection a TCP Control Block (TCB, a data structure using 0.5 KB of paged pool and 0.5 KB of non-paged pool) is maintained. The TCBs are pre-allocated and stored in a table, to avoid spending time on allocating and deallocating TCBs every time connections are created or closed. The TCB table enables reuse/caching of TCBs and improves memory management, but its static size limits how many connections TCP can support simultaneously (active + TIME_WAIT). Configure the size of the TCB table with this DWORD registry key:
[HKEY_LOCAL_MACHINE \System \CurrentControlSet \Services \Tcpip \Parameters]
MaxFreeTcbs = 2000 (Default = RAM dependent, but usually Pro = 1000, Srv = 2000)
To make lookups in the TCB table faster, a hash table is used that is optimized for finding a particular active connection. If the hash table is too small relative to the total number of active connections, extra CPU time is required to find a connection. Configure the size of the hash table (which is allocated from paged pool memory) with this DWORD registry key:
[HKEY_LOCAL_MACHINE \System \CurrentControlSet \services \Tcpip \Parameters]
MaxHashTableSize = 512 (Default = 512, Range = 64-65536)
Also, we recommend changing the MinFreeConnections and MaxFreeConnections values under [HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanServer\Parameters] according to this article:
http://support.microsoft.com/kb/909262
=====================================
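If several of the Tcpip values listed above need to be changed, it may be convenient to prepare them as a .reg file instead of editing each one by hand. The sketch below only builds the text of such a file. The helper name is made up for illustration, and the two values are examples taken from the ranges quoted above, not recommendations; review the result and back up the registry before importing anything.

```python
TCPIP_KEY = r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters"


def build_reg_file(key_path: str, dword_values: dict) -> str:
    """Return .reg-format text that sets the given DWORD values under key_path."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key_path}]"]
    for name, value in dword_values.items():
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{name} does not fit in a DWORD: {value}")
        # .reg files store DWORD values as 8 hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


# Example values only (assumptions for illustration, not tuned for any server).
example_values = {"TcpTimedWaitDelay": 120, "MaxUserPort": 65534}
reg_text = build_reg_file(TCPIP_KEY, example_values)
print(reg_text)
```

Changes to these Tcpip parameters normally take effect only after a reboot.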
Also, please ensure your machine is not infected by the W32/QAZ trojan horse, which can cause error 10055 as well: http://home.mcafee.com/VirusInfo/VirusP ... ?key=98775
Regards,
Max
-
- Posts: 25
- Joined: Fri Apr 18, 2008 5:17 am
- Contact:
-
- Posts: 2832
- Joined: Tue May 16, 2006 4:41 am
- Contact:
Wolfgang Bach wrote:
How can i mass-change my 200 Ping-Tests not to alert on Status=unknown?
I think you may create a "View" to filter "Ping" tests using the "Select items by test method" option. After that, you may select all the tests in the view and click the "Edit" button on the toolbar. In the "Test Properties" window that appears, disable the "Treat Unknown status as Bad" option and click "Ok".
Regards,
Max