KS-Soft. Network Management Solutions

RMA: Connection error

 
KS-Soft Forum Index -> Configuration, Maintenance, Troubleshooting
Ross



Joined: 27 Aug 2012
Posts: 4

Posted: Mon Aug 27, 2012 9:45 pm    Post subject: RMA: Connection error

Hello everyone

I am new to this forum.

Been using Hostmon for a while now.

For some reason I started getting "Connection error" alerts, and they are flooding my mailbox like crazy.

Some tests are good, and they all used to work fine, but now I get an "alert bad" e-mail and then an "alert good" e-mail.

Can someone help me with this, please?
KS-Soft Europe



Joined: 16 May 2006
Posts: 2832

Posted: Tue Aug 28, 2012 6:45 am

Actually, this is the main purpose of HostMonitor: to send alerts when something goes wrong.
Could you please give some details:
- What HostMonitor version do you use?
- Is it started as an application or as a service?
- What RMA version do you use?
- What kind of RMA is it? RMA for Windows? Active RMA for Windows? RMA for Linux?
- What do the Status and Reply fields of these tests show when the problem occurs?
- Are there other test methods for the same server that work without triggering alerts?
KS-Soft



Joined: 03 Apr 2002
Posts: 12795
Location: USA

Posted: Tue Aug 28, 2012 9:27 am

1) When you have an unreliable connection between HostMonitor and RMA, it's a good idea to check where exactly the problem is. E.g. you may start with the Trace utility or the Trace test method. Check the timeout specified for this agent, etc. If the timeout is too short, increase it. If some router does not work properly, replace it...

2) If you cannot fix the network problem, you may change HostMonitor settings. E.g. you may tell HostMonitor not to start actions for the "Unknown" test status:
- unmark the "Treat Unknown as Bad" test option
- mark the "Action depends on Bad one" option for "good" actions assigned to the test items.
You may also use the "Repeat test" action as an "advanced mode" action with an expression like ('%Status%'=='Unknown') and (%Recurrences%==1)
For more information please check the manual or visit our web site at
http://www.ks-soft.net/hostmon.eng/mframe.htm#actions.htm#advancedaction
http://www.ks-soft.net/hostmon.eng/mframe.htm#actions.htm#actRepeat

3) In any case, when you have many tests that depend on some common application (e.g. RMA) or on a single network connection, it's a good idea to set up a Master test.
E.g. set up 1 test to check the connection to the agent (Ping localhost or Ping the RMA itself) and make the other RMA tests dependent on this test item.
For more information please check the manual or visit our web site at
http://www.ks-soft.net/hostmon.eng/rma-win/index.htm#Settings - check "How to use" section near end of the page.
http://www.ks-soft.net/hostmon.eng/mframe.htm#tests.htm#Master
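
The logic behind option 2 can be sketched in a few lines of Python. This is only an illustration of the condition, not how HostMonitor evaluates it internally; the function names and the simplified "Bad" status string are assumptions made for the example.

```python
def should_repeat_test(status: str, recurrences: int) -> bool:
    """Same condition as ('%Status%'=='Unknown') and (%Recurrences%==1):
    repeat the test only the first time it comes back Unknown."""
    return status == "Unknown" and recurrences == 1


def counts_as_bad(status: str, treat_unknown_as_bad: bool) -> bool:
    """Hypothetical helper: should this status trigger the "bad" actions?
    With "Treat Unknown as Bad" unmarked, Unknown does not count as bad."""
    if status == "Unknown":
        return treat_unknown_as_bad
    return status == "Bad"
```

With "Treat Unknown as Bad" unmarked, a single "RMA: Connection error" result leads to one immediate retry instead of an alert e-mail.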

Regards
Alex
Ross



Joined: 27 Aug 2012
Posts: 4

Posted: Tue Aug 28, 2012 5:28 pm

Hi Alex

thank you for the reply.

The situation is that we use Hostmon to get alerts from our client servers.
Everything was working great: we would get an "alert bad" when things such as server restarts happened, or anything else we configured Hostmon to watch for.
An "alert good" would arrive once the server came back up.

We migrated our Hostmon server to a datacenter, and it worked no differently for around 2 weeks.

Then we suddenly started getting a crazy number of e-mails telling us that servers are down, services are stopped, and so on.
None of that is correct, since everything is running fine on the client end.
We cannot trust Hostmon until that is fixed.

I upgraded Hostmon on 1 client to the latest version and on our server to the latest version.

Last night we still got around 300 e-mails, "alert bad" then "alert good".
The connection between our data center and the client is definitely reliable; people work remotely on the server and never get dropouts.

Our tests are set up as you describe: we have a Master Agent that pings, and all tests depend on it, so if it is not working the other tests tell us "Waiting for Master".

All agents are passive and all run in a Windows environment.

We have settings such as 5 bad or 2 good before it sends e-mails.

The biggest problem is this: one of my clients, for example, has about 30 tests that show disk space, running services, and certain other things.

When I look at the tests, they are pretty much all green, so everything is working fine.

After a certain amount of time, some tests start to show "RMA: Connection error".
If I refresh a test that shows the RMA error, it goes back to being green.
Sometimes it tells me that the connection was forcibly closed by the other side.

This is why we keep getting e-mails; all of our e-mails contain "RMA: Connection error".

Here is an example

Reply : RMA: Connection error,
Status : Unknown ( 2 times in a row ),
Time of Event (from log) : %NTEventTime%

------------------------------------
HostMonitor (test changed status)

Test : XXXXX
Folder : >\XXXXXXX\Servers\
Status : Unknown ( 2 times in a row )

Date : 2012-08-29 9:32:33 AM
Time of Event (from log) : %NTEventTime%

Method: ping (timeout - 2000 ms)
Reply : RMA: Connection error


Last status: Unknown
Total tests: 370768
Alive ratio : 93.68 %
Dead ratio: 0.04 %


Regards

Ross
KS-Soft



Joined: 03 Apr 2002
Posts: 12795
Location: USA

Posted: Wed Aug 29, 2012 4:57 am

Quote:
Then we suddenly started getting a crazy number of e-mails telling us that servers are down, services are stopped, and so on.
Here is an example
Reply : RMA: Connection error,
Status : Unknown ( 2 times in a row )

This status does not mean that a server is down or a service is stopped. It means that HostMonitor cannot connect to the RMA.

Quote:
I upgraded Hostmon on 1 client to the latest version and on our server to the latest version.

Do you mean you updated RMA on 1 client to the latest version?
What about the other clients? Is some old RMA installed there?
Please use all components (RMA, RCC, HostMonitor) from the same package.

Quote:
Last night we still got around 300 e-mails, "alert bad" then "alert good". Our tests are set up as you describe: we have a Master Agent that pings, and all tests depend on it, so if it is not working the other tests tell us "Waiting for Master".

If the other tests tell you "Waiting for Master", then why did you get 300 e-mails?
Are you using 100 or more RMAs, and did all of them return "Unknown" status at the same time?
What value have you set for the "Consider status of the master test obsolete after N seconds" option (HostMonitor Options dialog -> Behavior page)?

Quote:
We have settings such as 5 bad or 2 good before it sends e-mails.

2) If you cannot fix the network problem, you may change HostMonitor settings. E.g. you may tell HostMonitor not to start actions for the "Unknown" test status:
- unmark the "Treat Unknown as Bad" test option
- mark the "Action depends on Bad one" option for "good" actions assigned to the test items.
You may also use the "Repeat test" action as an "advanced mode" action with an expression like ('%Status%'=='Unknown') and (%Recurrences%==1)
For more information please check the manual or visit our web site at
http://www.ks-soft.net/hostmon.eng/mframe.htm#actions.htm#advancedaction
http://www.ks-soft.net/hostmon.eng/mframe.htm#actions.htm#actRepeat

Quote:
Sometimes it tells me that the connection was forcibly closed by the other side.

Sounds like an antivirus or firewall issue...
Which Windows version do you use on the HostMonitor and RMA systems? Which Service Pack? Firewall? Antivirus?

Regards
Alex
Ross



Joined: 27 Aug 2012
Posts: 4

Posted: Wed Aug 29, 2012 5:28 pm

Hi Alex

Thanks again for the quick reply.

The RMA Connection Error is the problem we started to experience; in reality nothing changed.

Looking at all the tests, I can see them working fine, but around 50 times a day we get this e-mail reporting the RMA error.
When I look at the test it says the same, and if I manually refresh it, it goes back to normal, and then the same RMA error comes up a bit later.

Previously, the only time we would get the RMA Connection Error was when the ISP or the server was really down, and we wouldn't get an "alert good" until the agent was back online.

It never happened before, and nothing at our clients has changed.
If anything, our connection is now 100 Mbit up and down, which is a lot faster than what we had previously, when everything worked fine.

The 1 client I mentioned was done for testing purposes: I tried to isolate the problem by using a different version of the package and recreating some tests, but I still keep getting the RMA error.

This problem happens to literally every one of our clients.

I am guessing you are correct that it is a connection issue: the server cannot contact the RMA agent and puts out this error.

But how can I fix it? There are no issues with our connection, it is all reliable, and if I go to RMA Manager I can easily Get Info on any of the agents, which tells me that they are operational.
Regards

Ross
KS-Soft



Joined: 03 Apr 2002
Posts: 12795
Location: USA

Posted: Thu Aug 30, 2012 5:48 am

Quote:
But how can I fix it? There are no issues with our connection, it is all reliable, and if I go to RMA Manager I can easily Get Info on any of the agents, which tells me that they are operational.

Have you done anything we suggested?
- Have you tried to find the network problem by using trace to collect statistics?

- Have you tried disabling the firewall and antivirus?
We still don't know what firewall and antivirus you use, and we don't know what Windows version and Service Pack are installed on the local and remote systems... Could you please answer our questions?

- Have you checked and adjusted the HostMonitor settings? Have you checked the timeout specified for the agents? Have you checked the Master/Dependent settings? Have you adjusted the action settings?

Quote:
it is all reliable, and if I go to RMA Manager I can easily Get Info on any of the agents, which tells me that they are operational.

Do you really think that being able to establish 1 TCP connection means the network is reliable? Maybe some hardware or software can handle 500 IP packets per second but cannot handle 1500 packets per second?
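
One way to go beyond a single successful connection is to open many TCP connections in a row and record how many fail. The Python sketch below does only that; it does not speak the RMA protocol, and the function name, defaults, and return format are assumptions made for this illustration.

```python
import socket
import time


def probe_tcp(host, port, attempts=100, timeout=2.0, pause=0.0):
    """Open and close `attempts` TCP connections to host:port.

    Returns a list of (attempt_index, error_text) for every failed attempt,
    so an empty list means every connection was established.
    """
    failures = []
    for i in range(attempts):
        try:
            # create_connection raises OSError (refused, reset, timed out, ...)
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError as exc:
            failures.append((i, repr(exc)))
        if pause:
            time.sleep(pause)
    return failures
```

For example, `probe_tcp("agent-host", 1055, attempts=500)` (hypothetical host name; use the port your agent is configured with) run from the HostMonitor machine gives a rough failure rate. A handful of resets or timeouts out of several hundred attempts would point at a firewall, antivirus, or load problem rather than at the tests themselves.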

Regards
Alex


Last edited by KS-Soft on Thu Aug 30, 2012 6:01 am; edited 1 time in total
KS-Soft



Joined: 03 Apr 2002
Posts: 12795
Location: USA

Posted: Thu Aug 30, 2012 6:00 am

PS
If you cannot fix the network, then in addition to the other solutions (see my previous posts), there are 2 more options:
- add a Backup RMA
- use Active RMA instead of Passive RMA.

Quote from the manual
================
Several procedures help to monitor networks over unreliable connections:

- Active RMA may store test results when network connection unexpectedly breaks off, it will try to reconnect to HostMonitor and send test results upon connection;

- HostMonitor uses more flexible schedule for the tests that should be performed by Active RMA, e.g. if test has to be performed immediately but Active RMA is not connected while it was connected several minutes ago, HostMonitor may wait for another connection up to 4 min before assigning Unknown status to the tests. Note: if you select the test item and click Refresh button, HostMonitor will not wait for connection, it will set Unknown status right away;

- For each active agent you may setup Backup Active RMA. Using this feature, HostMonitor is able to balance load between agents and use the backup agent when the primary one cannot establish communication channel.
================

Regards
Alex
Ross



Joined: 27 Aug 2012
Posts: 4

Posted: Thu Aug 30, 2012 6:01 am

Hi Alex

We have W2k8 R2 on our end and either W2k or 2k8 on the client side.
All are patched with the latest updates.

We use TMG, and clients use Windows plus a hardware firewall, nothing fancy though.

All ports are opened up as required.

We have around 3000 tests for various clients.

There are definitely no network issues: we actually host a virtual server for one client in our own data center, so no internet is needed for it to report back, and we still get the connection error over the same virtual switch.

We have been using Hostmon for 5 years or so and are only now having this RMA connection problem.

I triple-checked all agents and their dependencies and cannot find anything wrong.

For example, I have 100 tests for 1 client, and most tests show the correct result, such as ping, disk space, and other things.

Those same tests display a connection error for no reason, and then the next time they refresh they show the correct result.
This keeps happening for all of the clients.

I did try a few of the things you mentioned, but that only stops the notifications; it does not fix the false result for the test.

Any other ideas?

Thanks.
KS-Soft



Joined: 03 Apr 2002
Posts: 12795
Location: USA

Posted: Thu Aug 30, 2012 8:05 am

Quote:
Any other ideas?

Please read my previous answers:
- check the timeout
- use a Backup RMA
- use Active RMA
- etc.
Have you checked the timeout? I have asked this 3 times. What timeout exactly have you set?
Have you set up a Backup RMA or Active RMA?

Quote:
We have been using Hostmon for 5 years or so and are only now having this RMA connection problem.

Well, something was changed. What exactly was changed? A firmware update for the firewall? A TMG update? Increased network traffic? An increased number of tests?
Maybe somebody changed the HostMonitor timeouts?

Quote:
We have W2k8 R2 on our end and either W2k or 2k8 on the client side.

Windows 2000 Professional or Windows 2000 Server?

Quote:
it does not fix the false result for the test.

Why not? A Backup RMA and/or Active RMA can fix this.

Regards
Alex
pato



Joined: 25 Nov 2015
Posts: 2

Posted: Wed Nov 25, 2015 6:39 am

Hi, first post here.

I know this is an old thread, but this issue drove me crazy until I figured out where the problem was.

Nobody mentioned changing the f*king PORT that RMA uses for communication. I had this problem on an app server that used port 1055, the default. I think port 1055 was in use by another app (AV was OK, firewall disabled, etc.). So I changed the port to 1056 and everything works perfectly.
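
A quick way to test the "port already in use" theory is to try binding the port yourself. The following is a minimal Python sketch, not a KS-Soft tool; the function name is made up for this example, and the check is only meaningful when it runs on the agent machine while the RMA service is stopped (otherwise RMA itself holds the port).

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" (or a permissions error).
            return False
        return True
```

If `port_is_free(1055)` returns False with the agent stopped, some other application is holding the port, and moving RMA to another port (as was done here with 1056) or finding that application with `netstat -ano` are the obvious next steps.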



bye.

Pato.-
KS-Soft



Joined: 03 Apr 2002
Posts: 12795
Location: USA

Posted: Wed Nov 25, 2015 7:18 am

When HostMonitor cannot open a port for listening, it records an error in the log.
Also, there is the Auditing Tool (menu View->Auditing Tool); it's good to use it from time to time even if everything looks fine.

Regards
Alex
pato



Joined: 25 Nov 2015
Posts: 2

Posted: Wed Nov 25, 2015 8:26 am

KS-Soft wrote:
When HostMonitor cannot open a port for listening, it records an error in the log.
Also, there is the Auditing Tool (menu View->Auditing Tool); it's good to use it from time to time even if everything looks fine.

Regards
Alex


Thanks for the enlightenment! I have been working in IT for the past 10 years and I still forget to look at the logs, LOL!

[24/08/2015 08:06 p.m.] RMAService Error: cannot open port #1055. Probably port in use
[02/09/2015 08:07 p.m.] RMAService Error: cannot open port #1055. Probably port in use
[29/09/2015 09:10 p.m.] RMAService Error: cannot open port #1055. Probably port in use
[30/09/2015 08:06 p.m.] RMAService Error: cannot open port #1055. Probably port in use
[24/11/2015 08:09 p.m.] RMAService Error: cannot open port #1055. Probably port in use
[25/11/2015 08:44 a.m.] RMAService Error: cannot open port #1055. Probably port in use
[25/11/2015 08:46 a.m.] RMAService Error: cannot open port #1055. Probably port in use
[25/11/2015 08:47 a.m.] RMAService Error: cannot open port #1055. Probably port in use
[25/11/2015 08:47 a.m.] RMAService Error: cannot open port #1055. Probably port in use
[25/11/2015 08:48 a.m.] RMAService Error: cannot open port #1055. Probably port in use
[25/11/2015 08:53 a.m.] RMAService Error: cannot open port #1055. Probably port in use
[25/11/2015 08:54 a.m.] RMAService Error: cannot open port #8080. Probably port in use
[25/11/2015 08:54 a.m.] RMAService Error: cannot open port #1055. Probably port in use
[25/11/2015 08:59 a.m.] RMAService Error: cannot open port #1055. Probably port in use
[25/11/2015 09:24 a.m.] RMAService Error: cannot open port #1055. Probably port in use
[25/11/2015 09:24 a.m.] RMAService Error: cannot open port #1055. Probably port in use
[25/11/2015 09:25 a.m.] RMAService Error: cannot open port #1055. Probably port in use
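
Entries like the ones above are easy to tally per port with a short script. This Python sketch assumes the line format is exactly what the excerpt shows (it is not taken from any documented log specification), and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the excerpt above; adjust it if your log differs.
LINE_RE = re.compile(
    r"\[(?P<stamp>[^\]]+)\]\s+RMAService Error: cannot open port #(?P<port>\d+)"
)


def count_port_errors(lines):
    """Count "cannot open port" errors per port number."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[int(match.group("port"))] += 1
    return counts
```

Fed the 17 lines above, it would report 16 failures for port 1055 and 1 for port 8080, which makes it obvious which port to investigate.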
All times are GMT - 6 Hours