RMA: 301 - Cannot retrieve data

All questions related to installations, configurations and maintenance of Advanced Host Monitor (including additional tools such as RMA for Windows, RMA Manager, Web Service, RCC).
SplanK
Posts: 38
Joined: Wed Nov 21, 2007 1:33 pm

Post by SplanK »

Hello,
For weeks a CPU test has been working fine on two Windows 2008 R2 servers. These servers do the same job (RDP gateway server); they are split into their own DMZs, and HostMonitor has to cross a firewall to reach them (two different firewalls, one at each site). The LAN has unrestricted access into these DMZs but will allow traffic to pass back.

In the last few days I have noticed that, out of 17 tests, one test (CPU Usage) is repeatedly flipping between "RMA: 301 - Cannot retrieve data" and returning a result. It is the only test that seems to do it, but it is happening on both of these RDP gateway servers. We have other VLANs with similarly restrictive access setups with no issues.

The test is set to recheck every 5 minutes, and every 30 seconds while its status is "bad". Again, a lot of other tests are set up like this with no problem.

Thanks
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

Sounds like a timeout-related issue.
If you are using a passive RMA, try increasing the timeout intervals on both the RMA and HostMonitor sides.
SplanK
Posts: 38
Joined: Wed Nov 21, 2007 1:33 pm

Post by SplanK »

Thanks for your quick reply.

I have changed the timeout of the RMA app on the target side to 120 and, using RMA Manager, set the server side to 240. Still the same.

There does not appear to be a time out alteration for the CPU test unless I am looking in the wrong place?
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

What HostMonitor and RMA agent versions do you use?
Could you please check RMA agent logs? Any errors?
SplanK
Posts: 38
Joined: Wed Nov 21, 2007 1:33 pm

Post by SplanK »

HostMonitor 9.90 / RMA 4.11

No error log generated on agent.
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

HostMonitor 9.90 comes with RMA ver. 4.88.
Please use all components (HostMonitor, RMA, RCC etc.) from one installation package.
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

4.11? Maybe you checked the version of the rma_cfg.exe utility?

Regards
Alex
SplanK
Posts: 38
Joined: Wed Nov 21, 2007 1:33 pm

Post by SplanK »

aah, yes sorry! rma_cfg is 4.11

The agent is 4.88 as deployed by the 9.90 install file.
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

>There does not appear to be a time out alteration for the CPU test unless I am looking in the wrong place?

It depends on Windows RPC timeouts, 2 minutes by default I think.
"RMA: 301 - Cannot retrieve data" means HostMonitor is able to connect to the agent, but the agent cannot retrieve information from the target host within the specified timeout. So the problem should be related to the target host (system too busy or out of resources?) or to the connection between RMA and the target host.

Regards
Alex
SplanK
Posts: 38
Joined: Wed Nov 21, 2007 1:33 pm

Post by SplanK »

These 2 servers are very under utilised, these are also virtual machines which sit on a host which has an awful lot of spare headroom. In fact, these virtual guests and hosts spend most of their life idle (seems pointless having them to be honest!).

It is plausible that it could be the connection to the remote host; however, this is where my curveball comes in.

Looking at the logs, there does not seem to be any sort of pattern to the flip flop behaviour.

1. These 2 machines are in their own DMZ. Their settings on the firewall are the same as another DMZ, and I can poll another Server 2008 R2 server across the same firewalls with no problem.

My firewall logs suggest there is nothing being blocked or over saturated (at most, 20% load).

2. There are other tests (2 Active scripts, memory tests, hard drive space, service tests and TCP poll checks). All of which have been fine on both servers, it is just the CPU one that flip flops between RMA 301 and returning a result.

3. When I refresh the test, it does not take 2 minutes to time out; it's more like 2-5 seconds.
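
To back up the "no pattern" observation with numbers, the test history can be summarised by hour of day and by the time between status flips. The following is only a rough sketch: it assumes the results have already been exported into (timestamp, status) pairs, the function names are hypothetical, and it does not read any real HostMonitor log format.

```python
from collections import Counter
from datetime import datetime

BAD_PREFIX = "RMA: 301"  # status text reported in this thread


def failure_hours(records):
    """Count failed results per hour of day from (timestamp, status) pairs."""
    hours = Counter()
    for ts, status in records:
        if status.startswith(BAD_PREFIX):
            hours[ts.hour] += 1
    return hours


def flip_intervals(records):
    """Return the seconds elapsed between consecutive good/bad status changes."""
    intervals = []
    last_change = None
    prev_bad = None
    for ts, status in sorted(records):
        bad = status.startswith(BAD_PREFIX)
        if prev_bad is not None and bad != prev_bad:
            if last_change is not None:
                intervals.append((ts - last_change).total_seconds())
            last_change = ts
        prev_bad = bad
    return intervals


if __name__ == "__main__":
    # Made-up records, for illustration only.
    demo = [
        (datetime(2000, 1, 1, 9, 0, 0), "Ok"),
        (datetime(2000, 1, 1, 9, 5, 0), "RMA: 301 - Cannot retrieve data"),
        (datetime(2000, 1, 1, 9, 5, 30), "Ok"),
    ]
    print(dict(failure_hours(demo)))
    print(flip_intervals(demo))
```

If the failures cluster in particular hours, or the intervals repeat, that would point to something scheduled (backups, scans, another poller) rather than random behaviour.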
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

>There are other tests (2 Active scripts, memory tests, hard drive space, service tests and TCP poll checks). All of which have been fine on both servers, it is just the CPU one that flip flops between RMA 301 and returning a result.

Yes, this means the network works fine, the RPC service is available, etc.
Then it sounds like the problem relates to the Performance Counters DLLs. But as we know it works fine on Windows 2008 R2 (or does not work at all if somebody disabled some counters)...

>These 2 servers are very under utilised, these are also virtual machines which sit on a host which has an awful lot of spare headroom. In fact, these virtual guests and hosts spend most of their life idle (seems pointless having them to be honest!).

Could you check handle and thread usage on these systems?

Regards
Alex
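
One way to follow up on the performance counter and handle/thread suggestions is to sample the counters directly on the affected server with the standard Windows typeperf tool and time how long the query takes. This is a minimal sketch, not part of HostMonitor or RMA: the helper names are hypothetical, and it is only an assumption that the CPU Usage test reads counters like these.

```python
import subprocess
import sys
import time

# Standard Windows counter paths. Whether the RMA CPU Usage test reads
# exactly these counters is an assumption.
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\Process(_Total)\Handle Count",
    r"\Process(_Total)\Thread Count",
]


def build_typeperf_command(counters, samples=5, interval=1):
    """Build a typeperf command: `samples` readings, `interval` seconds apart."""
    return ["typeperf", *counters, "-sc", str(samples), "-si", str(interval)]


def run_once(counters, timeout=60):
    """Run typeperf once and return (exit_code, elapsed_seconds, output)."""
    start = time.monotonic()
    proc = subprocess.run(
        build_typeperf_command(counters),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, time.monotonic() - start, proc.stdout


if __name__ == "__main__":
    if sys.platform == "win32":
        code, elapsed, output = run_once(COUNTERS)
        print(f"exit code {code}, took {elapsed:.1f}s")
        print(output)
    else:
        print("typeperf is only available on Windows; nothing to run")
```

Running this repeatedly on one of the RDP gateway servers would show whether the CPU counter itself fails or stalls intermittently, independent of the agent, and whether handle or thread counts are unusually high at those moments.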