KS-Soft. Network Management Solutions

Crash Issue in 11.00 - Related to ActiveRMA

 
eddymicro



Joined: 13 Nov 2002
Posts: 85

Posted: Mon Jan 08, 2018 5:27 pm    Post subject: Crash Issue in 11.00 - Related to ActiveRMA

Hello,

After upgrading to 11.00, HostMonitor ran fine for one week. Today the program hung, and after that we could not restart it. We finally determined that if we remove the ActiveRMA section from hostmon.ini, the program works OK (without any Active RMA agents). If we put this section back, the program does not work at all and we have to use Task Manager to kill it. Can you suggest how we can fix this so we can use our Active RMA agents?


Section we removed to allow the program to work:
[ActiveRMAServer]
Enabled=1
Port=5056
Timeout=60
UpdateTimeout=60
TestTimeout=240
AcceptAnyIP=1
AcceptedList=
IPMarks=
LogAccepted=0
LogRejected=1

Hang Message from Event Log

The program hostmon.exe version 11.0.0.1583 stopped interacting with Windows and was closed. To see if more information about the problem is available, check the problem history in the Action Center control panel.
Process ID: 289c
Start Time: 01d388afae7dd1a3
Termination Time: 4294967295
Application Path: C:\Program Files (x86)\HostMonitor\hostmon.exe
Report Id: 56766995-f4a3-11e7-811f-0cc47a40aa2a
Faulting package full name:
Faulting package-relative application ID:
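
The Start Time in the event above looks like a Windows FILETIME written in hexadecimal (100-nanosecond ticks since 1601-01-01 UTC). That reading is an assumption, not something stated in the thread. Under that assumption, a minimal Python sketch for decoding it (the function name is made up for illustration):

```python
from datetime import datetime, timedelta, timezone

# Assumption: the event-log "Start Time" is a FILETIME in hex, i.e. the number
# of 100-nanosecond ticks since 1601-01-01 00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_hex_to_utc(value: str) -> datetime:
    """Convert a hexadecimal FILETIME string to a timezone-aware UTC datetime."""
    ticks = int(value, 16)
    # 10 ticks per microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


if __name__ == "__main__":
    print(filetime_hex_to_utc("01d388afae7dd1a3").isoformat())
```

For the value above this gives 18:37 UTC on 8 January 2018, which would line up with the 1:37 PM "Monitor started" entry in the syslog quoted later in the thread if the server clock is five hours behind UTC.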
KS-Soft



Joined: 03 Apr 2002
Posts: 11882
Location: USA

Posted: Mon Jan 08, 2018 5:37 pm

We cannot reproduce the problem.
Windows version?
Service Pack?
Antivirus monitor?
Active RMA version? 6.00?
Do RCC and/or Web Service still work fine?
Does RMA Manager work fine and get connections from Active RMA agents?
Have you started the Auditing Tool? Any errors?
Any errors in the HM system log file (syslog.htm by default)?

Maybe some other software (besides RMA) is trying to connect to port 5056? It should not be a problem, but could you try changing the port number on HostMonitor, then on the agents?

Regards
Alex
eddymicro



Joined: 13 Nov 2002
Posts: 85

Posted: Mon Jan 08, 2018 8:40 pm

Windows 2012 R2 Standard, Xeon E5-2650, 128 GB RAM. We have been running HostMonitor on this server for more than 2 years without any issues. RMA Manager does connect fine (we have over 500 active agents). Active RMA version is 6.00.

We changed no settings at all before or after the crash, and the 500 agents have been running for a very long time. So I do not understand why we see all the socket errors.

Thanks for suggesting that we look at the syslog; it has clues in it. The example below seems to repeat for all of the Active RMA agents. The sequence below shows exactly what happened when we started HostMonitor after the crash. I want to point out that we rebooted the server a few times and disabled the watchdog and the web interface, and none of that helped.


1/8/2018 1:37:08 PM Monitor started
1/8/2018 1:37:08 PM RCI enabled
1/8/2018 1:37:08 PM RMA Server enabled
1/8/2018 1:37:13 PM RCI connection established. Operator: jbassig IP: 192.1.1.26
1/8/2018 1:37:17 PM 192.1.2.111: Invalid request packet received (wrong password?)
1/8/2018 1:37:17 PM WatchDog connection closed (Connected at 1/8/2018 1:37:17 PM).
1/8/2018 1:37:19 PM RMA Error: Action "Start service" was not executed because agent "PrintserverV01" was not connected
1/8/2018 1:37:24 PM Action error: request sent, server reply "OK 76533327 76533328 76533329 76533330 76533331 76533332 76533333 76533334 76533335 76533336 7653333" (Action "HTTP request", Test "PrintserverV02 Process wdRun (Watch Dire.." [15951])
1/8/2018 1:37:36 PM 10.1.0.103: Connection established (RMA StorageX04) Windows socket error: An established connection was aborted by the software in your host machine (10053), on API 'send'
1/8/2018 1:37:40 PM 192.1.2.129: Connection established (RMA CallCenter4) Windows socket error: An established connection was aborted by the software in your host machine (10053), on API 'send'
1/8/2018 1:38:30 PM 10.1.0.69: Connection established (RMA VServerX08) Windows socket error: An established connection was aborted by the software in your host machine (10053), on API 'send'
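
To see how many agents hit the 10053 error, lines like these can be tallied with a short script. This is a sketch only: it assumes the log has been saved as plain text in exactly the shape pasted above (the real syslog.htm is an HTML file and may need its tags stripped first), and the regular expression and function name are illustrative.

```python
import re
from collections import Counter

# Matches lines shaped like the ones pasted in the post, e.g.
# "1/8/2018 1:37:36 PM 10.1.0.103: Connection established (RMA StorageX04)
#  Windows socket error: ... (10053), on API 'send'"
LINE_RE = re.compile(
    r"^(?P<when>\d+/\d+/\d+ \d+:\d+:\d+ [AP]M) "
    r"(?P<ip>\d+\.\d+\.\d+\.\d+): Connection established "
    r"\(RMA (?P<agent>[^)]+)\) Windows socket error: .*"
    r"\((?P<code>\d+)\), on API '(?P<api>\w+)'"
)


def socket_errors_by_agent(lines):
    """Count socket-error lines per (agent name, error code)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match["agent"], int(match["code"]))] += 1
    return counts
```

Lines that do not match the pattern (for example "Monitor started") are simply skipped, so the whole log can be fed in unfiltered.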
eddymicro



Joined: 13 Nov 2002
Posts: 85

Posted: Mon Jan 08, 2018 8:52 pm

Alex

I can reproduce the problem: I just change Enabled=0 to Enabled=1 in the ActiveRMA section. Then I manually start HostMonitor and I get the same entries in the syslog. I thought maybe we had tried to start HostMonitor twice, but I am sure that is not the cause because I can easily duplicate the issue. As I said before, we did not change anything. Here is an example from a few minutes ago:

1/8/2018 1:37:08 PM Monitor started
1/8/2018 1:37:08 PM RCI enabled
1/8/2018 1:37:08 PM RMA Server enabled
1/8/2018 1:37:13 PM RCI connection established. Operator: jbassig IP: 192.1.1.26
1/8/2018 1:37:17 PM 192.1.2.111: Invalid request packet received (wrong password?)
1/8/2018 1:37:17 PM WatchDog connection closed (Connected at 1/8/2018 1:37:17 PM).
1/8/2018 1:37:19 PM RMA Error: Action "Start service" was not executed because agent "PrintserverV01" was not connected
1/8/2018 1:37:24 PM Action error: request sent, server reply "OK 76533327 76533328 76533329 76533330 76533331 76533332 76533333 76533334 76533335 76533336 7653333" (Action "HTTP request", Test "PrintserverV02 Process wdRun (Watch Dire.." [15951])
1/8/2018 1:37:36 PM 10.1.0.103: Connection established (RMA StorageX04) Windows socket error: An established connection was aborted by the software in your host machine (10053), on API 'send'
1/8/2018 1:37:40 PM 192.1.2.129: Connection established (RMA CallCenter4) Windows socket error: An established connection was aborted by the software in your host machine (10053), on API 'send'
1/8/2018 1:38:30 PM 10.1.0.69: Connection established (RMA VServerX08) Windows socket error: An established connection was aborted by the software in your host machine (10053), on API 'send'
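
The Enabled=0 / Enabled=1 toggle used to reproduce the hang can be scripted so that it is repeatable. The sketch below uses Python's standard configparser module; this is an assumption about tooling, not something HostMonitor provides. It presumes hostmon.ini parses as a plain INI file, and rewriting the file this way drops any comments in it, so try it on a copy first and only while HostMonitor is stopped. The function name is made up for illustration.

```python
import configparser


def set_active_rma_enabled(ini_path: str, enabled: bool) -> None:
    """Set Enabled=0/1 in the [ActiveRMAServer] section of an INI file."""
    # interpolation=None: treat '%' characters in values literally.
    parser = configparser.ConfigParser(interpolation=None)
    # Keep key names such as "AcceptAnyIP" exactly as written
    # (configparser lowercases them by default).
    parser.optionxform = str
    if not parser.read(ini_path):
        raise FileNotFoundError(ini_path)
    if not parser.has_section("ActiveRMAServer"):
        raise KeyError("no [ActiveRMAServer] section in " + ini_path)
    parser.set("ActiveRMAServer", "Enabled", "1" if enabled else "0")
    with open(ini_path, "w") as handle:
        # Write "Key=Value" without spaces, matching the layout in the post.
        parser.write(handle, space_around_delimiters=False)
```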
eddymicro



Joined: 13 Nov 2002
Posts: 85

Posted: Mon Jan 08, 2018 9:07 pm

I changed the port for accepting Active RMA connections and it seems to have worked. I ran a port scanner and it does not show anything on 5056, which is the port we have used for many years. So I do not understand why the hang of HostMonitor caused this chain of events and why the default port no longer works. I am going to try to move all the agents to a different port right now and see if that totally solves the problem. Thanks
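
A port scan shows whether something answers on 5056. A complementary local check, run on the HostMonitor machine while HostMonitor is stopped, is to try binding the port: if the bind fails, some other process is holding it. This is a minimal sketch; the function name and the default bind address are assumptions for illustration.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket could bind host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use" (or a permissions problem).
            return False
    return True


if __name__ == "__main__":
    print("5056 free:", port_is_free(5056))
```

A result of True only says that nothing local holds the port at that moment; it says nothing about remote systems trying to connect to it.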
eddymicro



Joined: 13 Nov 2002
Posts: 85

Posted: Mon Jan 08, 2018 9:11 pm

Alex

It will not let me update the port for a group selection because I have a weak password. I do not want to update 500 agents one at a time. Is there a way to bypass this setting?
KS-Soft



Joined: 03 Apr 2002
Posts: 11882
Location: USA

Posted: Tue Jan 09, 2018 5:10 am

eddymicro wrote:
It will not let me update the port for a group selection because I have a weak password. I do not want to update 500 agents one at a time. Is there a way to bypass this setting?

1) So why do you not want to set a strong password? RMA Manager can do that for a set of agents, so you may set a good password and change the port at the same time.
I would suggest applying the new port (and password) to a set of RMA agents, but not to all of them at once. Maybe the problem is caused by some "bad" agent. If you apply the new settings to 50 RMA agents, then to another 50, then to another set... and then HostMonitor crashes, then the problem probably relates to the last set of 50 RMAs.
(If you like weak passwords, you may tell RMA Manager to allow this using the StrongPswd=0 parameter; please check the What's New section for details.)

2) I think it's better to use e-mail for such conversations (support@ks-soft.net). Now everybody knows you are using weak passwords, and somebody may check your old posts, try to find your HostMonitor system IP, and try to break into your system using brute force on a weak RCC password. I hope the HostMonitor system is protected by a firewall and the HostMonitor settings do not allow RCC connections from any IP?

3) Could you please send the HM system log to support@ks-soft.net (or better, send all records related to the last couple of days)?

4) >I ran a port scanner and it does not show anything on 5056 which is what we have used for many many years.
Port scanner? Scanning the HostMonitor system? I think this does not make sense. Only HostMonitor uses this port on the HostMonitor system.
But maybe some remote system (with an unknown IP) is trying to connect to this port on the HostMonitor system. In that case you should see some data in the HM system log file.

5) >So I do not understand why the hang of hostmonitor caused this chain of events
Sorry, I do not understand. What "chain of events"?
I thought some event causes the hang, while the hang does not cause anything else.
If I am wrong, please explain what exactly happens (please use e-mail: support@ks-soft.net).

Regards
Alex


Last edited by KS-Soft on Tue Jan 09, 2018 8:39 am; edited 1 time in total
KS-Soft



Joined: 03 Apr 2002
Posts: 11882
Location: USA

Posted: Tue Jan 09, 2018 5:17 am

PS
Maybe the crash is caused not by the RMA connection; maybe it is caused by some test performed by an agent, or by some action triggered by a test performed by an agent.
So maybe you don't need to change the port. You may try to stop monitoring and disable alerts (menu Monitoring), set the Active RMA port back to 5056, and wait while the agents connect to HostMonitor.
Then you may try to start monitoring. If everything works fine, then try to enable actions.
Or better, try to Pause all test items, then resume tests in sets of 100 (just for example). This way we may find out which test item leads to the problem.

Regards
Alex
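
The narrowing-down approach suggested in this thread (apply a change to one set of agents or tests at a time, and suspect the last set if the crash returns) amounts to working through a list in fixed-size batches. Below is a small sketch of the batching itself, assuming a hypothetical list of agent or test names exported by hand; it does not call any HostMonitor API.

```python
from typing import Iterable, Iterator, List


def in_batches(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


if __name__ == "__main__":
    agents = [f"agent-{n:03d}" for n in range(1, 251)]  # placeholder names
    for number, batch in enumerate(in_batches(agents, 100), start=1):
        # Resume or reconfigure this batch by hand, then watch for a hang.
        print(f"batch {number}: {batch[0]} .. {batch[-1]} ({len(batch)} items)")
```

Once a batch is found that brings the hang back, the same function can be rerun on just that batch with a smaller size to close in on the single item responsible.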
eddymicro



Joined: 13 Nov 2002
Posts: 85

Posted: Tue Jan 09, 2018 7:02 am

Alex, these are all great suggestions. We will follow them, including setting a strong password, and send you an email with an update. Thanks.
KS-Soft



Joined: 03 Apr 2002
Posts: 11882
Location: USA

Posted: Fri Jan 19, 2018 12:30 pm

btw: the problem is fixed.
If somebody else has the same problem, an update is available by request.
And we will probably upload a new version next week.

Regards
Alex
All times are GMT - 6 Hours