HowTo run 2 stand-alone, synchronised instances of HM

All questions related to installation, configuration and maintenance of Advanced Host Monitor (including additional tools such as RMA for Windows, RMA Manager, Web Service, RCC).
ataudte
Posts: 28
Joined: Wed Apr 23, 2008 3:40 am
Location: FFM, Germany

Post by ataudte »

Hello KS-Soft Team,

I have read through the forum and searched for various approaches, but I haven't found an acceptable workaround for my problem.

I need 2 instances of HostMonitor: a primary HM, which monitors my network and sends alerts, and a secondary HM, which watches the primary one. If my primary installation of HM goes down, the backup HM takes over the monitoring, and when my primary HM is running again, the secondary HM "goes to sleep".

That part is not an issue. I have to install 2 instances of HM on 2 different systems and define a process test in each HM to monitor the other, so that both watch each other.

Now the problem with this solution:
I have to insert every test into the primary AND the secondary HM. At the moment there are almost 300 tests running. OK, I could copy all *.hml, *.lst and *.ini files from one HM's folder to the other, but this only works properly if the other one isn't running. What's more, some parameters inside the *.ini files differ. I also tested the Replicator, but this tool doesn't help me much.

I need a way to set up 2 instances of HM that synchronise with each other, so that when I insert a new test into my primary installation, the backup HM inserts it into its own folder hierarchy.

The only approach that works so far is to export all tests (to a text file) and import that text file on the other side. But this doesn't work if there are subfolders in the tree that haven't been created in the tree of the other HM. I had to write a script that looks for the folders in the text file and inserts a "CreateFolder" command for every folder found at the beginning of the file. That way the imported file first creates the folder hierarchy and then inserts the tests.
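
The folder step described above could be sketched roughly as follows. This is only an illustration in Python, not the script actually used: it assumes that every exported test block contains a line of the form `DestFolder = Root\Some\Folder\` and that a `CreateFolder <path>` line is accepted at the top of the import file. The parameter name `DestFolder` and the function name are assumptions, so check them against a real export before relying on this.

```python
import re

# Assumption: each exported test carries a line such as
#   DestFolder = Root\Servers\Web\
DEST_FOLDER = re.compile(r"^\s*DestFolder\s*=\s*(.+?)\s*$", re.IGNORECASE)


def add_create_folder_commands(export_text):
    """Return the export text with one CreateFolder line per distinct folder
    prepended, in the order in which the folders first appear."""
    folders = []
    seen = set()  # lower-cased paths, so "Root\A\" and "root\a\" count once
    for line in export_text.splitlines():
        match = DEST_FOLDER.match(line)
        if match and match.group(1).lower() not in seen:
            seen.add(match.group(1).lower())
            folders.append(match.group(1))
    commands = ["CreateFolder " + folder for folder in folders]
    return "\n".join(commands + [export_text])
```

In use, the text of the export file would be read, passed through `add_create_folder_commands`, and written to a new file that is then imported into the backup HM.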

Ok, it works, but I'm looking for a solid solution.

Any idea!?

Thank you for your efforts!

ataudte
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am

Post by KS-Soft Europe »

Probably you may use the following approach. Set up a 2nd instance of HostMonitor to monitor the primary one. E.g. you may use a TCP test to check the RCI port (1054), or you may use the "File Availability" test method to monitor reports or log files. When a failure is detected (e.g. the primary HostMonitor does not answer on the RCI port), the 2nd instance may send an e-mail to the admin and start an HMS script with a single action (LoadTestList filename) to load the primary list of tests. To make this work properly, select the "Auto save TestList after any changes" option from the "Autosave options" dropdown on the "Preferences" tab of the "Options" dialog for the first instance. After that you may use the built-in "Scheduler" on the first HostMonitor instance to copy the current test list on a regular basis to some network resource accessible by the second instance. In case of failure, the second instance will be able to load a fresh version of the test list.

Additionally, you may load a test (or tests) that checks when the first instance has come up again. When the second instance detects that the first instance is running, it should load its initial test list.
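
For readers who want to try the same kind of check outside HostMonitor, the TCP test on the RCI port amounts to opening a connection and seeing whether it succeeds. The following is a minimal Python sketch of that idea, not HostMonitor's own implementation. The function name and the timeout are made up for illustration; port 1054 is the RCI port mentioned above.

```python
import socket


def rci_port_answers(host, port=1054, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError on refusal, timeout or DNS failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A result of False corresponds to the failure case described above, in which the second instance would load the primary test list.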

Regards,
Max
ataudte
Posts: 28
Joined: Wed Apr 23, 2008 3:40 am
Location: FFM, Germany

Post by ataudte »

Good Morning Max (it is a quarter to 8 am here),

first of all, many thanks for your reply.

Before I posted my topic, I had also tested this variant. It is really simple to define a test that starts the HMS script, which loads the *.hml file so that the backup takes over the job of the primary HM.
But there is still the problem with synchronization. It is easy to transfer the tests of the primary HM to the secondary one (via Export / Import or via LoadTestList), but what about the configuration? If a test uses a schedule that is not defined in the backup HM, the test never runs. Copying the relevant HM files (*.ini, *.lst, etc.) is not a solution either, because HM needs these files at startup to know the RMAs, schedules and so on.

I have one alternative:
- define a test on the primary HM that copies the relevant files to the backup system
- place the *.hml file in a directory that both systems can reach
- define a test on the secondary HM that watches the primary one and, if this test becomes "Bad", starts a script that runs "LoadTestList" and executes a script to restart the secondary HM
The backup HM should then start with the last known *.hml file, which is located somewhere on the network, and load all configurations and other lists (RMAs, schedules, etc.) that were copied from the primary HM's folder to the backup HM's folder.

This still needs clarification. I have to discuss with our administration which solution we want to pursue.

I was just looking for a tool or script to synchronize 2 stand-alone instances of HM, but this seems to be a topic for the "Wish List" ;)

Thank you all the same!

ataudte
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

We plan to implement some options for cluster monitoring. Maybe it can be done in version 8, maybe 9 :roll:

Regards
Alex
ataudte
Posts: 28
Joined: Wed Apr 23, 2008 3:40 am
Location: FFM, Germany

Post by ataudte »

Hello everyone!

I wish you great success with this project.

Here is my solution to solve this problem:

### At the backup system ###
Install HM the same way as on the primary system (as a service!), but do not start the HostMon service.
Create 2 executable scripts: the first one watches the primary HM, and the second one fetches all relevant files from the primary HM.
AutoIt is a nice scripting language (see http://www.autoitscript.com/). It is free, easy to learn and has a great development environment.

Here is the AutoIt script that checks the HostMon service on the primary system. It starts the backup HM if the other system is down or the HostMon service is not running:

Code:

; ###########################################################################
; ### Script to monitor the primary HostMonitor                           ###
; ### and to start the backup HostMonitor on error                        ###
; ###                                                                     ###
; ### pillow-laced by ATA                                                 ###
; ###########################################################################

#include "ServiceControl.au3"

; ###########################################################################
; Some Variables
; ###########################################################################
$pc          = "computer"
$user        = "Administrator"
$pw          = "password"
; Name of Service is case sensitive
$remote_srv  = "HostMonService"    ; monitored Service on Remote System
$srv         = "HostMonService"    ; local Service on error
; Time
$wait        = 30000   ; milliseconds (numeric, as expected by Sleep)
                      ; look here to convert values:
                      ; http://www.convertworld.com/de/zeit/Millisekunden.html

; ###########################################################################
; The Program
; ###########################################################################

While 1
	ProgressOn("Checking Primary HostMonitor", "Please wait...", "Starting...", 50, 50, 16)
	Sleep(1000)
	; -----------------------------------------------------------------------
	; Timestamp & Temp File
	; -----------------------------------------------------------------------
	$mydate      = @YEAR & @MON & @MDAY
    $mytime      = @HOUR & @MIN & @SEC
    $mytimestamp = $mydate & "-" & $mytime
    $tmp_conn    = "conntmp_" & $mydate & $mytime & ".txt" 
	$tmp_srv     = "srvtmp_" & $mydate & $mytime & ".txt" 
	; -----------------------------------------------------------------------
	; Logging
	; -----------------------------------------------------------------------
    ProgressSet(10,"Start Logging...")
	Sleep(1000)
	$logfolder   = "srv_logs\"
    DirCreate($logfolder)
    $logfile     = $logfolder & "srvlog-" & $mydate & ".txt"
    RunWait(@ComSpec & " /c " & "echo. >>" & $logfile, "", @SW_HIDE)
	RunWait(@ComSpec & " /c " & "echo LOGS - " & $mytime & " >>" & $logfile, "", @SW_HIDE)
	RunWait(@ComSpec & " /c " & "echo ------------- >>" & $logfile, "", @SW_HIDE)
	; -----------------------------------------------------------------------
	; Test Connection
	; -----------------------------------------------------------------------
	ProgressSet(20,"Checking Connection...")
	Sleep(1000)
	If not chkConnection($pc) Then
		; Connection to System failed
		myError("Host " & $pc & " unreachable.", "Connection Error!")
		startBackupSys($srv)
	Else
		; -------------------------------------------------------------------
		; Connect to remote IPC Share
		; -------------------------------------------------------------------
		ProgressSet(30,"Connect to remote system...")
		Sleep(1000)
		; Sending Login
		RunWait(@ComSpec & " /c " & "net use \\" & $pc & "\IPC$" & _ 
		        " /user:" & $pc & "\" & $user & " " & _ 
			    $pw & " >" & $tmp_conn, "", @SW_HIDE)
		$file_conn = FileOpen($tmp_conn, 0)
		If $file_conn = -1 Then
			; Error with the Temp File
	     	myError("No File: " & $tmp_conn, "Unable to open Temp File!")
		Else
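			$line_conn = FileReadLine($file_conn, 1)
			; "Not" binds tighter than "=", so "Not $x = ..." never detects a failure; use "<>"
			; Note: the text of this message depends on the Windows display language
			If $line_conn <> "The command completed successfully." Then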
				myError("Connection to " & $pc & _ 
						" failed!", "Connection failed!")
			Else
				ProgressSet(40,"Connection established...")
		        Sleep(1000)
				; -----------------------------------------------------------
				; Query service for status
				; -----------------------------------------------------------
				ProgressSet(50,"Sending Query...")
				Sleep(1000)
				RunWait(@ComSpec & " /c " & "sc \\" & $pc & " query " & _ 
						$remote_srv & " > " & $tmp_srv, "", @SW_HIDE) 
				; -----------------------------------------------------------
				; Analyse Temp File
				; -----------------------------------------------------------
				ProgressSet(60,"Analysing Query...")
				Sleep(1000)
				$file_srv = FileOpen($tmp_srv, 0)
				If $file_srv = -1 Then
					; Error with the Temp File
					myError("No File: " & $tmp_srv, "Unable to open Temp File!")
				Else
					$line_srv = FileReadLine($file_srv, 4)
					If $line_srv = "        STATE              : 2  START_PENDING " Then
						; Go to sleep and try again
						myError("Srv Starting: " & $remote_srv, $remote_srv & _ 
								" at " & $pc & " starting")
						; Exit
					ElseIf $line_srv = "        STATE              : 4  RUNNING " Then
						; Everything is okay, go to sleep
						ProgressSet(70,"Primary Service is running...")
						Sleep(1000)
						stopBackupSys($srv)
						;Exit
					ElseIf $line_srv = "        STATE              : 1  STOPPED " Then
						; Start the Backup System
						myError("Srv Stopped: " & $remote_srv, $remote_srv & _ 
								" at " & $pc & " is stopped")
						startBackupSys($srv)
						;Exit
					Else
						; Absolutely necessary to start the Backup System
						myError("Srv Error: " & $remote_srv, "Error with " & _ 
								$remote_srv & @CRLF & "at " & $pc)
						startBackupSys($srv)
						;Exit
					EndIf ; EndIf: $line_srv
				EndIf ; EndIf: $file_srv = -1
				FileClose($file_srv)
			EndIf ; Endif: $line_conn 
		EndIf ; Endif: $file_conn = -1
		; -------------------------------------------------------------------
		; Close Connection
		; -------------------------------------------------------------------
		FileClose($file_conn)
		mySleep()
	; EndIf of "Test Connection"
	EndIf
WEnd

; ###########################################################################
; Functions
; ###########################################################################

; ---------------------------------------------------------------------------
; Function to check the Connection to the Remote Computer
; ---------------------------------------------------------------------------
Func chkConnection($remotepc)
    If RunWait('cmd /c ping -n 1 -l 32 ' & $remotepc & _ 
		       '| find /n " " | find "[4]" | find "=32"', "", @SW_HIDE) == 0 Then
        Return 1
    Else
        Return 0
    EndIf
EndFunc

; ---------------------------------------------------------------------------
; Function for Error Management (no @CRLF in mainmsg!!!)
; ---------------------------------------------------------------------------
Func myError($mainmsg, $submsg)
	RunWait(@ComSpec & " /c " & "echo " & $mainmsg & _ 
	        " >>" & $logfile, "", @SW_HIDE)
	ProgressSet(100, $submsg, $mainmsg)
	Sleep(5000)
EndFunc

; ---------------------------------------------------------------------------
; Function to start the Backup System
; ---------------------------------------------------------------------------
Func startBackupSys($start_srv)
	; if already running, do nothing
	If _ServiceExists("", $start_srv) Then
		If not _ServiceRunning("", $start_srv) Then
			If _StartService("", $start_srv) Then
				myError("Backup HostMonitor started", "Primary Srv NOT running")
			Else
				myError("Could not start Backup HostMonitor.", "Primary Srv NOT running")
			EndIf
		EndIf
	Else
		myError("Backup HostMonitor Service does not exist.", "Backup Srv NOT existing" & _ 
				@CRLF & "Primary Srv NOT running")
	EndIf
EndFunc

; ---------------------------------------------------------------------------
; Function to stop the Backup System
; ---------------------------------------------------------------------------
Func stopBackupSys($stop_srv)
	; if doesn't run, do nothing
	If _ServiceExists("", $stop_srv) Then
		If _ServiceRunning("", $stop_srv) Then
			If _StopService("", $stop_srv) Then
				myError("Backup HostMonitor stopped", "Primary Srv running")
			Else
				myError("Could not stop Backup HostMonitor.", "Primary Srv running")
			EndIf
		EndIf
	Else
		; stopBackupSys is only called while the primary service is running
		myError("Backup HostMonitor Service does not exist.", "Backup Srv NOT existing" & _ 
				@CRLF & "Primary Srv running")
	EndIf
EndFunc

; ---------------------------------------------------------------------------
; Function to cut connection and go to sleep
; ---------------------------------------------------------------------------
Func mySleep()
	; Close Network session
	RunWait(@ComSpec & " /c " & "net use" & _ 
	        " \\" & $pc & "\IPC$ /delete", "", @SW_HIDE)
	ProgressSet(100, "Closing Connection...", "End of Task!")
	Sleep(5000)
	ProgressOff()
    ; Delete Temp Files (FileDelete avoids spawning cmd.exe for each file)
	If FileExists($tmp_conn) Then
		FileDelete($tmp_conn)
	EndIf
	If FileExists($tmp_srv) Then
		FileDelete($tmp_srv)
	EndIf
	Sleep($wait)
	;Exit
EndFunc

The included "ServiceControl.au3" can be found on the AutoIt website, or have a look here http://www.autoitscript.com/forum/index ... topic=6487
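
The script above compares line 4 of the `sc query` output with exact strings, including the leading and trailing spaces, so any change in spacing would make it fall through to the "start the backup" branch. A more tolerant way to read the state is to search for the STATE line with a pattern. The sketch below shows that idea in Python, only to illustrate the logic. The function names are hypothetical, and the three actions mirror the branches of the script (START_PENDING: wait, RUNNING: stop the backup, anything else: start the backup).

```python
import re

# Matches e.g. "        STATE              : 4  RUNNING "
STATE_LINE = re.compile(r"^\s*STATE\s*:\s*(\d+)\s+(\w+)", re.MULTILINE)


def service_state(sc_output):
    """Return the state name from `sc query` output, or None if absent."""
    match = STATE_LINE.search(sc_output)
    return match.group(2) if match else None


def backup_action(state):
    """Map the primary service state to the action taken by the script."""
    if state == "START_PENDING":
        return "wait"
    if state == "RUNNING":
        return "stop_backup"
    # STOPPED, any other state, or no STATE line at all
    return "start_backup"
```

In AutoIt the same pattern could be applied with StringRegExp to the contents of the temp file instead of comparing a single fixed line.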


This is the AutoIt script that fetches all relevant files (*.hml, *.ini, your own scripts, and so on), so that the backup HM can use the current files:

Code:

; ###########################################################################
; ### Copy relevant files from primary to backup HostMonitor              ###
; ###                                                                     ###
; ### pillow-laced by ataudte                                             ###
; ###########################################################################


; ###########################################################################
; Some Variables
; ###########################################################################
; Connection
$pc          = "computer"
$user        = "Administrator"
$pw          = "password"
$connectsrc  = "\\" & $pc & "\IPC$"
$hmsource    = "\\" & $pc & "\C$\PROGRA~1\HostMonitor\"
$hmdest      = "C:\PROGRA~1\HostMonitor\" ; trailing backslash needed: $folder and $filetype are appended directly
; Time
$wait        = 30000   ; milliseconds (numeric, as expected by Sleep)
                      ; look here to convert values:
                      ; http://www.convertworld.com/de/zeit/Millisekunden.html

; ###########################################################################
; The Program
; ###########################################################################

While 1
	ProgressOn("Synchronizing HostMonitor", "Please wait...", "Starting...", 50, 50, 16)
	Sleep(1000)
	; -----------------------------------------------------------------------
	; Timestamp & Temp File
	; -----------------------------------------------------------------------
	$mydate      = @YEAR & @MON & @MDAY
    $mytime      = @HOUR & @MIN & @SEC
    $mytimestamp = $mydate & "-" & $mytime
    $tmp_file    = "copytmp_" & $mydate & $mytime & ".txt" 
	; -----------------------------------------------------------------------
	; Logging
	; -----------------------------------------------------------------------
    ProgressSet(10,"Start Logging...")
	Sleep(1000)
	$logfolder   = "copy_logs\"
    DirCreate($logfolder)
    $logfile     = $logfolder & "copylog-" & $mydate & ".txt"
    RunWait(@ComSpec & " /c " & "echo. >>" & $logfile, "", @SW_HIDE)
	RunWait(@ComSpec & " /c " & "echo LOGS - " & $mytime & " >>" & $logfile, "", @SW_HIDE)
	RunWait(@ComSpec & " /c " & "echo ------------- >>" & $logfile, "", @SW_HIDE)
	; -----------------------------------------------------------------------
	; Test Connection
	; -----------------------------------------------------------------------
	ProgressSet(20,"Checking Connection...")
	Sleep(1000)
	If not chkConnection($pc) Then
		; Connection to System failed
		myError("Host " & $pc & " unreachable.", "Connection Error!")
	Else
		; -------------------------------------------------------------------
		; Connect to remote IPC Share
		; -------------------------------------------------------------------
		ProgressSet(25,"Connect to remote system...")
		Sleep(1000)
		RunWait(@ComSpec & " /c " & "net use \\" & $pc & "\IPC$" & _ 
		        " /user:" & $pc & "\" & $user & " " & _ 
			    $pw & " >" & $tmp_file, "", @SW_HIDE)
		$file = FileOpen($tmp_file, 0)
		If $file = -1 Then
			; Error with the Temp File
	     	myError("No File: " & $tmp_file, "Unable to open Temp File!")
		Else
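			$line = FileReadLine($file, 1)
			; "Not" binds tighter than "=", so "Not $x = ..." never detects a failure; use "<>"
			; Note: the text of this message depends on the Windows display language
			If $line <> "The command completed successfully." Then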
				myError("Connection to " & $pc & _ 
						" failed!", "Connection failed!")
			Else
				ProgressSet(30,"Connection established...")
				Sleep(1000)
				; -----------------------------------------------------------
				; Check Source
				; -----------------------------------------------------------
				ProgressSet(40,"Checking Source Data...")
				Sleep(1000)
				If not FileExists($hmsource) Then
					myError("No Dir: " & $hmsource, "Does NOT exist")
				Else
					; -------------------------------------------------------
					; Copying
					; -------------------------------------------------------
					ProgressSet(50,"Copying relevant files...")
					Sleep(1000)
					myCopy("", "*.ini", 55)
					myCopy("", "*.hml", 60)
					myCopy("", "*.lst", 65)
					myCopy("_scripts\", "*.hms", 70)
					myCopy("_scripts\", "*.bat", 75)
					myCopy("_scripts\", "*.cmd", 80)
					myCopy("_scripts\", "*.exe", 85)
				EndIf ; EndIf: not FileExists($hmsource)
			EndIf ; EndIf: not $line = "The command completed successfully."
		EndIf ; EndIf: $file = -1 Then
		; -------------------------------------------------------------------
		; Close Connection
		; -------------------------------------------------------------------
		FileClose($file)
		mySleep()
	EndIf ; Endif: "Test Connection"
WEnd

; ###########################################################################
; Functions
; ###########################################################################

; ---------------------------------------------------------------------------
; Function to check the Connection to the Remote Computer
; ---------------------------------------------------------------------------
Func chkConnection($remotepc)
    If RunWait('cmd /c ping -n 1 -l 32 ' & $remotepc & _ 
		       '| find /n " " | find "[4]" | find "=32"', "", @SW_HIDE) == 0 Then
        Return 1
    Else
        Return 0
    EndIf
EndFunc

; ---------------------------------------------------------------------------
; Function to cut connection and go to sleep
; ---------------------------------------------------------------------------
Func mySleep()
	; Close Network session
	RunWait(@ComSpec & " /c " & "net use" & _ 
	        " \\" & $pc & "\IPC$ /delete", "", @SW_HIDE)
	ProgressSet(100, "Closing Connection...", "End of Task!")
	Sleep(5000)
	ProgressOff()
    ; Delete Temp File (FileDelete avoids spawning cmd.exe)
	If FileExists($tmp_file) Then
		FileDelete($tmp_file)
	EndIf
	Sleep($wait)
	;Exit
EndFunc

; ---------------------------------------------------------------------------
; Function for Error Management (no @CRLF in mainmsg!!!)
; ---------------------------------------------------------------------------
Func myError($mainmsg, $submsg)
	RunWait(@ComSpec & " /c " & "echo " & $mainmsg & _ 
	        " >>" & $logfile, "", @SW_HIDE)
	ProgressSet(100, $submsg, $mainmsg)
	Sleep(5000)
EndFunc

; ---------------------------------------------------------------------------
; Function for Copying
; ---------------------------------------------------------------------------
Func myCopy($folder, $filetype, $progressbar)
	ProgressSet($progressbar,"Start copying " & $filetype & " files...")
	RunWait(@ComSpec & " /c " & "xcopy " & $hmsource & $folder & $filetype & _ 
			" " & $hmdest & $folder & " /S /Y 1>>" & $logfile & " 2>&1", "", @SW_HIDE)
	If $filetype=="*.ini" or $filetype=="*.hml" or $filetype=="*.lst" Then
		If not FileExists($hmdest & $filetype) Then
			myError("No " & $filetype & " files found", "Files NOT copied")
		Else
			ProgressSet($progressbar,"Got all " & $filetype & " files...")
		    Sleep(1000)
		EndIf
	Else
		ProgressSet($progressbar,"Got all " & $filetype & " files...")
		Sleep(1000)
	EndIf
EndFunc

With the AutoIt development environment you can compile the script to an *.exe file. Starting and stopping services requires admin permissions, so it makes sense to run the compiled script as a Windows service. Running as a service, it runs under the "local system account", which has these permissions.
You can use RunAsSvc (http://www.pirmasoft.de/runassvc.php) to install the two *.exe files as services.
If you want to see what is happening, enable "Allow service to interact with desktop" for the services.

Okay, that's it.
Have a nice day...

ataudte