Check all HD volumes for free space (variable threshold)

If you have information, a script, a utility, or an idea that could be useful to the HostMonitor community, you are welcome to share it in this forum.
greyhat64
Posts: 246
Joined: Fri Mar 14, 2008 9:10 am
Location: USA

Post by greyhat64 »

Alex,
Good for you! I didn't realize you guys were 'listening in'.
As you can tell, it's nothing more than an elaborate WMI query.

Paul,
Can you elaborate on what led you to institute the "non-simultaneous execution" option?

Regards,
Joel
KS-Soft
Posts: 12869
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

BTW: HostMonitor always uses the "non-simultaneous test execution" option for the Active Script (not Shell Script) test method due to a problem with the Microsoft scripting engine.

Regards
Alex
Paul_NHS
Posts: 59
Joined: Wed Feb 25, 2009 6:17 am

Post by Paul_NHS »

greyhat64 wrote:1. I like what you've done, but what did you think of my idea to 'bury' the passing of 'Bad Argument' results in the ParseArg function?
Ah! I now see where you were going with that comment. I hadn't thought too much about it because the bad parameter test was a single line in the if statement, until I added the extra test on line 84. I will re-visit that one.
greyhat64 wrote:2. Good find on the "shell script" handles issue. I think I read somewhere that the actual limit is 6, but it probably starts choking before then.
I found this because we have 60 servers to check and we are still running the eval license - the monitor stops periodically and when you re-start it all the disk tests run at once. This led to about 30 servers timing out and reporting errors, so I changed the notification to run after 2 bad results and changed the folder properties - we already had the disk tests in a site folder.

I will check out Active Script and see if the test can run there, rather than Shell Script.
[Edit] The reason I didn't use Active Script is you cannot pass parameters to the test. Shell Script has a nice little "parameters" box. Maybe a feature request?

cheers, Paul
Paul_NHS
Posts: 59
Joined: Wed Feb 25, 2009 6:17 am

Post by Paul_NHS »

Joel,
I have posted the code on page 1 to save multiple pages of code.

The ParseArg function now does all of the checking and reporting, although it does make it a little more cumbersome having to drop through multiple if statements.

KS-Soft, I noticed that the "correct" response to enable Log Analyzer to work is "value space units", e.g. "10 GB". Does the %laststatus% - %status% test take this into account?

cheers, Paul
KS-Soft
Posts: 12869
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA

Post by KS-Soft »

Paul_NHS wrote:KS-Soft, I noticed that the "correct" response to enable Log Analyzer to work is "value space units", e.g. "10 GB". Does the %laststatus% - %status% test take this into account?
Sorry, I do not understand the question. The %Status% and %LastStatus% variables represent the status of the test, not the reply value. Status cannot be "10 GB"; it can be "Ok", "Bad", "Host is alive", "No answer".

If you are asking about the %Reply% variable (not %Status%) that you want to use in some expressions, then make sure you put quotation marks around the variable ("10 Gb" will be converted to a number, 10 Gb will not).

Regards
Alex
Paul_NHS
Posts: 59
Joined: Wed Feb 25, 2009 6:17 am

Post by Paul_NHS »

Sorry, my bad.
I can see now I can use %Reply% or %Reply_Number%.
I will modify the script to always return the value type, in line with the Log Analyzer requirements.

cheers, Paul
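
For anyone following along, a reply in the "value space units" form could be built with something like the sketch below. It assumes the scriptRes:Ok: prefix that appears in the scripts in this thread; FormatReply and the sample value are hypothetical names used only for illustration, not part of the script on page 1.

```vbscript
Option Explicit

' Hypothetical helper: builds a reply in the "value space units" form, e.g. "10 GB".
' Note that CStr uses the regional decimal separator of the machine running the script.
Function FormatReply(dblValue, strUnits)
    FormatReply = CStr(Round(dblValue, 2)) & " " & strUnits
End Function

' scriptRes:Ok: is the Shell Script result prefix used in this thread.
WScript.StdOut.Write "scriptRes:Ok:" & FormatReply(10, "GB")
```

Keeping the number and the unit separated by a single space is the only point of the helper; the rounding to two decimals is an arbitrary choice.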
greyhat64
Posts: 246
Joined: Fri Mar 14, 2008 9:10 am
Location: USA

Post by greyhat64 »

Paul,
Good idea editing your Page 1 entry. I may get rid of the clutter and edit my original post to point down to yours.

NOTE: Your Syntax leads one to believe you can enter "15%" as a threshold value. Because the percent sign is a special character you MUST enter it as "15%%" or ".15". Otherwise the "%" is ignored and the value is interpreted as a physical threshold of 15GB.
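
To make the two interpretations concrete, here is a minimal sketch of how a threshold argument might be classified as either a percentage or a size in GB. It is not the code from page 1: ParseThreshold and its parameters are hypothetical, and the sketch assumes a US-style decimal point. Whether the "%" reaches the script at all depends on how the command line is expanded, which this does not address.

```vbscript
Option Explicit

' Hypothetical sketch: interpret a threshold argument as a percentage or as GB.
' Returns True and fills dblValue / blnIsPercent on success, False on bad input.
' (VBScript passes arguments ByRef by default, so the caller sees both values.)
Function ParseThreshold(strArg, dblValue, blnIsPercent)
    Dim strNum
    ParseThreshold = False
    blnIsPercent = False
    strNum = Trim(strArg)

    If Right(strNum, 1) = "%" Then            ' "15%" is a percentage
        strNum = Left(strNum, Len(strNum) - 1)
        blnIsPercent = True
    End If
    If Not IsNumeric(strNum) Then Exit Function

    dblValue = CDbl(strNum)
    If Not blnIsPercent And dblValue > 0 And dblValue < 1 Then
        dblValue = dblValue * 100             ' ".15" is treated as a fraction
        blnIsPercent = True
    End If
    ParseThreshold = (dblValue > 0)
End Function

' Usage: a bare number such as "15" falls through as a size in GB.
Dim dblThreshold, blnPercent
If ParseThreshold("15%", dblThreshold, blnPercent) Then
    WScript.Echo "Value: " & dblThreshold & "  Percent: " & blnPercent
Else
    WScript.Echo "Bad Argument"
End If
```

If the "%" is stripped before the script sees it, the argument arrives as "15" and lands in the GB branch, which matches the behaviour described in the note.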

Of course your comment about the awkward If,Elseif,Else iterations got me to thinking. I've also been reading up on VBScript/WMI optimization techniques and I've got a concept that takes on both of those issues (I'd love for this to be a bit faster, wouldn't you?).

I'm still noodling on it, and I'll only post new code if it's a more elegant solution with a performance benefit. I'm slammed for the rest of this week, but stay tuned for future developments. :wink:
Paul_NHS
Posts: 59
Joined: Wed Feb 25, 2009 6:17 am

Post by Paul_NHS »

greyhat64 wrote:NOTE: Your Syntax leads one to believe you can enter "15%" as a threshold value.
This is the case with my test set up - maybe it is because I run a Shell Script??
Parameters: MHDC08 C:2 20%
Response: C:3.1(31%), D:71%(4.28), E:98%(57.49)

cheers, Paul
Paul_NHS
Posts: 59
Joined: Wed Feb 25, 2009 6:17 am

Post by Paul_NHS »

A little bit of code tidying in the parse function to remove superfluous "if ParseOK" tests.

cheers, Paul
greyhat64
Posts: 246
Joined: Fri Mar 14, 2008 9:10 am
Location: USA

Post by greyhat64 »

Good updates!
I'm still noodling out the whole %% thing. Maybe you're right about the shell script thing - I haven't had a chance to look too hard at it.

In truth, I'm having too much fun doing a total rewrite. :D
I've already found an interesting way to parse arguments that I should've thought of long before, not to mention a more streamlined method of error handling. Now I'm working on a different way to approach the five different 'Test types' which should reduce lines of code significantly, assuming I haven't forgotten something stupid, like error handling. :roll:
I'm almost ready, and I plan to 'flood test' my new script against 30-40 machines. I'll only post it (in place of the original code posted) if it performs better in a heavy load situation. We'll see. Right now I'm just having fun learning some new coding methods! 8)

Stay tuned.
Paul_NHS
Posts: 59
Joined: Wed Feb 25, 2009 6:17 am

Post by Paul_NHS »

Now that we have a string of disk tests running I have found a problem if the machine being queried has a wayward WMI service. In our example the WMI service seems to be hung and has to be killed before it can be re-started. This causes the disk test to hang and we now have dozens of "cscript" processes running on the HostMonitor server - most of these processes are days old. The workaround for the time being is to kill these "hung" processes, but doing this manually is painful. I have thrown together a bit of code to kill these processes, which can be run from a test in HM.

cheers, Paul
  1. Select the "Shell Script" test method.
  2. Click on the "Script Manager" icon.
  3. Select "New" for a new script.
  4. Name the script appropriately.
  5. Check the "Start Command" is set to: cmd /c cscript /B /E:VBScript %Script% %Params%
  6. Copy and paste the VBS file into the script box.
  7. Add a hint, e.g. No parameters required, change code to set values.

Code: Select all

Option Explicit
Dim objWMIService, objProcess, colProcess
Dim strComputer, strProcessKill, strHoursToLive, strProcess
Dim strMonth, strDay, strHour, strMinute, strSecond, strDate, strResult
Dim dDate
'## User defined values ##
strComputer = "."
strProcess = "cscript.exe"
strHoursToLive = 8
'## End user defined values ##

const statusAlive     ="scriptRes:Host is alive:"
const statusDead    ="scriptRes:No answer:"
const statusUnknown   ="scriptRes:Unknown:"
const statusUnknownHost ="scriptRes:Unknown host:"
const statusOk      ="scriptRes:Ok:"
const statusBad     ="scriptRes:Bad:"
const statusBadContents ="scriptRes:Bad contents:"

dDate = Now()

' Set date format to allow comparison
If Len(Month(dDate)) = 1 Then
	strMonth = 0 & Month(dDate)
Else
	strMonth = Month(dDate)
End If
If Len(Day(dDate)) = 1 Then
	strDay = 0 & Day(dDate)
Else
	strDay = Day(dDate)
End If
If Len(Hour(dDate)) = 1 Then
	strHour = 0 & Hour(dDate)
Else
	strHour = Hour(dDate)
End If
If Len(Minute(dDate)) = 1 Then
	strMinute = 0 & Minute(dDate)
Else
	strMinute = Minute(dDate)
End If
If Len(Second(dDate)) = 1 Then
	strSecond = 0 & Second(dDate)
Else
	strSecond = Second(dDate)
End If
strDate = Year(dDate) & strMonth & strDay & strHour & strMinute & strSecond

Set objWMIService = GetObject("winmgmts:" _
& "{impersonationLevel=impersonate}!\\" _
& strComputer & "\root\cimv2")

Set colProcess = objWMIService.ExecQuery ("Select * from Win32_Process Where Caption='" & strProcess & "'")
For Each objProcess in colProcess
	if ConvertDateToNumber(strDate) - ConvertDateToNumber(Left(objProcess.CreationDate,14)) _
	> strHoursToLive/24 Then 'Process has been running for over X hours
		objProcess.Terminate()
		strResult = strResult & "Terminated process " & objProcess.Name & ", PID," & objProcess.Handle & " "
	End If
Next
if strResult = "" then strResult = "No processes to kill"
strResult = statusOk & strResult
wscript.stdout.write strResult

' ########## End of script ###########

Function ConvertDateToNumber(strConvert)
' Converts a string in the form yyyymmddhhmmss to a date serial number
' (whole days plus the time of day as a decimal fraction).
' Two results can be subtracted to give an elapsed time in days. Using
' DateSerial/TimeSerial keeps the result correct across month ends and
' avoids string concatenation when the hour, minute or second is "00".
	Dim dtDay, dtTime

	dtDay = DateSerial(CInt(Mid(strConvert, 1, 4)), CInt(Mid(strConvert, 5, 2)), CInt(Mid(strConvert, 7, 2)))
	dtTime = TimeSerial(CInt(Mid(strConvert, 9, 2)), CInt(Mid(strConvert, 11, 2)), CInt(Mid(strConvert, 13, 2)))

	ConvertDateToNumber = CDbl(dtDay + dtTime)
End Function 'ConvertDateToNumber
[Edit 23/7/2009] Error in Month checking. "str = Month(dDate)" should be "strMonth = Month(dDate)".
Not a problem until October. :oops:
Last edited by Paul_NHS on Mon Aug 03, 2009 7:51 am, edited 2 times in total.
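
As an aside on the date handling, WMI ships a helper object that converts its timestamp strings directly. The sketch below is an alternative under stated assumptions, not what the script above does: it assumes WbemScripting.SWbemDateTime is available on the HostMonitor server, and the sample timestamp is made up for illustration.

```vbscript
Option Explicit
Dim objDateTime, dtCreated

' SWbemDateTime converts WMI timestamp strings to VBScript dates.
Set objDateTime = CreateObject("WbemScripting.SWbemDateTime")

' Sample value for illustration; in the script this would be objProcess.CreationDate.
objDateTime.Value = "20090723101530.000000+060"
dtCreated = objDateTime.GetVarDate(True)   ' True = return local time

' DateDiff in minutes, divided by 60, gives the age in (fractional) hours.
WScript.Echo "Age in hours: " & DateDiff("n", dtCreated, Now()) / 60
```

With a real date in hand, the age test becomes a single DateDiff comparison against the hours-to-live value.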
greyhat64
Posts: 246
Joined: Fri Mar 14, 2008 9:10 am
Location: USA

Post by greyhat64 »

Paul,
A nice solution to a sticky problem!
Two thoughts:
1. Maybe a master/dependent relationship should be made between this script and any WMI-dependent shell script.
2. Or you could make this a called function of any WMI script you develop.

On a bigger scale, shouldn't AHM either:
  (a) Timeout: Maybe the Shell Script method should have a default timeout which flags any unresponsive script as 'Unknown'.
  (b) AutoCorrect: Identify when a system has become unresponsive to WMI (or any other) 'shell' request and launch a specified corrective action.
  Alex, maybe you would like to chime in here :)

Any clue as to why WMI is hanging in the first place? And don't tell me it's because it's a M$ API - we all know that :lol:
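
In the meantime, a script could enforce its own time limit. The following is only a rough sketch of idea (a), assuming a small wrapper script launches the real test as a child process: disktest.vbs, the 60 second limit and the host name argument are placeholders, and the scriptRes:Unknown: prefix is the one used by the scripts in this thread.

```vbscript
Option Explicit
Const TIMEOUT_SECONDS = 60          ' assumed limit, adjust to taste
Const POLL_MS = 500                 ' how often to check on the child

Dim objShell, objExec, dtStart
Set objShell = CreateObject("WScript.Shell")

' Hypothetical child script and argument; substitute the real WMI test.
Set objExec = objShell.Exec("cscript //NoLogo disktest.vbs MHDC08")

dtStart = Now()
Do While objExec.Status = 0         ' 0 = child is still running
    If DateDiff("s", dtStart, Now()) >= TIMEOUT_SECONDS Then
        objExec.Terminate           ' kill the child so it cannot pile up
        WScript.StdOut.Write "scriptRes:Unknown:Timed out after " & TIMEOUT_SECONDS & " seconds"
        WScript.Quit
    End If
    WScript.Sleep POLL_MS
Loop

' Child finished in time: pass its output straight through.
WScript.StdOut.Write objExec.StdOut.ReadAll
```

This only suits children that write a short result line; a child that produces a lot of output could block on a full pipe before the loop ever sees it finish.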
Paul_NHS
Posts: 59
Joined: Wed Feb 25, 2009 6:17 am

Post by Paul_NHS »

Joel,

1. Yes, but it was a quick and dirty....
2. If the WMI script stalls I don't think it will self clean, so it may need to be left stand alone.

(a) We do get "Unknown" in response to the timeout, which I have modified to turn into a warning status.
('%SuggestedStatus%'=='Unknown') and ((%SuggestedRecurrences%==4) or ('%Status%'=='Warning'))
(b) See my solution to (a) :)

The problem seems to be a stalled SMS Agent Host service (ccmexec.exe). I have to kill both processes before either will re-start. Should have some time to work on it next week.

cheers, Paul
greyhat64
Posts: 246
Joined: Fri Mar 14, 2008 9:10 am
Location: USA

Post by greyhat64 »

2. True, unless you call the 'clean up' function before the WMI call. The problem with this 'solution' is that it's frequently an unnecessary call, just taking up clock cycles. I'm not a fan of unnecessary calls.
(a) So, if an 'Unknown' status is returned, you should be able to create a test using your Shell Script and make it a dependent of all other WMI shell script tests with the "Perform test when Master test has a 'Dead' or 'Unknown' status" criteria.
On second thought, the question then is how to pass the hostname/IP of the unresponsive host to your script. UGH!

As for problems between the SMS Agent Host service and WMI, there are a number of references to problems stemming from M$ updates and certain antivirus products. Good luck!
jivetolkein
Posts: 96
Joined: Thu Jul 19, 2007 4:35 am

Post by jivetolkein »

This is some great work here :-D

I never looked any further into this, as HM was originally an interim solution (which has now grown over 700 nodes ^^), each of our sites had agreed SLTs, and the volume level warnings were set here (so I'd no mileage in individual volume thresholds, other than ignoring some), and the output goes into Oracle for logging and reporting purposes, so I didn't think about tying it in with the report generator. I was going to revisit this sometime after the immediate fires had been damped down a bit.

Now I won't have to, cheers! :-D

Had come across the cscript issues too, though not too badly as we tend to use two RMAs per site for Windows servers - the most common issue was servers being decommissioned locally without any warning, so a backlog of checks built up, each on its own RMA agent server. We had some local checks and kills running on the depots, but I'd never thought of putting it in HM, don't know why!?

Anyway, my main point of posting is - who thinks this should now move to a feature request? Move Paul's/greyhat's work into an official HM test? I like scripts, but I think this is a core requirement for most Windows shops, one that people may be missing if they don't follow the forums, and they may be discounting the product without investigating thoroughly enough.