VMWare - ESXi monitoring

doodleman99
Posts: 38
Joined: Tue Sep 02, 2008 5:45 am

Post by doodleman99 »

I'm a great fan of HostMonitor and have implemented it at several client sites, but with VMWare dominating every infrastructure I come across, I'm struggling to pitch it as a solution to my boss due to the lack of hardware monitoring.

I've asked for support in the forum before and have read a few other threads, but it seems you resign yourselves to defeat every time, stating that you are not VMWare experts and that nothing can be done.

I really don't believe that you couldn't spend a little time looking into this and produce a couple of health check tests. Any hardware failure must be detectable and, although it's easy for me to say, it can't be that hard, can it?

All the best,
JV
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA
Contact:

Post by KS-Soft »

We were able to check VMWare using the SOAP test method.
Which parameter exactly do you want to monitor?

Regards
Alex
doodleman99
Posts: 38
Joined: Tue Sep 02, 2008 5:45 am

Post by doodleman99 »

To check for hardware failures/alarms would be nice.
Secondary to that, standard CPU & Memory resource checks to monitor stress levels.
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA
Contact:

Post by KS-Soft »

There are a hundred classes, but not all of them are actually available; it may depend on your server hardware.
http://www.vmware.com/support/developer ... apirefdoc/

At least CIM_Processor and CIM_Memory provide HealthState on our system.
Indicates the current health of the element. This attribute expresses the health of this element but not necessarily that of its subcomponents. The possible values are 0 to 30, where 5 means the element is entirely healthy and 30 means the element is completely non-functional. The following continuum is defined:
"Non-recoverable Error" (30) - The element has completely failed, and recovery is not possible. All functionality provided by this element has been lost.
"Critical Failure" (25) - The element is non-functional and recovery might not be possible.
"Major Failure" (20) - The element is failing. It is possible that some or all of the functionality of this component is degraded or not working.
"Minor Failure" (15) - All functionality is available but some might be degraded.
"Degraded/Warning" (10) - The element is in working order and all functionality is provided. However, the element is not working to the best of its abilities. For example, the element might not be operating at optimal performance or it might be reporting recoverable errors.
"OK" (5) - The element is fully functional and is operating within normal operational parameters and without error.
"Unknown" (0) - The implementation cannot report on HealthState at this time.
DMTF has reserved the unused portion of the continuum for additional HealthStates in the future.
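For anyone post-processing the value returned by a SOAP test, the continuum above can be turned into a small lookup. This is only a sketch: the names describe_health_state and is_healthy are made up for illustration and are not part of HostMonitor or the VMWare API.

```python
# HealthState values as listed in the description quoted above.
HEALTH_STATES = {
    0: "Unknown",
    5: "OK",
    10: "Degraded/Warning",
    15: "Minor Failure",
    20: "Major Failure",
    25: "Critical Failure",
    30: "Non-recoverable Error",
}


def describe_health_state(value: int) -> str:
    """Return the name for a HealthState value (hypothetical helper)."""
    # Values not in the table belong to the range DMTF keeps reserved.
    return HEALTH_STATES.get(value, f"Reserved ({value})")


def is_healthy(value: int) -> bool:
    """Only 5 ("OK") counts as healthy; 0 means the state is unknown."""
    return value == 5


for state in (5, 10, 30, 0, 12):
    print(state, describe_health_state(state), is_healthy(state))
```

Treating 0 as "not healthy" is a design choice: an element that cannot report its state is probably worth a look, but you may prefer to handle it as a separate "unknown" status.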
>Secondary to that, standard CPU & Memory resource checks to monitor stress levels.

HostMonitor offers CPU Usage and Memory test methods. You may check host or virtual systems.

Regards
Alex
doodleman99
Posts: 38
Joined: Tue Sep 02, 2008 5:45 am

Post by doodleman99 »

>There are a hundred classes, but not all of them are actually available; it may depend on your server hardware.
>http://www.vmware.com/support/developer ... apirefdoc/
>
>At least CIM_Processor and CIM_Memory provide HealthState on our system.
I have managed to get a few of those working, although, as you said, not all of them.
>HostMonitor offers CPU Usage and Memory test methods. You may check host or virtual systems
Yes, checking the virtual machines is fine, but the ESXi boxes aren't running Windows, so your CPU and Memory tests won't work.
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA
Contact:

Post by KS-Soft »

Well, VMWare said the latest ESXi does not support SNMP, but unofficially you can still enable the SNMP agent (edit the /etc/vmware/snmp.xml file) and use the Memory test to check Physical Memory.

But you are right, we should increase priority of VMWare related tasks because of ESXi...

Regards
Alex
mrw
Posts: 195
Joined: Mon Oct 08, 2012 6:11 am

Post by mrw »

Hi,
May I ask which SNMP OID I can use to get free/used/total physical memory via SNMP on an ESXi host?
I have not managed to find it, so if you have found it, please let me know which OID it is.

And on our ESXi server the only thing I can monitor is "Overall HealthState" using SOAP. And the test itself sucks because all I can check is whether the reply is not equal to 5. And when that happens I don't get any information at all about what the exact problem is.

So VMware should really implement SNMP to allow more standard tests like CPU/disks/vdisks/RAIDs and other more specific tests.
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA
Contact:

Post by KS-Soft »

>May I ask which SNMP OID I can use to get free/used/total physical memory via SNMP on an ESXi host?
There is the Memory test method; it will find the OID itself (it may use different counters depending on the OS).
>So VMware should really implement SNMP to allow more standard tests like CPU/disks/vdisks/RAIDs and other more specific tests.
They implemented SNMP in old versions of VMWare; later they dropped SNMP support and implemented an API that does not provide useful information :(

Regards
Alex
mrw
Posts: 195
Joined: Mon Oct 08, 2012 6:11 am

Post by mrw »

The "Memory Test" doesn't work against my ESXi hosts.
SNMP is on and working and I can get a few values from it, but the "Memory Test" doesn't get any reply.
Any ideas why? It's set to use SNMP and the same credentials as my other SNMP queries.
KS-Soft Europe
Posts: 2832
Joined: Tue May 16, 2006 4:41 am
Contact:

Post by KS-Soft Europe »

Which ESXi version do you use?
mrw
Posts: 195
Joined: Mon Oct 08, 2012 6:11 am

Post by mrw »

Several different ones, but at least 5.0.0 and 5.1.0.
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA
Contact:

Post by KS-Soft »

The physical memory check works on our ESXi 5.1.0 Build 799733.
Can you check the following OIDs?
1.3.6.1.4.1.2021.4.5.0
1.3.6.1.4.1.2021.4.6.0
1.3.6.1.2.1.25.2.3.1.4.6
1.3.6.1.2.1.25.2.3.1.5.6
1.3.6.1.2.1.25.2.3.1.6.6
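For reference, the last three OIDs are the HOST-RESOURCES-MIB columns hrStorageAllocationUnits (.4), hrStorageSize (.5) and hrStorageUsed (.6) for row index 6, so size and used are counted in allocation units, not bytes. A rough sketch of the conversion follows. Fetching the values over SNMP is left out, storage_usage is a made-up helper name, and the sample numbers are invented for illustration.

```python
from typing import NamedTuple


class StorageUsage(NamedTuple):
    total_bytes: int
    used_bytes: int
    free_bytes: int
    used_percent: float


def storage_usage(allocation_units: int, size: int, used: int) -> StorageUsage:
    """Convert one hrStorage row into byte counts (hypothetical helper).

    allocation_units: hrStorageAllocationUnits, bytes per unit
    size:             hrStorageSize, total size in allocation units
    used:             hrStorageUsed, used amount in allocation units
    """
    total_bytes = allocation_units * size
    used_bytes = allocation_units * used
    # Guard against an empty row so we never divide by zero.
    used_percent = 100.0 * used / size if size > 0 else 0.0
    return StorageUsage(total_bytes, used_bytes, total_bytes - used_bytes, used_percent)


# Invented sample: 1024-byte units, 16 GiB total, 4 GiB used.
sample = storage_usage(1024, 16 * 1024 * 1024, 4 * 1024 * 1024)
print(sample)
```

Whether row 6 is memory or a disk volume on a given host depends on what its hrStorageDescr entry says, so the index should be checked per host.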

Regards
Alex
mrw
Posts: 195
Joined: Mon Oct 08, 2012 6:11 am

Post by mrw »

These OIDs don't work on any of the ESXi servers:
1.3.6.1.4.1.2021.4.5.0
1.3.6.1.4.1.2021.4.6.0

But all of these work on all ESXi servers:
1.3.6.1.2.1.25.2.3.1.4.6
1.3.6.1.2.1.25.2.3.1.5.6
1.3.6.1.2.1.25.2.3.1.6.6

But I already use those OIDs to get "Disk Space Usage" on all available datastores that the ESXi host can use, and I can't find a "disk" that would represent physical memory when I parse all the OIDs that give me a reply. How would you use those?

And I have got the "Memory Test" to work on 1 ESXi host of the 5 I have, but it uses the same VMware build as the rest. The only thing different about that specific host is the hardware. It's not a real "server" but more of a workstation. But I hope that the values that test gives me are correct and not "Disk space usage"?
KS-Soft
Posts: 13012
Joined: Wed Apr 03, 2002 6:00 pm
Location: USA
Contact:

Post by KS-Soft »

>And I have got the "Memory Test" to work on 1 ESXi host of the 5 I have, but it uses the same VMware build as the rest. The only thing different about that specific host is the hardware. It's not a real "server" but more of a workstation.
Sorry, we did not find out how to enable the memory counters. Try asking the VMWare support team...
>But I hope that the values that test gives me are correct and not "Disk space usage"?
HostMonitor checks the description of each counter within the hrStorageDescr branch; it will not use counters related to disk volumes. It looks for 'physical memory', 'real memory', 'memory buffers' counters...
So it should report correct information.
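To illustrate the idea of picking rows by description, here is a minimal sketch. It is not HostMonitor's actual code: find_memory_rows is a made-up name, the case-insensitive substring match is an assumption, and the sample descriptions are invented. The input is assumed to be a mapping from hrStorage row index to its hrStorageDescr text, already collected by an SNMP walk.

```python
from typing import Dict, List

# Keywords mentioned above; the matching rule itself is an assumption.
MEMORY_KEYWORDS = ("physical memory", "real memory", "memory buffers")


def find_memory_rows(descriptions: Dict[int, str]) -> List[int]:
    """Return the row indexes whose description looks memory-related."""
    return sorted(
        index
        for index, text in descriptions.items()
        if any(keyword in text.lower() for keyword in MEMORY_KEYWORDS)
    )


# Invented example of what a walk of hrStorageDescr might return.
sample_descriptions = {
    1: "/vmfs/volumes/datastore1",
    2: "/vmfs/volumes/datastore2",
    6: "Real Memory",
}
print(find_memory_rows(sample_descriptions))
```

If the result is empty on a host, that host simply does not expose a memory row in hrStorageDescr, which would match the "Memory Test gets no reply" symptom described earlier in the thread.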

Regards
Alex