If you have implemented VMware VM Monitoring (part of HA) and one of your virtual machines suddenly rebooted, you may not know why. Most often the cause is a Windows bluescreen, but you may want to see why VMware HA took the action to reboot the VM. The good news is that when you tick that little box to enable HA and enable Virtual Machine Monitoring, a screenshot of the console is saved each time the VM is rebooted by HA.

How is VM Monitoring different to Application HA?

One of the more exciting new features in vSphere 5.5 is Application HA, which comes with Enterprise Plus. However, it is important to understand the differences between HA VM Monitoring and the new App HA system.

  • App HA is licensed only with Enterprise Plus 5.5 and above. It uses Hyperic agents installed in a VM that report to a Hyperic server instance. Rules and policies defined in vCenter then monitor the status of specific applications such as Microsoft SQL, IIS, SharePoint, Apache, Tomcat and SpringSource, and take user-defined remedial action, including rebooting a VM.
  • HA VM Monitoring is a cluster-wide capability where every VM is monitored for heartbeats from VMware Tools (installed in every VM) at user-defined intervals. If there is no response from VMware Tools and no I/O from the VM, the VM is assumed to have failed, and vSphere will reboot it.

The important factors are that VM Monitoring with HA is cluster-wide (although you can configure per-VM settings) and monitors only whether the entire VM is responding. There is no understanding of the applications installed, and an assumption is made that if there is no I/O and VMtools is also not responding, the operating system has failed.
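To make that assumption concrete, here is a minimal Python sketch of the decision described above. It is not VMware's implementation: the `VmSample` type, the `should_reset` function and the 30-second `failure_interval` are hypothetical names and an illustrative value chosen for this example only.

```python
from dataclasses import dataclass


@dataclass
class VmSample:
    """One observation of a VM (hypothetical type, for illustration only)."""
    seconds_since_heartbeat: float  # time since the last VMtools heartbeat was seen
    io_seen_in_window: bool         # True if any I/O was observed in the check window


def should_reset(sample: VmSample, failure_interval: float = 30.0) -> bool:
    """Return True only when BOTH signals say the guest is dead.

    failure_interval is an illustrative value, not a documented default.
    """
    heartbeat_lost = sample.seconds_since_heartbeat >= failure_interval
    # A missing heartbeat alone is not enough: any I/O activity suggests
    # the guest is still alive, so no reset is triggered.
    return heartbeat_lost and not sample.io_seen_in_window


if __name__ == "__main__":
    print(should_reset(VmSample(45.0, False)))  # heartbeat lost, no I/O
    print(should_reset(VmSample(45.0, True)))   # heartbeat lost, but I/O seen
```

Note that nothing in this check knows about the application or the guest operating system. It only combines two coarse signals, which is exactly why a hung service inside an otherwise healthy VM goes unnoticed.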


The official documentation on VM Monitoring in vSphere HA is available from here.

This assumption is the big difference: vSphere HA does not understand the application or the operating system. It makes a judgement about what is going on and takes remedial action in the form of a reboot.

Why did my VM reboot?

So, if you have VM Monitoring turned on for your HA cluster, there may be times when it does its job and reboots a VM. You can see what was on the console of the VM from a screenshot taken by vSphere and placed in the same directory as the VMX file. Simply browse to your storage and download the PNG screenshot to see what nasty little error caused the VM to halt.
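If you would rather script the "where do I look" part, the sketch below may help. It assumes you already have the VM's VMX path in the usual `[datastore] folder/file.vmx` form and a listing of the file names in that folder, obtained however you normally browse the datastore. The helper names and the example file names are hypothetical, and nothing here talks to vCenter.

```python
import re
from typing import List, Tuple


def split_datastore_path(vmx_path: str) -> Tuple[str, str]:
    """Split '[datastore] folder/vm.vmx' into (datastore, folder)."""
    match = re.fullmatch(r"\[(?P<ds>[^\]]+)\]\s*(?P<rel>.+)", vmx_path.strip())
    if match is None:
        raise ValueError(f"not a datastore path: {vmx_path!r}")
    rel = match.group("rel")
    # Everything before the last '/' is the VM's folder; empty if the
    # VMX file sits at the root of the datastore.
    folder = rel.rsplit("/", 1)[0] if "/" in rel else ""
    return match.group("ds"), folder


def screenshots_in(file_names: List[str]) -> List[str]:
    """Keep only PNG files from a folder listing, sorted by name."""
    return sorted(name for name in file_names if name.lower().endswith(".png"))


if __name__ == "__main__":
    # Hypothetical VM and file names, for illustration only.
    print(split_datastore_path("[datastore1] web01/web01.vmx"))
    print(screenshots_in(["web01.vmx", "web01-2.png", "vmware.log", "web01-1.png"]))
```

The filter deliberately matches on the `.png` extension only, since the post does not say how the screenshot files are named.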

This is all detailed in KB 1027734.
