Today I ran into a hung VM in our production environment that could not be reset, powered off, or removed from the ESX farm. We naturally started troubleshooting from VirtualCenter, which did not work. Next, I used the vmware-cmd command to issue a stop and then a hard stop, and that did not work either. I also restarted the management service on the ESX host. After that, the VM showed as powered off, but esxtop still showed it as running. I then tried to register the VM on another host and power it on, but that failed because the original ESX host was still holding the resources of the problem VM.
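The escalation order described above can be written down as a small sketch. It only builds the command lines and does not run anything. The .vmx path is a hypothetical placeholder, and "service mgmt-vmware restart" is my assumption for how the management service is restarted on a classic ESX 3.x service console; check both against your own host before using them.

```python
from typing import List


def escalation_plan(vmx_path: str) -> List[List[str]]:
    """Return the commands in the order tried in the post, without running them.

    vmx_path is the path to the VM's .vmx config file (hypothetical here).
    """
    return [
        # 1. Normal stop request for the VM.
        ["vmware-cmd", vmx_path, "stop"],
        # 2. Hard stop when the normal stop has no effect.
        ["vmware-cmd", vmx_path, "stop", "hard"],
        # 3. Restart the host management service (assumed ESX 3.x command).
        ["service", "mgmt-vmware", "restart"],
    ]


if __name__ == "__main__":
    # Hypothetical datastore path, for illustration only.
    for cmd in escalation_plan("/vmfs/volumes/datastore1/myvm/myvm.vmx"):
        print(" ".join(cmd))
```

Printing the plan first, rather than executing it, makes it easy to review each step on a production host before typing it in by hand.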
To improve processor compatibility during VMotion, a recent VMware release added the Enhanced VMotion Compatibility (EVC) option for both Intel and AMD. It lets you VMotion your VMs between hosts from the same vendor even when they are not running the same processor family. Previously you might have faced compatibility issues when trying to VMotion from the Intel 7 series to the 5 series processor family. With this new option, that problem goes away. Many thanks to VMware, Intel, and AMD for the effort that made this possible and made our lives easier.
Here I would like to share how I enabled it without interrupting my existing production environment. By default, the feature requires you to power off all your VMs before you can enable it. Here is a workaround you may want to try. Most of us run critical applications on our VMs, so keeping downtime to a minimum matters.
To get Enhanced VMotion Compatibility (EVC) working, you first need to create a new cluster. This cluster does not need HA/DRS enabled, because its only purpose is to let you move the existing production VMs onto temporary servers. At the same time, set up one or two temporary ESX servers with the evaluation edition, configured similarly to your production ESX hosts. This allows VMotion between the ESX hosts.
Once those are ready, start VMotioning all the VMs from the production ESX hosts to the newly built temporary ESX hosts, until your existing production ESX cluster has no powered-on, running VMs left.
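Before starting the migrations, it can help to write the evacuation down as a plan. The sketch below is only an illustration: the host and VM names are hypothetical, and spreading the VMs round-robin across the temporary hosts is my own assumption, not something the procedure requires. It does not check capacity and does not talk to VirtualCenter; it only produces a list to work through.

```python
from itertools import cycle
from typing import Dict, List, Tuple


def plan_evacuation(
    vms_by_host: Dict[str, List[str]], temp_hosts: List[str]
) -> List[Tuple[str, str, str]]:
    """Return (vm, source_host, target_host) moves that empty every production host.

    VMs are assigned to the temporary hosts in round-robin order (an assumption).
    """
    if not temp_hosts:
        raise ValueError("at least one temporary ESX host is required")
    targets = cycle(temp_hosts)
    plan = []
    for source in sorted(vms_by_host):
        for vm in vms_by_host[source]:
            plan.append((vm, source, next(targets)))
    return plan


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    production = {"prod-esx1": ["vm-a", "vm-b"], "prod-esx2": ["vm-c"]}
    for vm, source, target in plan_evacuation(production, ["tmp-esx1", "tmp-esx2"]):
        print(f"VMotion {vm}: {source} -> {target}")
```

The production cluster is ready for the next step once every move in the plan has completed and no powered-on VM remains on it.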
After a couple of months of patching our ESX hosts and VM guests with Update Manager, here is my review of VMware's Update Manager.
Update Manager has simplified life for system engineers who manage a VM farm with a large number of VM guests and ESX hosts that need frequent patching. Before Update Manager was released, we mostly patched servers using satellite servers, Altiris, SMS, and other patching tools. Deploying those on the VM guests or ESX hosts incurs additional cost because of the vendors' licensing agreements.
Update Manager is fully compatible with VMware ESX patch updates for ESX 3.0, 3.5, and ESX 3i. At the host level, the Update Manager scheduled task downloads all patches once VMware officially releases them. Update Manager also integrates well with Microsoft patches and with patches for other well-known software such as Red Hat and Adobe. It even lets us patch the template images we store for deployment, without manually converting the template back to a virtual machine. If you patch a Windows 2003 template image, the entire process is fully automated, which is really impressive. I also patched my DR servers, which are 30 miles from my main data center over a 30 Mb MPLS WAN link, and it worked without any issue, although patching naturally took slightly longer because of the location of the DR servers.
To get Update Manager deployed in your environment, there are a couple of things you may need to configure before it is fully functional.
A dedicated database for Update Manager in SQL or Oracle, depending on which database server you use. This database stores all the information and patches used for patching.
If you have a proxy server in your environment, configure the proxy address and port number in the VirtualCenter configuration for Update Manager.
A scheduled task to refresh and check for the latest patches released on the official site. I recommend running it at least once a week. I schedule mine weekly, to make sure I get the latest patches whenever I patch a VM guest or ESX host.
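The items above can be turned into a quick pre-deployment check. This is only a sketch under my own assumptions: the UpdateManagerSetup class and its field names are hypothetical and are not part of any VMware API. It simply records what you plan to configure and reports what is still missing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UpdateManagerSetup:
    """Hypothetical record of the planned Update Manager configuration."""
    database: Optional[str]                      # "sql" or "oracle"
    behind_proxy: bool                           # True if a proxy server is in use
    proxy_host: Optional[str] = None
    proxy_port: Optional[int] = None
    refresh_interval_days: Optional[int] = None  # how often patches are refreshed


def missing_prerequisites(setup: UpdateManagerSetup) -> List[str]:
    """Return the prerequisites from the list above that are not yet satisfied."""
    problems = []
    if setup.database not in ("sql", "oracle"):
        problems.append("dedicated SQL or Oracle database")
    if setup.behind_proxy and not (setup.proxy_host and setup.proxy_port):
        problems.append("proxy address and port")
    if setup.refresh_interval_days is None or setup.refresh_interval_days > 7:
        problems.append("patch refresh scheduled at least weekly")
    return problems


if __name__ == "__main__":
    # Example values, for illustration only.
    planned = UpdateManagerSetup(database="sql", behind_proxy=True,
                                 refresh_interval_days=7)
    print(missing_prerequisites(planned))
```

An empty result means all three prerequisites from the list are covered in the plan.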