Today I ran into a hung VM in our production environment that could not be reset, powered off, or removed from the ESX farm. Naturally, we started troubleshooting from VirtualCenter, which didn't work. After that, I used the vmware-cmd command to issue a stop and then a stop hard, and that did not work either. Around the same time, I also restarted the management service on the ESX host. Once I had done that, the VM showed as powered off, but esxtop still showed it as running. I tried registering the VM on another host and powering it on there, but that failed because the original ESX host was still holding the resources for the problem VM.
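For reference, the vmware-cmd escalation I described (a plain stop, then a stop hard) can be sketched roughly as below. This is only a sketch under assumptions: the function names and the VMWARE_CMD variable are made up for illustration, and I am assuming that getstate prints a line containing "= off" once the VM is powered off. Check the output on your own ESX version before relying on it.

```shell
#!/bin/sh
# Sketch only. VMWARE_CMD is a hypothetical override so the logic can be
# dry-run away from a real ESX service console.
VMWARE_CMD="${VMWARE_CMD:-vmware-cmd}"

# Returns 0 if the VM reports a powered-off state.
# Assumption: getstate prints something like "getstate() = off".
vm_is_off() {
    "$VMWARE_CMD" "$1" getstate 2>/dev/null | grep -q '= off'
}

# Issue a plain stop first, then a stop hard if the VM is still not off.
# $1 is the full path to the VM's .vmx file.
stop_vm_escalating() {
    vmx="$1"

    "$VMWARE_CMD" "$vmx" stop >/dev/null 2>&1 || true
    if vm_is_off "$vmx"; then
        echo "stopped after: stop"
        return 0
    fi

    "$VMWARE_CMD" "$vmx" stop hard >/dev/null 2>&1 || true
    if vm_is_off "$vmx"; then
        echo "stopped after: stop hard"
        return 0
    fi

    echo "still not off after stop hard" >&2
    return 1
}
```

Calling stop_vm_escalating with the path to the .vmx file returns non-zero when both attempts fail, which was exactly my situation, so treat a failure here as the signal to move on to the next step.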

Since the two simple approaches didn't work, I had to go further with the kill -9 option. I ran ps -ef to look up the PID for the VM, and it showed -1 as the PID, which is abnormal.

grep VMNAME /proc/vmware/vm/*/*

This command shows the PID as well.

In the normal case, you can just run the command kill -9 (PID number).
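The PID lookup and kill for the normal case can be sketched as two small helpers. Both function names are hypothetical, and the SIGTERM-before-SIGKILL step is my own addition, not something the steps above require. The lookup reads saved ps -ef output on stdin and skips any PID that is not a positive integer, so an abnormal value like the -1 I saw is never passed to kill.

```shell
#!/bin/sh
# Hypothetical helper: print the PID (second column) of every `ps -ef`
# line that mentions the given VM name. Reads ps output on stdin.
# The header line and abnormal PIDs such as -1 are skipped because the
# second column must be all digits.
pids_for_vm() {
    vm_name="$1"
    awk -v name="$vm_name" '
        index($0, name) > 0 && $2 ~ /^[0-9]+$/ { print $2 }
    '
}

# Hypothetical helper: try a plain kill (SIGTERM) first, and fall back
# to kill -9 only if the process is still there two seconds later.
kill_escalating() {
    pid="$1"
    kill "$pid" 2>/dev/null || return 1
    sleep 2
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid"
    fi
}
```

To use it, save the process list first (ps -ef > /tmp/ps.out) and then run pids_for_vm VMNAME < /tmp/ps.out. Saving to a file keeps the lookup process itself out of the listing. If nothing is printed, there is no usable PID and kill is not going to help.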

In my case, that didn't work. The only option left was to run vm-support -X (VMID).

Be warned that this process is painful. The whole thing took me more than 25 minutes, and somehow the VM was still hung.

In the end, I VMotioned all the other VMs to the remaining hosts and rebooted the ESX host, which solved the problem and brought everything back to normal. Troubleshooting this took far too long, given that the VM is considered critical. I would not suggest spending too much time troubleshooting from the command line when there is a faster way to fix things.