Yesterday, I installed APC PCNS (PowerChute Network Shutdown) on four VMware vSphere 4 hosts at a customer site, and one of the ESX hosts went wild.

For some reason, I couldn't log in, and that particular ESX host was disconnected from the vCenter Server for a few minutes. In addition, the console was flooded with weird error messages for a few minutes, as shown in the screenshot below.

Screenshot: vSphere 4 console

Similar messages also appeared in /var/log/messages, as shown below:

Aug 20 15:43:56 MYxxxxx02 kernel: [2992240.963580] [] default_wake_function+0x0/0xf
Aug 20 15:43:56 MYxxxxx02 kernel: [2992240.987718] [] :vmnixmod:VMnix64_SwitchPost+0x2c/0x30
Aug 20 15:43:56 MYxxxxx02 kernel: [2992241.011771] [] do_gettimeofday+0x3e/0x8b
Aug 20 15:43:56 MYxxxxx02 kernel: [2992241.035843] [] audit_syscall_entry+0x18b/0x1c1
Aug 20 15:43:56 MYxxxxx02 kernel: [2992241.059990] [] compat_sys_futex+0x101/0x121
Aug 20 15:43:56 MYxxxxx02 kernel: [2992241.084052] [] ia32_sysret+0x0/0x5
Aug 20 15:43:56 MYxxxxx02 kernel: [2992241.108200]
Aug 20 15:43:56 MYxxxxx02 kernel: [2992241.132297] ftProcMon S 000aa157ab2d703d 5720 7303 7052 7304 7300 (NOTLB)
Aug 20 15:43:56 MYxxxxx02 kernel: [2992241.180762] ffff81000e85bcf8 0000000000000086 0000000000000000 0000000000000000
Aug 20 15:43:56 MYxxxxx02 kernel: [2992241.229104] ffff81000ee72af0 ffff81000f3d4280 000000000000d9df ffff81000ee72af0
Aug 20 15:43:56 MYxxxxx02 kernel: [2992241.277503] 0000000000000000 ffff81000e85bef8 ffff81000ee72af0 ffff81000e85bdd0
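
With this many messages, it can help to pull out only the function names from the call trace lines. The following is a rough sketch in Python, not something that was run on the ESX host during this incident. The regular expression and the names `TRACE_LINE` and `extract_trace_symbols` are my own assumptions, based only on the line format in the excerpt above.

```python
import re

# Assumed line format, taken from the excerpt above:
#   "<date> <host> kernel: [<uptime>] [<address or empty>] <symbol>+0x<off>/0x<size>"
# An optional ":<module>:" prefix may appear before the symbol.
TRACE_LINE = re.compile(
    r"kernel: \[\s*(?P<uptime>\d+\.\d+)\]\s+"   # kernel timestamp in seconds
    r"(?:\[[^\]]*\]\s+)?"                        # bracketed address (may be empty)
    r"(?::(?P<module>\w+):)?"                    # optional module name
    r"(?P<symbol>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)


def extract_trace_symbols(lines):
    """Return (uptime, module, symbol) for every call trace line.

    Lines that do not look like a stack frame (register dumps,
    task headers, blank kernel lines) are skipped.
    """
    frames = []
    for line in lines:
        match = TRACE_LINE.search(line)
        if match:
            frames.append(
                (float(match.group("uptime")),
                 match.group("module"),      # None when no module is given
                 match.group("symbol"))
            )
    return frames


# A few lines copied from the excerpt above, used as sample input.
SAMPLE = [
    "Aug 20 15:43:56 MYxxxxx02 kernel: [2992240.963580] [] default_wake_function+0x0/0xf",
    "Aug 20 15:43:56 MYxxxxx02 kernel: [2992240.987718] [] :vmnixmod:VMnix64_SwitchPost+0x2c/0x30",
    "Aug 20 15:43:56 MYxxxxx02 kernel: [2992241.108200]",
    "Aug 20 15:43:56 MYxxxxx02 kernel: [2992241.180762] ffff81000e85bcf8 0000000000000086 0000000000000000 0000000000000000",
]

if __name__ == "__main__":
    for uptime, module, symbol in extract_trace_symbols(SAMPLE):
        print(uptime, module or "-", symbol)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `extract_trace_symbols(open("/var/log/messages"))`. The pattern will likely need adjusting if your log lines differ from the ones shown here.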

Something must be wrong with the vSphere 4 kernel, so I have submitted an incident report to the VMware technical team. I'm still waiting for the official response.

Stay tuned…