I spent the past week in marathon troubleshooting with a client, trying to figure out why a cluster failed in place and could not easily be rebuilt after one of its nodes was evicted. We re-imaged both nodes and rebuilt from scratch, followed the best practices exactly, and ran cluster validation without any errors. Yet whenever we tried to form the cluster, the system kept returning an unknown error, and the logs did not give much information. It just kept telling us that the node was not reachable or was unauthorized due to security settings.

After a few nights of troubleshooting, I was running out of clues. Then, by accident, I searched for the computer name in AD under the user objects, and I found that a user account had been created in AD with the same name as the one we had defined for the cluster. I wondered whether this could be confusing the system. So I suggested removing the user account temporarily, since it was not in use at the moment, and trying to form the cluster again. Guess what: the cluster formed as it should in less than a minute. We were so happy to end the nightly troubleshooting marathon, and we were also very pissed off with the bug we had run into.
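
For anyone who wants to check for this kind of name clash before forming a cluster, here is a minimal sketch in Python of the LDAP search filter I would use. It only builds the filter string; you still have to run it with whatever LDAP search tool or library you prefer. The function names and the cluster name `CLUSTER01` are made up for illustration, and the filter assumes that a user account normally carries the plain name in `sAMAccountName` while a computer object carries the name followed by `$`.

```python
def escape_ldap_value(value: str) -> str:
    """Escape characters that are special inside an LDAP search filter (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def name_collision_filter(cluster_name: str) -> str:
    """Build a filter matching any AD object that could clash with the cluster name.

    Hypothetical helper: it matches objects whose sAMAccountName is the plain
    name (typical for a user), the name plus '$' (typical for a computer),
    or whose cn equals the name.
    """
    name = escape_ldap_value(cluster_name)
    return f"(|(sAMAccountName={name})(sAMAccountName={name}$)(cn={name}))"


if __name__ == "__main__":
    # CLUSTER01 is a placeholder; replace it with the planned cluster name.
    print(name_collision_filter("CLUSTER01"))
```

If a search with this filter returns anything other than the computer object you expect, it is probably worth looking at that object before you try to form the cluster.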

I am not sure what the real reason behind this is, but this is a real case that took us a few days to figure out. I think Microsoft should seriously look into this problem, as it sounds stupid to still have this bug today. A user name and a computer object name are not the same thing in AD, so how can the system confuse them? If this is unavoidable, they should put it into the documentation or a checklist to remind users about it. My personal comment on Windows 2008 cluster technology: it does not make the administrator's life simpler, and it adds too much dependency on Microsoft AD. Please note that this problem happens on both Windows 2008 and Windows 2008 R2.