Stateless Computing from Cisco UCS
Posted by craig
- on February 5th, 2010 in Data Center, Hardware, Server

I have been busy recently with a Cisco UCS demo involving VMware and EMC. One of the key features I tested was Cisco's stateless computing claim. As server or system guys, we all know how important it is to preserve the system state (UUID, firmware, BIOS, LAN and SAN settings) across a hardware change. Honestly speaking, last week's demo was my first experience with stateless computing, and the outcome was impressive.
To enjoy the benefits of stateless computing, you must ensure that the servers boot from SAN. During the demo we simulated a blade failure, re-associated the service profile with another available blade within the UCS cluster, and powered it on. The server came back online with all the necessary system state intact, and no re-configuration was required, including LAN, SAN, hostname, etc. The entire process took less than 20 minutes. We ran a few rounds with different blades but the same images and profiles, and it worked exactly as promised. Stateless computing is a key feature meant to help IT administrators minimize system downtime in the event of hardware failure or system compatibility issues.
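To make the idea concrete, here is a minimal Python sketch of what re-association means conceptually. It is purely illustrative: the class names, fields and the `reassociate` function are my own hypothetical model, not the UCS Manager API, and the identity values are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServiceProfile:
    """Hypothetical model of the identity that normally lives on the hardware."""
    name: str
    uuid: str
    mac: str
    wwnn: str
    boot_lun: str  # SAN boot target, so no OS state is stored on the blade


@dataclass
class Blade:
    slot: int
    healthy: bool = True
    profile: Optional[ServiceProfile] = None


def reassociate(profile: ServiceProfile, blades: List[Blade]) -> Blade:
    """Detach the profile from its current blade, attach it to a free healthy one."""
    for blade in blades:
        if blade.profile == profile:
            blade.profile = None
    for blade in blades:
        if blade.healthy and blade.profile is None:
            blade.profile = profile
            return blade
    raise RuntimeError("no spare blade available")


# Made-up identity values, for illustration only.
esx01 = ServiceProfile("esx01", "uuid-0001", "00:25:B5:00:00:01",
                       "20:00:00:25:B5:00:00:01", "lun-0")
chassis = [Blade(1, profile=esx01), Blade(2), Blade(3)]

chassis[0].healthy = False           # simulate the blade failure
target = reassociate(esx01, chassis)
print(target.slot, target.profile.uuid)  # prints: 2 uuid-0001
```

The point of the sketch is that the UUID, MAC and WWNN travel with the profile rather than with the blade, which is why the OS booting from SAN sees the same "machine" on the new hardware.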
Without stateless computing, if the current blade server suffers a hardware failure and the replacement hardware will only arrive within the next 4 hours, there is not much a system administrator can do to resume the production system in the meantime. With stateless computing in place, you can simply shut down any of your test and development systems and transfer the production service profile to that blade, bringing the production system back online in less than 20 minutes. This helps improve SLA compliance and system uptime in the event of hardware failure.
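The decision described above can be sketched in a few lines of Python. Again this is only a rough illustration under my own assumptions: the `Blade` class, the `tier` label and `restore_production` are hypothetical names, and in UCS this step is performed by the administrator, not by code like this.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Blade:
    slot: int
    healthy: bool = True
    profile: Optional[str] = None  # name of the associated service profile
    tier: Optional[str] = None     # hypothetical label: "prod" or "dev"


def restore_production(failed: Blade, blades: List[Blade]) -> Blade:
    """Move the failed blade's profile to an idle blade, else onto a dev blade."""
    profile = failed.profile
    failed.healthy = False
    failed.profile = None
    failed.tier = None

    # Prefer an idle blade; otherwise sacrifice a test/dev blade.
    candidates = [b for b in blades if b.healthy and b.profile is None]
    if not candidates:
        candidates = [b for b in blades if b.healthy and b.tier == "dev"]
    if not candidates:
        raise RuntimeError("no blade can be freed for production")

    target = candidates[0]
    target.profile = profile  # the dev workload on this blade is shut down
    target.tier = "prod"
    return target


blades = [
    Blade(1, profile="erp-prod", tier="prod"),
    Blade(2, profile="build-dev", tier="dev"),
]
target = restore_production(blades[0], blades)
print(target.slot, target.profile)  # prints: 2 erp-prod
```

Production comes back on slot 2 while the failed blade in slot 1 waits for its replacement part.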

7 Responses
I heard news that Cisco will support hardware iSCSI. Then that will also support stateless blades.
This would require a hardware iSCSI adapter that is able to boot directly from SAN storage. As long as that can be done, I do not see any reason why a stateless blade with hardware iSCSI couldn’t work.
Is this service profile failover automatic, or is it a manual process? For example, Egenera PAN Manager treats all blades as stateless, just like UCS, and they SAN boot. But the key difference is that when there is a blade failure, PAN Manager detects it and “automatically” repurposes that server profile onto a dedicated failover blade, or uses a blade from a pool. It can even shut down low-priority servers and use those processing resources. I don’t see Cisco UCS doing any of this?
It can be done exactly as you suggest above. I am not familiar with PAN Manager, but do take note that the UCS service profile also contains the UUID, BIOS, firmware, MAC and WWNN information, all of which you need to retain to be fully stateless. Some software vendors will detect the changes and retire the existing activated license key. In those cases, stateless becomes useless. Just my 2 cents.
Craig,
In UCS it’s not an automated process, it’s manual. You can deploy a Perl script, but it has limitations on what it can actually do. See https://supportforums.cisco.com/thread/2063174 where Cisco state “this feature of full blown blade/SP failover has not made it in the list of upcoming features.” and then go on to say “I will not go so far as to say it will be never, but its not in the immediate timeframe.” This was dated Jan 2012. So you’ll need to use the same old method of buying extra hardware and using clustering software to provide highly available applications, and you won’t get the full advantage of what stateless computing resources can truly offer.
See http://www.youtube.com/watch?v=-6xSI6kBfX0 and http://www.youtube.com/watch?v=aWau0r7jxHA for more details about what PAN does. Since the “state” is all in software and pushed out to a blade as and when it’s deployed, I don’t see how a software vendor would detect the changes and retire the existing activated license key. We’ve been using this approach for the last several years and have never run into a situation where licenses were an issue, even when we go from a blade with 4 cores to 8 cores, for example.
Cheers
Martin.
Thanks for sharing the great information here. I highly appreciate it, as I do not have any experience with PAN.
Sorry for my mistake. I double-checked with my counterpart, and the fail-over option is not automated. The last test we did was based on the pooling option suggested in the post, which I thought was automated.
As for the software license issue, my personal experience is with an ERP system that required the license to be regenerated when we migrated the machine state to different hardware, mainly due to the UUID change. If PAN can retain the hardware identity in a way similar to service profiles, then this should be fine.
Hi Craig,
Your point about UUIDs is a great one. PAN does migrate the UUIDs. All this data is stored in an XML configuration, which can be zipped up and migrated to another site for disaster recovery purposes.
This means your DR site is not just sitting there doing nothing: it can be used for Dev/Test, and your production site’s physical and virtual servers (which could be on a mixture of Xen, VMware, Hyper-V, etc.) can all be moved across using the “stateless” blade approach.
Obviously this depends on your production data being replicated to the DR facility, but the savings can be huge in terms of physical hardware cost, software licenses, etc. The best thing with this approach is you use the same DR process for physical and/or virtual servers. This is all thanks to stateless blades.
I love the site BTW, it’s a great resource for information!
Regards
Martin.