Protecting your Cloud (vCloud & SRM)

So one of the BIG problems at the moment is that SRM does not fully support protecting your vCloud environment.
http://www.vmware.com/support/srm/srm-releasenotes-5-1-1.html#caveats

It supports protecting your management cluster (so the vCenter servers, vCD cells, vCNS manager, vCM, DBs, etc), but it doesn’t yet protect your resource cluster….. so all those VMs you’ve deployed in your organisations under vCD – well they’re not protected by SRM!

Definitely NOT COOL if your primary site goes tits up!!

From what I can gather, this is mainly due to the way SRM works. When you set up SRM for DR, you have to ‘pre-create’ resources at the recovery site so that the resources from the protected site can be mapped to them (stuff like resource pools, folders, networks, placeholder VMs). Unfortunately vCD likes to have full control of a resource cluster and manages all the resources itself, which basically means that the vCD cells are not aware of the objects that have been created at the recovery site for SRM. It doesn’t matter if the names are the same; what matters is that the Managed Object Reference IDs (MoRef IDs) have changed, and these are what vCD uses to construct its environment.

MoRef IDs are used to correlate objects between vCD and the underlying vSphere/vCenter layer. Any changes to these identifiers will result in the loss of functionality because vCD will not be able to manage these objects as it will not be aware of them (ie the MoRef IDs will not exist inside the vCD DB).
The use of SRM would result in a change of the MoRef ID on the vCenter Server layer, resulting in an incorrect reference in the vCD database – and so leaving the object (eg. a VM) unmanageable from a vCD perspective. I believe SRM also re-signatures the storage volumes which will also confuse vCD.
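To make the MoRef problem a bit more concrete, here is a tiny Python sketch. It is only an illustration: the dictionary standing in for the vCD database, the VM names, and the MoRef values are all made up, and it doesn't talk to any real vCD or vCenter API. It just shows that matching names don't help when the lookup key is the MoRef ID.

```python
# Hypothetical stand-in for the vCD database: objects are keyed by the
# MoRef ID that vCenter assigned at the protected site (values invented).
vcd_db = {
    "vm-101": "web01 (Org A vApp)",
    "vm-102": "db01 (Org A vApp)",
}

# Name -> MoRef as seen in each vCenter inventory (again, invented values).
protected_inventory = {"web01": "vm-101", "db01": "vm-102"}
# After a failover the recovered VMs keep their names but get new MoRef IDs.
recovery_inventory = {"web01": "vm-2045", "db01": "vm-2046"}


def unmanaged_vms(inventory, known_morefs):
    """Return the VM names whose MoRef ID is unknown to the vCD database."""
    return sorted(name for name, moref in inventory.items()
                  if moref not in known_morefs)


print(unmanaged_vms(protected_inventory, vcd_db))  # nothing is orphaned
print(unmanaged_vms(recovery_inventory, vcd_db))   # every recovered VM is orphaned
```

The names are identical in both inventories, yet every recovered VM falls out of the lookup, which is exactly the "unmanageable from a vCD perspective" situation described above.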

About a year ago Chris Colotti and Duncan Epping wrote an article on how vCloud DR could be achieved. This involved the clever idea of putting the resource ESXi hosts at the recovery site into the same resource cluster as the resource ESXi hosts at the protected site (but in maintenance mode, as they obviously can’t see the storage located at the protected site and so can’t be used by vCD). After a failover, the recovery hosts are taken out of maintenance mode and vSphere HA handles restarting the recovered workloads on them. However, this solution did involve manual intervention to fail over the vCD resources correctly:
http://www.yellow-bricks.com/2012/02/13/vcloud-director-infrastructure-resiliency-solution/
http://www.vmware.com/files/pdf/techpaper/vcloud-director-infrastructure-resiliency.pdf

Earlier this year, another white paper was released which described how the majority of this manual process (ie the VMware bits) could be automated using PowerCLI:
http://www.vmware.com/files/pdf/techpaper/VMware-vCloud-Directore-Infrastructure-resiliency-whitepaper.pdf

However, what’s missing is the automation of the whole storage piece – breaking the replication and making the volumes read/write….. but then I guess this is really more storage-vendor dependent! =)
I guess if the storage vendor has exposed the array to VMware using VASA then it could be possible to script the storage steps as well….! =)
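Just to get the order of the failover steps straight in my own head, here is a rough Python sketch of the sequence as I understand it from the articles above. Everything in it is an assumption for illustration: the `RecoverySite` class, the function names, and the host names are hypothetical, and the storage step is only a placeholder because the real commands depend entirely on the array vendor. It is not the PowerCLI from the white paper.

```python
from dataclasses import dataclass, field


@dataclass
class RecoverySite:
    """Hypothetical model of the recovery site's state (not a real API)."""
    volumes_writable: bool = False
    hosts_in_maintenance: set = field(
        default_factory=lambda: {"esx-dr-01", "esx-dr-02"})
    log: list = field(default_factory=list)


def break_replication_and_promote(site):
    # Placeholder for the vendor-specific array commands that break
    # replication and make the replicated volumes read/write.
    site.volumes_writable = True
    site.log.append("storage: replication broken, volumes read/write")


def exit_maintenance_mode(site):
    # The recovery hosts are useless until they can see writable storage.
    if not site.volumes_writable:
        raise RuntimeError("promote the replicated volumes first")
    for host in sorted(site.hosts_in_maintenance):
        site.log.append(f"host: {host} out of maintenance mode")
    site.hosts_in_maintenance.clear()


def run_failover(site):
    """Run the steps in order and return the log of what happened."""
    break_replication_and_promote(site)
    exit_maintenance_mode(site)
    site.log.append("ha: vSphere HA restarts the recovered workloads")
    return site.log


for entry in run_failover(RecoverySite()):
    print(entry)
```

The guard in `exit_maintenance_mode` is the bit that matters: the storage piece has to happen first, which is why leaving it out of the automation is such a gap.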

Anyways, it’s been an interesting read…… and definitely a problem I see VMware sorting out for the next release of SRM!

Given how powerful PowerCLI is, I really need to find some time to learn how to use it!!


Back to Blogging……

So I know I said I wasn’t going to blog much in the coming weeks, but given the fact that my jury service has been cancelled next week (the court case was cancelled, so the jury was dismissed as no other court cases were running) and also the fact that my current work project has been cancelled (the client cancelled the contract with my company), I pretty much have quite a bit of free time!

Not to mention that I had a sleepless night as all I could think about was that I NEED to blog some of the stuff that’s floating around my head regarding VMware – just so I can put my brain at rest!

So hopefully in the upcoming weeks, I intend to blog about the experiences I’ve had over the past couple of months touching upon:

  • Changing the SSL certificates of VMware products (away from the self-signed VMware ones to a CA certified one).
  • Transact-SQL scripts for creating databases for VMware products.
  • A load-balancing workflow that I wrote recently to automate the deployment of a load balancer in vCD (which I hope to generalise so that others can use it).

That should basically fill out my blog for a couple of weeks due to the vast amount of information to get down on paper (or in this case on screen).

First up tomorrow (yes, procrastination doesn’t disappear even when you have some free time!) will be a brief look at how you manually set up a load balancer within vCD, and then hopefully I can delve into how the vCO actions can be used for each manual step and what I’ve learnt.

Oh, and as for the job hunting part….. I’m quite thankful that at the moment it seems recruitment agents are calling me up rather than me desperately calling them up! I’m positive that I will be able to find another role that will allow me to continue my VMware journey! (and if you’re a potential employer, or recruitment agent reading this – please contact me if you have any opportunities of interest!)

^_^