For a number of businesses the conversation around business continuity and disaster recovery remains a difficult one. Many companies have already invested the capital in an all-virtualized server farm set up in an N+1 configuration, with weekly backups being taken offsite. With the IT staff already stretched thin, it’s easy to ask why another set of expensive servers, sitting idle, is needed. It can make a BC/DR solution feel like buying flood insurance in Arizona. The unfortunate reality, however, is that floods do happen.
It’s no surprise, then, that cloud DR has become so alluring. There are no upfront server costs, the remote infrastructure is already in place, and there may already be workloads in the cloud. The one thing that could keep it from being a slam dunk is the management of it all. Thankfully, the number of tools available to manage BC/DR in the cloud is growing steadily, providing automation and numerous connectivity options. Microsoft’s entry into BC/DR management, Site Recovery, follows that trend, offering familiar automation tools and native replication for a host of Microsoft products, like Exchange and SQL.
Far from being a simple DR site, Site Recovery is a management tool that provides the options of replicating local workloads to the cloud, from the cloud to a local site or even from one local site to another. The service is hosted in Azure, but Azure storage and VMs are not required.
The service can work with individual VMs, including VMware VMs. However, if Hyper-V VMM is running locally, the replication agent can do all its work through VMM.
Since the replication is done at the VM level, there is no requirement to have servers running at the replication site until they are needed. The VM image updates are stored in VHDs which will be joined to VMs in the event of a failover. For Site Recovery using Azure this translates into no monthly compute charge unless a failover event occurs. There is a per-VM monthly charge for the service and the storage consumed, but that is it. The fees are dependent on the level (if any) of the Volume Licensing agreement with Microsoft.
The initial time to get a VM ready for failover depends on the size of the VHD. It can be uploaded via the web or the initial replication image can be mailed to a datacenter for local upload. From there just the delta changes would go across the internet. When transferring across the wire, Site Recovery data is encrypted and there is also an option to encrypt the data at rest.
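To decide between uploading the initial image and mailing it, a rough transfer-time estimate helps. The sketch below is only back-of-the-envelope arithmetic and not part of Site Recovery; the function name, the `efficiency` factor (the share of the uplink actually usable for replication) and the example figures are all assumptions made for illustration.

```python
def transfer_hours(size_gb: float, uplink_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to push size_gb over an uplink of uplink_mbps.

    Hypothetical helper: efficiency is an assumed fraction of the link
    that replication traffic can really use (protocol overhead, other users).
    """
    if size_gb < 0 or uplink_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, uplink > 0, efficiency in (0, 1]")
    megabits = size_gb * 8 * 1000          # decimal gigabytes -> megabits
    seconds = megabits / (uplink_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Illustrative numbers only: a 500 GB VHD over a 100 Mbps uplink.
    print(f"Initial upload: about {transfer_hours(500, 100):.1f} hours")
```

If the estimate runs to days rather than hours, mailing the initial image is probably the better choice; once the first copy is in place, only the deltas have to fit through the link.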
Once the VMs are ready to be included in a failover strategy, Site Recovery can be used to monitor for failover events and then automate the process based on steps detailed in a failover plan. This can range from bringing up a single server to cover a failed host to orchestrating an entire environment transition that includes shutting down servers in the primary data center and bringing them up in the recovery site. The startup sequence can even be specified.
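To make the idea of an ordered failover plan concrete, here is a minimal sketch of how such a plan could be modeled. It does not use the Site Recovery API; `BootGroup`, `failover_steps` and the server names are hypothetical and exist only to show the sequencing: shut down the primary servers in reverse boot order, then start the recovery copies in boot order.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BootGroup:
    """Hypothetical grouping: VMs sharing an order value start together."""
    order: int
    vms: Tuple[str, ...]


def failover_steps(groups: List[BootGroup], shutdown_primary: bool = True) -> List[str]:
    """Return the ordered list of actions a failover plan would perform."""
    ordered = sorted(groups, key=lambda g: g.order)
    steps: List[str] = []
    if shutdown_primary:
        # Stop front ends before the back ends they depend on.
        for group in reversed(ordered):
            steps.extend(f"shutdown primary:{vm}" for vm in group.vms)
    # Start dependencies (e.g. databases) before the servers that use them.
    for group in ordered:
        steps.extend(f"start recovery:{vm}" for vm in group.vms)
    return steps


if __name__ == "__main__":
    plan = [BootGroup(2, ("web01", "web02")), BootGroup(1, ("sql01",))]
    for step in failover_steps(plan):
        print(step)
```

Passing `shutdown_primary=False` covers the simpler case of a failed host, where there is nothing left to shut down and the plan only brings up the replicas.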
Of course a failover plan is only useful if it actually works, and the only way to ensure it works is to test it. This is the most overlooked step in almost any backup solution (even simple data backups). Site Recovery makes this step almost easy by leveraging the flexibility of Azure Virtual Networks. The testing can even be put on a schedule. When a test is started Site Recovery will spin up the failover VMs in an isolated network so they can be running at the same time as production. Users can then connect to the recovered servers and verify everything is working. When the test is complete, Site Recovery will tear down the VMs so there is no manual cleanup required. Depending on the workloads being tested there may be some additional steps involved, for example if SQL is running in an HA group, but nothing insurmountable.
With all of the features it boasts, Site Recovery can stand on its own as a robust BC/DR package, but as the blog title suggests Site Recovery allows for more than just DR. Once a server is replicated in Azure it is a very simple process to spin up a new VM based on the image. Need a place to test a new rollout or track down a bug with production data? Just spin up a few new servers in an isolated network (the same as a failover test) and you have it. A new product release is stressing your web site and you need some additional temporary coverage? Done. In fact, any situation where extra compute would be needed can be addressed by spinning up servers from the replicas.
The last, and possibly most powerful, option to cover is migration. As companies contemplate what workloads they can shift to the cloud, there will need to be a plan for migration. Site Recovery removes a lot of the unknowns from that discussion. Once Site Recovery is working, simply spin up a new production environment off of the replicated hosts and redirect traffic to Azure. The local instance can now become the DR site, and Site Recovery can be used to replicate down from Azure by reversing the replication direction. Not only does this provide a quick migration strategy, but if the migration doesn’t go as planned, the fallback is already in place and proven. It has even been kept up to date by Site Recovery.
If you want to learn more about what’s involved in setting up and working with Site Recovery, the pricing and documentation details can be found at https://azure.microsoft.com/en-us/services/site-recovery/