Disaster Recovery: What You Need to Know Before a Crisis Hits
Although disaster recovery (DR) may be one of the most important elements of a reliable infrastructure, it is often a missed step. Many companies opt out of implementing DR or implement it only halfway. Here, we discuss the ins and outs of disaster recovery, both in general and with respect to cloud computing and its overall impact.
Terms
When discussing disaster recovery, it is important to understand the terms involved.
RTO: (recovery time objective) the maximum acceptable time to have operations up and running again, typically at the disaster recovery site, after a disaster.
RPO: (recovery point objective) the maximum acceptable amount of data loss, measured in time. It is the maximum age of the backed-up data needed to resume operations after a disaster. If the RPO is set at 1 hour, then backups should occur at least every hour so as not to exceed the RPO. The lower the RPO, the less data is lost.
Regional: determine how far from the primary site is too far, and how close is too close. A solution can be unviable if the disaster is too massive for the area covered, for example when the secondary site is close enough to be hit by the same event, and a site that is too distant can be equally impractical.
Cost: the cost of maintaining an effective disaster recovery plan is also a factor. It will determine whether or not a secondary site is feasible.
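The RPO and RTO definitions above can be turned into a simple check. The following is a minimal sketch, not a definitive implementation: the function names are hypothetical, and it assumes backups start on a fixed schedule and only become usable once they finish, so the worst-case data loss is the backup interval plus the backup duration.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta,
                         backup_duration: timedelta = timedelta(0)) -> timedelta:
    """Worst-case age of the newest usable backup at the moment of failure.

    Assumption: a backup captures the state at its start time and is only
    usable once it completes, so a failure just before completion falls
    back to the previous backup.
    """
    return backup_interval + backup_duration


def meets_rpo(backup_interval: timedelta, rpo: timedelta,
              backup_duration: timedelta = timedelta(0)) -> bool:
    """True if the backup schedule keeps worst-case data loss within the RPO."""
    return worst_case_data_loss(backup_interval, backup_duration) <= rpo


def meets_rto(measured_recovery_time: timedelta, rto: timedelta) -> bool:
    """True if a measured (for example, rehearsed) recovery fits within the RTO."""
    return measured_recovery_time <= rto


if __name__ == "__main__":
    rpo = timedelta(hours=1)
    # Hourly backups that complete instantly just meet a 1-hour RPO.
    print(meets_rpo(timedelta(hours=1), rpo))
    # Hourly backups that take 10 minutes to complete do not.
    print(meets_rpo(timedelta(hours=1), rpo, timedelta(minutes=10)))
```

Under these assumptions, a 1-hour RPO with backups that take time to complete calls for a backup interval shorter than one hour.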
Regional vs. Domain-Level DR
Depending on the needs, disaster recovery can be implemented at a regional or domain level. Allocating high availability (HA) at the domain level and DR at the regional level is recommended. In terms of cost and value, the cloud, with its variety of providers, is a good option for both HA and DR.
Hot, Cold, Warm?
DR sites are commonly classified as hot, warm, or cold. A hot site is up and running and ready for failover immediately. A warm site has the resources in place, but only the critical components are ready to go; failover into a warm DR site is possible only once everything else is in order, typically a few minutes later. A cold standby site receives updates, though not frequently. Failover to a cold site takes longer, often longer than the desired RTO.
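One way to reason about the hot/warm/cold distinction is to compare each tier's expected failover time with the RTO. The sketch below is illustrative only: the `StandbyTier` enum, the `least_ready_tier_for` function, and especially the failover times are hypothetical placeholders, and real values depend on the architecture and should come from tested failovers.

```python
from datetime import timedelta
from enum import Enum


class StandbyTier(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


# Hypothetical failover times, for illustration only.
ASSUMED_FAILOVER_TIME = {
    StandbyTier.HOT: timedelta(seconds=30),
    StandbyTier.WARM: timedelta(minutes=10),
    StandbyTier.COLD: timedelta(hours=4),
}


def least_ready_tier_for(rto: timedelta) -> StandbyTier:
    """Return the least-ready standby tier whose assumed failover time fits the RTO."""
    # Check from least ready (cold) to most ready (hot).
    for tier in (StandbyTier.COLD, StandbyTier.WARM, StandbyTier.HOT):
        if ASSUMED_FAILOVER_TIME[tier] <= rto:
            return tier
    raise ValueError(f"No standby tier can meet an RTO of {rto}")


if __name__ == "__main__":
    for rto in (timedelta(hours=8), timedelta(minutes=15), timedelta(minutes=1)):
        print(rto, "->", least_ready_tier_for(rto).value)
```

With these placeholder numbers, an RTO of several hours can be served by a cold site, while an RTO of about a minute requires a hot site.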
Is Domain-Level DR Worth It?
The decision to have DR at the domain level is difficult. The challenges include availability zones, power, and cooling. A zone comprises one or more data centers. One or, more commonly, three zones make up a region. Zones are used for HA. Network connectivity between zones is very fast, with latency of a couple of hundred microseconds. At that speed, data replication across zones is close to instant, which can make RPOs a non-issue within a region. DR solutions may still be necessary within the zones. When that is the case, the setup is called an active/passive configuration, in which the secondary site stays paused until it is needed.
Impact
The idea that a more responsive secondary site is always more expensive is not necessarily true. It all depends on the architecture, implementation, and execution. The amount of information held at each site determines the economic impact; the size of the infrastructure and of the replication setup has less to do with it. If the secondary infrastructure is kept stopped, operations can be resumed quickly without a large impact on expenses. This depends to some degree on the cloud provider and the cloud manager. Automation also factors into the economic impact. Recovery can be fully automated, semi-automated, or entirely manual, and the choice is up to the user.
DR Plan
The disaster recovery plan is the most vital part of DR overall. The DR plan lays out the processes to be enacted when disaster hits. It spells out who needs to do what, and when. Writing the plan is one thing, but continuous testing is key. To learn more about disaster recovery strategies, check out Rackware’s DR solution and its other cloud management offerings.