How to Create a Good Data Center Disaster Recovery Plan
By: Kaylie Gyarmathy on June 10, 2019
Disaster can take many forms when it comes to IT infrastructure. Cyberattacks that take down networks, power outages that threaten server uptime, or natural disasters that bring down local infrastructure all have the potential to seriously disrupt networks and endanger valuable data. Fortunately, a good data center disaster recovery plan can help a facility prepare to deal with almost any eventuality.
Prepare for Anything (and Everything)
It’s dangerous to make assumptions about which disasters might pose a threat and which ones can be ignored. No one would be surprised to see a data center in California prepare for an earthquake, but it might sound like a waste of time for a facility in Indiana or Kentucky to plan for such an event. However, the United States Geological Survey (USGS) has estimated that there is a seven to ten percent chance of that region suffering a seismic event greater than 7.0 on the Richter scale sometime within the next 50 years. Those odds might not sound too worrisome, but a Federal Emergency Management Agency (FEMA) simulation of a 7.7 magnitude earthquake in that region found that more than 4,000 people could be killed and thousands more injured. With consequences on that scale, it would be foolish not to account for every potential risk, no matter how remote.
Aside from natural disasters, data centers need to prepare for unexpected disaster recovery risks. One of the best examples of this occurred in 2007 when a truck drove into a transformer near a data center operated by the web-hosting company Rackspace. While the facility’s backup generators kicked in, the disaster plan didn’t account for the local electric company cycling power on and off, which disrupted the cooling system and forced Rackspace to take several servers offline.
Evaluate and Prioritize Systems
In the wake of a disaster, it is unlikely that every system can be brought back online immediately. Some systems tolerate downtime better than others, while some are mission critical. This is especially important in a colocation facility, where the data center is responsible for facilitating the network services of its clients. SLA uptime commitments and compliance requirements must be taken into account when determining which systems should be restored first.
The time to make these determinations is before a disaster actually strikes. Data center personnel must have a clear idea of where critical assets are located within the facility and which systems have priority in the event of a data center outage. There should be no question of where to focus efforts and resources in the heat of the moment. Detailed inventories should be conducted regularly and made easily available to mitigate disaster recovery risks, ensuring that data center personnel have an accurate picture of the IT environment at all times.
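As a rough illustration of how such an inventory could drive restoration priority, here is a minimal Python sketch. The system names, rack locations, and downtime tolerances are hypothetical placeholders, and the ordering rule (mission-critical systems first, then the least downtime-tolerant) is only one reasonable assumption, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str                  # hypothetical system name
    location: str              # hypothetical rack or cage label
    mission_critical: bool     # True if the system cannot tolerate an extended outage
    max_downtime_minutes: int  # illustrative downtime tolerance, e.g. taken from an SLA


def restoration_order(inventory):
    """Return systems in the order they should be restored.

    Mission-critical systems come first; within each group, systems
    with the smallest downtime tolerance come first.
    """
    return sorted(
        inventory,
        key=lambda s: (not s.mission_critical, s.max_downtime_minutes),
    )


# Illustrative inventory; every entry is made up for this example.
inventory = [
    System("internal-wiki", "rack-12", False, 1440),
    System("customer-network-core", "rack-01", True, 5),
    System("billing-db", "rack-03", True, 60),
    System("reporting", "rack-09", False, 240),
]

for position, system in enumerate(restoration_order(inventory), start=1):
    print(f"{position}. {system.name} ({system.location})")
```

Keeping the location next to each entry means the same list tells personnel both what to restore first and where to find it.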
Identify Power Redundancies
Most third-party data center providers incorporate extensive power redundancies. Diesel-powered backup generators keep the power running even if the local power grid goes out. In the brief moments between the power cutting out and the generators starting up, battery-driven uninterruptible power supply (UPS) units provide continuous power to the facility. These systems are rated according to the amount of redundancy they provide:
N+1: Provides enough UPS units to meet all power needs plus one additional redundant unit.
2N: Provides full redundancy with double the number of UPS units needed to provide power.
2N+1: Delivers double the number of UPS units needed to cover power needs plus one additional redundant unit.
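To make the three ratings concrete, the short Python sketch below computes how many UPS units each scheme implies for a given baseline. The function name and the example baseline of four units are assumptions chosen for illustration only.

```python
def ups_units_required(n: int, scheme: str) -> int:
    """Return the total UPS units for a redundancy scheme.

    n is the number of units needed to meet the full power load
    with no redundancy at all.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    totals = {
        "N+1": n + 1,        # full load plus one spare unit
        "2N": 2 * n,         # a complete duplicate set of units
        "2N+1": 2 * n + 1,   # a duplicate set plus one spare unit
    }
    if scheme not in totals:
        raise ValueError(f"unknown redundancy scheme: {scheme}")
    return totals[scheme]


# Example: a facility that needs 4 units to carry its load (illustrative figure).
baseline = 4
for scheme in ("N+1", "2N", "2N+1"):
    total = ups_units_required(baseline, scheme)
    spare = total - baseline  # units that can fail while the load is still met
    print(f"{scheme}: {total} units installed, {spare} can fail")
```

The number of spare units is a quick way to compare how much failure each facility can absorb.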
Knowing what power redundancies a facility has in place is crucial to developing a realistic plan that accounts for various disaster recovery risks. A company that operates several data centers might not have the same systems in place in every facility, so it’s important for each one to have a unique disaster plan that takes its capabilities into account.
Back Up Data
Losing access to mission-critical data is often the biggest concern companies have when it comes to migrating systems to a cloud provider or colocation facility. Given the high costs of data center downtime, it’s no wonder that data availability keeps many decision-makers up at night. A disaster event represents the worst-case scenario because it could result in valuable data being lost forever. Considering that 93 percent of companies without a plan in place to deal with data loss go out of business within a year, the concerns are well-founded.
Fortunately, colocation data centers often have the geographically dispersed resources needed to provide robust data backup options for their customers. Leveraging a multi-data center strategy is one of the most important aspects of any data center disaster recovery plan, allowing a facility to keep critical data both accessible and secure in the event of a major data center outage.
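One simple way to check a multi-data center backup strategy is to confirm that every critical dataset has copies in more than one facility. The Python sketch below assumes a hypothetical mapping from dataset names to the sites holding a copy. The dataset names, site labels, and two-site minimum are all illustrative assumptions.

```python
def under_protected(datasets, min_sites=2):
    """Return the names of datasets stored in fewer than min_sites facilities.

    datasets maps a dataset name to the set of sites that hold a copy.
    """
    return sorted(
        name for name, sites in datasets.items() if len(sites) < min_sites
    )


# Hypothetical backup map; all names and sites are made up for this example.
backup_map = {
    "customer-db": {"site-east", "site-west"},
    "mail-archive": {"site-east"},
    "vm-images": {"site-east", "site-central", "site-west"},
}

for name in under_protected(backup_map):
    print(f"{name} has no copy outside its primary facility")
```

Running a check like this on a schedule can flag data that would be lost if a single facility went down.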
The best data center disaster plan in the world won’t amount to very much if the facility doesn’t use the plan to prepare its personnel for disaster recovery risks. When disaster strikes and the lights go out, everyone should know what their responsibilities are in terms of getting systems back online. Regular emergency drills not only evaluate response time and performance, but also identify potential weaknesses in the disaster plan that need to be addressed. From an infrastructure standpoint, integrated systems tests should also be conducted to ensure that all redundant equipment is functioning properly. It is better to discover a failure under the controlled conditions of a routine test than when the equipment is needed most.
Developing an effective data center disaster recovery plan is one of the most important steps any data center will take. By taking the particular characteristics of the facility into account and identifying the most critical tasks that must be carried out, data center managers can ensure that their personnel will be able to restore key services as quickly as possible in the event of a data center outage. This is especially important for colocation facilities committed to maintaining high levels of data availability for their customers.