All.Net

Fri Apr 8 06:51:40 PDT 2016

Redundancy: Data center redundancy: How many data centers are required?

Options:

Option 1: Use a single data center well protected from all identified threats.
Option 2: Use two data centers, a primary and a backup.
Option 3: Use more than two data centers distributed across the regions where the business functions.

Decision:

IF the enterprise is small and operates from a single location OR risk is Low from loss of data center internal computing capability OR the data center is unique and too expensive to duplicate, THEN Use a single data center well protected from all identified threats.
OTHERWISE IF the enterprise is medium sized OR if the enterprise has significant information processing facilities in more than one location OR if research and development are at a different facility than production operations, THEN Use two data centers, a primary and a backup.
OTHERWISE Use more than two data centers distributed across the regions where the business functions.
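
The decision rule above can be sketched as a small function. This is only an illustrative reading of the rule, not a definitive implementation: the `Enterprise` fields and the name `data_centers_required` are hypothetical, and the sketch assumes that "small and operates from a single location" is one combined condition that is then ORed with the other two.

```python
from dataclasses import dataclass


@dataclass
class Enterprise:
    # All field names are hypothetical; they mirror the conditions in the rule.
    size: str                               # "small", "medium", or "large"
    single_location: bool                   # operates from a single location
    low_risk_from_loss: bool                # risk is Low from loss of internal computing
    unique_and_too_expensive: bool          # data center too costly to duplicate
    processing_in_multiple_locations: bool  # significant processing in more than one location
    rnd_separate_from_production: bool      # R&D at a different facility than production


def data_centers_required(e: Enterprise) -> str:
    """Apply the IF / OTHERWISE IF / OTHERWISE rule in the order it is written."""
    # Assumption: "small AND single location" is grouped before the ORs.
    if ((e.size == "small" and e.single_location)
            or e.low_risk_from_loss
            or e.unique_and_too_expensive):
        return "single"
    if (e.size == "medium"
            or e.processing_in_multiple_locations
            or e.rnd_separate_from_production):
        return "primary and backup"
    return "more than two"
```

Because the clauses are evaluated in order, an enterprise that matches the first clause never reaches the later ones, just as in the written rule.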

Basis:

Use a single data center well protected from all identified threats.
For most small businesses, the cost of redundant data centers is fairly high, and data is rarely so critical to operations that multiple data centers are justified. A better strategy is often to retain backups off-site, perhaps in a bank safe deposit box or a fire-resistant safe rated for media. If and when disaster strikes, the backups can be used for recovery without high losses to the business. The cost is low, the consequences are relatively low, and recovery resources are spent only if and when recovery is needed. Of course, the backup and recovery process must be tested periodically to assure that it will in fact function. Similarly, a well-protected data center is adequate for a medium-sized business operating from a single facility, because there is no reason to protect the data center more than the rest of the business it supports. Finally, some data centers are very expensive and unique, such as a supercomputer center in a highly focused business. In this case, the cost of duplicating such a high-valued facility might make the business infeasible, so the risk has to be accepted and local mitigation used to the extent feasible.

Use two data centers, a primary and a backup.
For medium-scale businesses that are geographically diverse or highly dependent on information technology, and for large enterprises that are not geographically diverse, a primary and a secondary data center are appropriate to assure continuity of operations across facility-related failures without long delays in recovery. Backups are mandatory, and they should be reflected in the backup facility in a timely fashion so that recovery and continuity of operations are assured at all times and within time frames that prevent serious negative consequences to the information utility of the company. The backup site should also be staffed with adequate personnel to continue operations if the primary fails catastrophically and people cannot be transported to the backup site. Putting all of the enterprise eggs in one basket or allowing single points of failure is irresponsible. A common strategy is to put research and development in one facility and operations in the other. This provides the facilities required to do proper change control and testing on realistic systems and enough computing power for effective R&D. The R&D staff may even be able to run operations in an emergency for a limited time.

Use more than two data centers distributed across the regions where the business functions.
For any large enterprise that is geographically distributed, or any medium-scale enterprise with high dependency on information technology and geographic distribution, data centers should be in place in the major regions of operation. They support critical business functions, afford higher performance for the local area, and retain appropriate expertise in multiple facilities so that business operations can continue even if regional disasters or government failures take place. The larger and more distributed the company, the more opportunities there are for geographic distribution and redundancy. Not all data centers must have copies of all content. Rather, the distribution of content over data centers and the levels of redundancy should be determined by the utility of local versions of information combined with the business impacts of failures. As in the two data center case, recovery times are important to understanding the design of the redundancy. Infrastructure and other dependencies should be considered, and personnel redundancy is critical.

In all cases: Backup facilities, backups, and backup and recovery processes should be tested and verified periodically. This is typically done at least once per year as part of business continuity planning efforts. Backups should be verified as they are taken, so that loss of backup data because of media failures is never at issue in the short run. Redundancy is a complex subject, and the exact number of redundant data centers is highly dependent on the criticality of information. Many financial institutions have five or more data centers, each capable of running all financial transactions. Many companies are sufficiently diverse that they have limited redundancy for individual systems but many facilities with data centers holding different capabilities, so that only a small fraction of the business fails if a data center is lost. Less diverse businesses and businesses undergoing data center consolidation for cost savings sometimes end up with inadequate redundancy. Some large enterprises have had complete business failure because of a single point of failure in a business-critical system.
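
The advice above that backups should be verified as they are taken can be illustrated with a minimal sketch. It assumes a file-level backup and uses a SHA-256 comparison as the verification step; the helper names `sha256_of` and `backup_and_verify` are hypothetical, and real backup tooling will differ.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir, then confirm the copy matches the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies the content and file metadata
    # Verify at the time the backup is taken, not only when recovery is needed.
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup verification failed for {source}")
    return target
```

A check like this catches a bad copy or failing media immediately. It does not replace the periodic recovery tests described above, which exercise the whole restore process.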