
High Availability in Storage Systems

Introduction
In today's connected world, information and communication have become vital to every sphere of life. Whether for an individual or a business, data is the lifeblood of daily existence; the large-scale panic caused by Twitter outages is proof of this fact of life today. For businesses, even brief downtimes can result in substantial losses, and long-term downtimes resulting from human and natural disasters can cripple a business and bring it to its knees. According to Dun & Bradstreet, 59% of Fortune 500 companies experience a minimum of 1.6 hours of downtime per week, which translates into roughly $46 million per year. According to Network Computing, the Meta Group and Contingency Planning Research, the typical hourly cost of downtime varies from roughly $90,000 for media firms to about $6.5 million for brokerage services. Thus, depending on the nature and size of the business, the financial impact of downtime can vary from one end of the spectrum to the other. Often, the impact of a downtime cannot be predicted accurately: while there are obvious impacts in terms of lost revenue and productivity, there can also be intangible impacts, such as damage to brand image, that have less obvious but far-reaching effects on the business.

Disaster Recovery is not High Availability


Disaster Recovery (DR) has become a buzzword in every enterprise today. In a volatile and uncertain world, it is extremely important to plan for contingencies that protect against possible disasters. Disasters can be software related or hardware related. Software disasters may result from viruses and other security threats such as hacking, or from accidental or malicious deletion of data. Hardware disasters may result from the failure of components such as drives, motherboards and power supplies, or from natural and man-made site disasters such as fire and flooding. Different disasters need different recovery strategies: software failures can be protected against using techniques such as Snapshots and Continuous Data Protection (CDP), while hardware failures can be recovered from by building component redundancies into the system (such as RAID and redundant power supplies) and by ensuring that the data is backed up on alternate media using D2D and D2D2T backup methodologies and synchronous and asynchronous replication strategies. Thus, disaster recovery is one aspect of the Business Continuity (BC) strategies that a company must employ to minimize the impact of downtime. Disaster recovery is often measured in terms of two Service Level Agreement (SLA) objectives: Recovery Point Objective (RPO) and Recovery Time Objective (RTO). RPO represents the acceptable amount of data loss on a disaster, measured in time. RTO represents the amount of time within which the business must be restored after a disaster. While a DR strategy focuses on the effectiveness of the recovery process after a disaster, it does not focus on keeping the data available without any downtime.

Availability is measured as the ratio of mean time between failures (MTBF) to the sum of MTBF and mean time to repair (MTTR). Thus, availability indicates the percentage of time the system is available throughout its useful life. As mentioned earlier, one of the primary goals of disaster recovery strategies is to minimize the RTO (downtime). Since MTTR is a measure of the downtime and must meet the RTO objective, a comprehensive disaster recovery strategy must also encompass strategies to increase availability. Thus, while DR strategies are not strictly availability strategies, they do meet availability requirements to an extent.
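As a minimal illustration of this relationship (the MTBF and MTTR figures below are hypothetical, chosen only for the example), availability can be computed directly from the two quantities:

# Availability = MTBF / (MTBF + MTTR)
# The figures here are hypothetical; they only illustrate how shrinking
# MTTR (the downtime per failure) raises availability.

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of its useful life that the system is available."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

if __name__ == "__main__":
    for mttr in (8.0, 1.0, 0.1):                      # hours to repair
        a = availability(mtbf_hours=10_000.0, mttr_hours=mttr)
        print(f"MTBF=10000h, MTTR={mttr}h -> availability {a:.5%}")

As the sketch shows, for a fixed MTBF, every reduction in repair time pushes availability closer to 100%, which is why minimizing MTTR (and hence RTO) matters so much.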

Figure 1: System Availability (alternating periods of uptime and downtime over the system's life)

Classes of Availability
Availability is often expressed as a percentage of the time the system is available. An availability of about 90-95% is sufficient for most applications. However, for extremely critical business data, that level of availability is simply not enough. As mentioned before, for businesses such as brokerage services and those offering online services, downtime of more than a few minutes a year could have significant operational impacts. For example, 99.9% availability typically means about 9 hours of downtime per year, and the financial and other impacts of such downtime could spell trouble for the business. Truly highly available solutions offer availability of 99.999% (five nines) or 99.9999% (six nines), which corresponds to downtime on the order of a few seconds to a couple of minutes per year. Data protection mechanisms can be grouped into classes based on the availability they deliver. Figure 2 shows a pyramid of various data protection strategies: as one goes up the hierarchy, the downtime decreases and hence the availability increases. The top two levels of the pyramid constitute strategies that represent true high availability (five nines and six nines).
Figure 2: Classes of Data Protection
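To put these classes in perspective, the following short sketch (assuming a 365-day year; the list of classes is illustrative) computes the annual downtime each class of availability allows:

# Annual downtime implied by common availability classes.
# The arithmetic follows directly from availability = uptime / total time.

SECONDS_PER_YEAR = 365 * 24 * 3600

classes = [0.99, 0.999, 0.9999, 0.99999, 0.999999]

for a in classes:
    downtime_s = (1.0 - a) * SECONDS_PER_YEAR
    print(f"{a:.4%} available -> {downtime_s / 3600:.2f} hours "
          f"({downtime_s / 60:.1f} minutes) of downtime per year")

Running this confirms the figures quoted above: three nines allows roughly 8.8 hours per year, five nines about 5 minutes, and six nines about half a minute.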

Active/Active Dual Controllers: SBB


The fundamental way to make a storage system highly available is to make each and every component of the system redundant. This includes the processors, memory modules, network and other host-connectivity ports, power supplies, fans and other components. In addition, the drives are configured in RAID to ensure tolerance of drive failures. Even so, the disk array controller (RAID controller) and the motherboard of the system remain single points of failure. Storage Bridge Bay (SBB) is a specification created by a non-profit working group that defines a mechanical/electrical interface between a passive backplane drive array and the electronics packages that give the array its personality, thereby standardizing storage controller slots. One chassis can host multiple controllers that can be hot-swapped (Figure 3). This ability to have multiple controllers means that the system is protected against controller failures as well, giving it true high availability. Such a configuration is not without its challenges, however. One of the primary challenges arises from the fact that it hosts two intelligent controllers within the same unit that share a common midplane and drive array. Since the drive array is shared, the two controllers must exercise a mutual exclusion policy on the drives to ensure that they do not modify the same data simultaneously, causing data corruption and inconsistencies. The RAID module on each controller must therefore be cluster-aware to avoid such collisions and handle the resulting conflicts. Further, the two controllers each keep their own cache of the metadata and data stored on these drives. These two caches need to be kept synchronized so that one controller can resume the activities of the other upon its failure.

Figure 3: Storage Bridge Bay
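A minimal sketch of the kind of mutual exclusion the two controllers must agree on is shown below. The class and method names are hypothetical, not part of the SBB specification; a real cluster-aware RAID stack would coordinate ownership through inter-controller messaging and on-disk state rather than a local data structure.

# Hypothetical sketch of per-extent mutual exclusion between two
# controllers sharing a drive array.
import threading

class SharedExtentLocks:
    """Tracks which controller currently owns each drive extent."""

    def __init__(self):
        self._owners = {}            # extent id -> controller id
        self._guard = threading.Lock()

    def acquire(self, extent_id: int, controller_id: str) -> bool:
        """Grant the extent to a controller only if no peer owns it."""
        with self._guard:
            owner = self._owners.get(extent_id)
            if owner is None or owner == controller_id:
                self._owners[extent_id] = controller_id
                return True
            return False             # peer owns it: retry later or forward the IO

    def release(self, extent_id: int, controller_id: str) -> None:
        with self._guard:
            if self._owners.get(extent_id) == controller_id:
                del self._owners[extent_id]

The essential design point is that a controller never writes an extent it does not own; an IO that targets a peer-owned extent must be forwarded or deferred, which is exactly the collision handling the cluster-aware RAID module provides.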

Figure 4: Dual Redundant Controller unit (a dual redundant unit IP SAN: Controllers A and B, each hosting SAN volumes, cache and RAID, attached to the IP network, linked to each other over SAS x4 and GbE links, and sharing a SAS expander, shared SAS disks and dual power supplies PSU 1 and PSU 2)

In order to carry out this clustered RAID communication and maintain cache coherency, the two controllers need a set of (preferably) dedicated communication channels. A combination of more than one communication channel, such as the SAS fabric and Ethernet connections, can be employed here to ensure minimal performance impact and to build redundancy into this communication layer as well. As with all dual-redundant intelligent clusters, the loss of inter-node communication can result in the two controllers losing cache coherency. Further, once communication is lost, each controller may try to take over the operation of its peer, resulting in a split-brain scenario. To handle this split-brain scenario, the two controllers also need to maintain a quorum, using dedicated areas of the shared drive array, to avoid conflicts and data corruption. The key advantage of such a dual-controller setup is that it is almost fully redundant, with hot-swappable components. However, despite the controllers being redundant, the midplane connecting the controllers to the drive backplane is still shared, making it a single point of failure.
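The sketch below conveys the idea of arbitrating a split brain through a shared quorum area. The file path, record format and timeout are hypothetical stand-ins for a dedicated region of the shared drive array; a real implementation would rely on atomic primitives (for example, SCSI persistent reservations) rather than this simple read-then-write.

# Hypothetical quorum arbitration sketch. QUORUM_PATH stands in for a
# reserved region of the shared drive array.
import json
import os
import time

QUORUM_PATH = "/tmp/quorum_record.json"   # hypothetical placeholder
CLAIM_TIMEOUT_S = 30                      # a peer claim older than this is stale

def try_claim(controller_id: str) -> bool:
    """Return True if this controller may take over the peer's workload."""
    now = time.time()
    record = {}
    if os.path.exists(QUORUM_PATH):
        with open(QUORUM_PATH) as f:
            record = json.load(f)
    owner = record.get("owner")
    stamp = record.get("timestamp", 0)
    if owner in (None, controller_id) or now - stamp > CLAIM_TIMEOUT_S:
        with open(QUORUM_PATH, "w") as f:
            json.dump({"owner": controller_id, "timestamp": now}, f)
        return True
    return False                          # peer holds a fresh claim; stand down

Because both controllers consult the same on-disk record, at most one of them proceeds with the takeover even when the inter-controller links are down, which is what prevents the split brain from corrupting data.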

High Availability Cluster


The highest class of availability in the availability pyramid is achieved using High Availability (HA) Clusters. As mentioned before, while dual controllers are extremely robust and fault tolerant, they are still susceptible to midplane failures. In addition, since the drive array is shared between the two controllers, RAID is the only form of protection available against drive failures. HA Clusters are clusters of redundant storage nodes that ensure continuity of data availability despite component failures and even the failure of an entire storage node. This represents the highest form of availability possible (six nines). In comparison to SBB-based dual-controller nodes, HA Clusters (Figure 5) do not suffer from any single point of failure. In addition, since the drive arrays are not shared between the two systems, each system has its own RAID configuration, making the HA Cluster resilient to more drive failures than an SBB setup. Unstable data center environments, where rack disturbances are a common cause of premature drive failures, make dual-controller nodes more prone to system failures than HA Clusters. And finally, HA Clusters are also resilient to site failures, making them (with a DSM) the best-in-class availability solution.

Figure 5: HA Cluster

However, SBB-based dual-controller units have a lower disk count, making them more power efficient and a greener solution with a smaller data center footprint. HA Clusters also encounter the split-brain syndrome associated with dual-controller nodes. Unlike dual-controller nodes, however, this problem cannot be addressed using a quorum disk, as the two units do not share a drive array. One way to address this problem is to have a client-side device specific module (DSM) that performs the quorum action on a split brain. The DSM sits in the IO path and decides which path each IO is sent down (a sketch of this decision logic follows Figure 7). In addition, it keeps track of the status of the HA Cluster, namely whether the two nodes are synchronized, and permits a failover from one system to the other only when the two nodes are completely synchronized. The drawback of a client-side DSM is that the HA Cluster becomes dependent on the client. Also, if the clients are themselves clustered, each client needs a distributed DSM that communicates with its peers.

Figure 6: DSM based HA Cluster (an IP-SAN in which a network switch connects the clients to two iTX Storage nodes)

A client-agnostic HA Cluster can be created if we understand the causes of a split-brain scenario in an HA Cluster. Typically, a split-brain scenario that causes data corruption occurs when the network path between the storage nodes has failed, severing communication between the two nodes, while the client can still access both nodes. In this scenario, both storage nodes will try to take over cluster ownership, and unless the client has some way of knowing the rightful owner of the cluster, IOs could be sent to the wrong storage node, causing data corruption. Thus, an HA Cluster in which the storage nodes have lost contact with each other while the connections from the client to both storage nodes remain alive is the cause of the split-brain scenario. As can be seen in Figure 6, such a setup is not a true high availability solution, since it does not provide path failover capability; true HA setups are therefore not prone to split-brain syndrome in HA Clusters. Figure 7 shows one such network configuration that supports client-agnostic HA Cluster configurations.

Figure 7: Client agnostic HA Configuration
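The following is a rough sketch of the per-IO decision a client-side DSM of the kind described above might make. The node states and function names are hypothetical; they only illustrate the rule that failover is permitted solely when the nodes are fully synchronized.

# Hypothetical client-side DSM logic: route IOs to the active node and
# allow failover only when the cluster reports the nodes as synchronized.
from dataclasses import dataclass

@dataclass
class NodeStatus:
    name: str
    reachable: bool
    is_cluster_owner: bool
    in_sync: bool          # replicated data fully synchronized with the peer

def choose_target(primary: NodeStatus, secondary: NodeStatus) -> str:
    """Return the node an IO should be sent to, or raise if no path is safe."""
    if primary.reachable and primary.is_cluster_owner:
        return primary.name
    # Fail over only if the secondary is both reachable and fully in sync;
    # otherwise writes could land on stale data (split-brain risk).
    if secondary.reachable and secondary.in_sync:
        return secondary.name
    raise RuntimeError("No safe path: holding IOs until the cluster recovers")

Holding IOs when neither path is safe is the quorum action itself: by refusing to write to an unsynchronized node, the DSM prevents the two storage nodes from diverging.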

True High Availability


A truly highly available data center is one in which every component in the system, not just the storage, is highly available. This includes the client machines, the application servers (such as databases and mail servers), the network switches, the network paths from clients to the application servers, and the network paths from application servers to the storage systems. Figure 8 shows a true HA setup in which every component has redundancy built into it. Thus, a failure in any single component, say a switch or a network path, does not make the system unavailable.

Figure 8: True High Availability

Summary
Thus, true storage high availability can only be ensured when there is redundancy built into every component of the storage subsystem. Dual redundant controller setups and HA Cluster setups are two such configurations that deliver best-in-class availability. Each has its own advantages and drawbacks, but a combination of these approaches, together with application server and network path redundancies, delivers a truly highly available data center setup.

For more information about High Availability in Storage Systems, visit http://www.amiindia.co.in or contact sales@amiindia.co.in.
