
Lecture 15: Disks and RAID

Hard drive structure and properties


disk layout, scheduling, flash vs. magnetic drives
RAID
RAID, SLED, RAID controller
striping (raid 0)
mirroring (raid 1)
parity (raid 2-4)
striped parity (raid 5)
multiple "parity" disks: Reed-Solomon encoding (raid 6)
nested RAID (0+1, 1+0)

Disk
See the lecture slides from spring 2017.

RAID overview
RAID stands for Redundant Array of Inexpensive Disks (industry prefers to read the I as Independent).

Contrast this with SLED (Single Large Expensive Disk):
- RAID is cheaper (economy of scale: small/cheap disks are common)
- RAID has better failure characteristics (if your SLED fails, you're out of luck)

With RAID, many disks are connected to a RAID controller, a hardware device that manages the disks
and presents a single-disk interface to the operating system.

RAID arrays can be organized in many different ways to improve performance and reliability. Below we
discuss the key ideas behind the standard RAID levels.

Striping (RAID 0)
With striping, blocks of the filesystem are spread across the disks:

D1 D2 D3 D4
B1 B2 B3 B4
B5 B6 B7 B8
B9 B10 B11 B12
... ... ... ...
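
The layout above amounts to a round-robin assignment of blocks to disks. A minimal sketch of that mapping follows; the function name `locate_block` and the 1-based numbering are assumptions chosen to match the table, not something a real controller exposes.

```python
def locate_block(block_number, num_disks):
    """Map a 1-based block number to a 1-based (disk, row) pair under striping."""
    index = block_number - 1          # switch to 0-based arithmetic
    disk = index % num_disks + 1      # consecutive blocks rotate across the disks
    row = index // num_disks + 1      # a new row starts every num_disks blocks
    return disk, row


for b in (1, 4, 5, 10):
    disk, row = locate_block(b, 4)
    print(f"B{b} is on D{disk}, row {row}")
```

With 4 disks, B10 lands on D2 in the third row, which agrees with the table.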

Advantages:
- sequential read and write throughput is very high: all N disk heads can be used simultaneously
- stores N disks' worth of data

Disadvantages:
- increased risk of failure (if any one of the N disks fails, the entire filesystem is lost)
- if the single-disk failure chance is 10%, then the chance of failure for 4 disks is ~34%
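
The 4-disk figure can be checked with a short calculation. This sketch assumes that disks fail independently, which the notes do not state explicitly; the function name is illustrative.

```python
def raid0_failure_probability(p_disk, num_disks):
    """Chance that a striped array is lost: at least one disk fails.

    Assumes independent failures, each with probability p_disk.
    """
    p_all_survive = (1 - p_disk) ** num_disks
    return 1 - p_all_survive


print(f"{raid0_failure_probability(0.10, 4):.1%}")
```

For a 10% single-disk failure chance and 4 disks this evaluates to 1 - 0.9^4 = 0.3439, roughly one chance in three.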

Mirroring (RAID 1)
Mirroring gives redundancy by copying all data to all disks:

D1 D2 D3 D4
B1 B1 B1 B1
B2 B2 B2 B2
B3 B3 B3 B3
... ... ... ...

Advantages:
- good read throughput (can read from all disks)
- great failure tolerance (can recover from the failure of N-1 of the N disks, and can continue to service requests during recovery)
- if the single-disk failure chance is 10%, the chance that all 4 disks fail is 0.01%

Disadvantages:
- expensive: can only store 1 disk's worth of data
- bad write throughput: every write goes to all disks, so it is as slow as the slowest disk
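
A toy model can make the mirroring trade-off concrete. The `MirrorArray` class below is hypothetical and keeps blocks in dictionaries rather than on real devices; it only illustrates that a write touches every working disk while a read needs just one survivor. Rebuilding a replaced disk is not modeled.

```python
class MirrorArray:
    """Toy RAID 1 array: every working disk holds a full copy of the data."""

    def __init__(self, num_disks):
        self.disks = [{} for _ in range(num_disks)]
        self.failed = set()

    def fail(self, disk_index):
        """Mark a disk as failed (a known failure, reported to the controller)."""
        self.failed.add(disk_index)

    def write(self, block, data):
        # A write must be applied to every working disk.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block] = data

    def read(self, block):
        # A read can be served by any single working disk.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                return disk[block]
        raise IOError("all mirrors have failed")


array = MirrorArray(4)
array.write(1, b"B1")
for i in (0, 1, 2):
    array.fail(i)
print(array.read(1))  # still readable with 3 of the 4 disks gone
```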

Parity (RAID 2-5)


Parity can be used instead of Hamming codes to handle single known errors (by known errors, we mean
that the controller is notified when a disk fails). The parity p of bits b0, b1, b2, ..., bn is simply the
exclusive or of all of the bits. Any one bit bi can be recovered from the remaining bits and p:

p = b0 + b1 + ... + bn

0 = p + p = b0 + b1 + ... + bn + p

bi = 0 + bi
   = b0 + b1 + ... + b(i-1) + (bi + bi) + b(i+1) + ... + bn + p
   = b0 + b1 + ... + b(i-1) + 0 + b(i+1) + ... + bn + p

here + denotes exclusive or.
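
The same identity works on whole blocks if the exclusive or is taken byte by byte. The sketch below is illustrative only: the helper `xor_blocks` and the sample data are assumptions, and each byte string stands in for one disk's block.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise exclusive or of equal-length byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


data = [b"disk", b"raid", b"xor!"]   # one block per data disk
parity = xor_blocks(data)            # stored on the parity disk

# Suppose the second disk fails (a known failure). XOR-ing the surviving
# data blocks with the parity block yields the missing block.
survivors = [data[0], data[2], parity]
recovered = xor_blocks(survivors)
assert recovered == data[1]
```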

RAID 2 uses bit-level striping with Hamming-code error correction rather than simple parity. RAID 3
uses byte-level striping with a dedicated parity disk. RAID 4 uses block-level striping with a dedicated
parity disk:

D1 D2 D3 D4
B1 B2 B3 P1-3
B4 B5 B6 P4-6
B7 B8 B9 P7-9
B10 B11 B12 P10-12
... ... ... ...

RAID 4 requires every write to access the parity disk, which makes that disk a bottleneck and can cause
more wear on it. RAID 5 stripes the parity across all of the disks:

D1 D2 D3 D4
B1 B2 B3 P1-3
B4 B5 P4-6 B6
B7 P7-9 B8 B9
P10-12 B10 B11 B12
B13 B14 B15 P13-15
B16 B17 P16-18 B18
... ... ... ...
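
The rotation of the parity block can be generated mechanically. This is a sketch under the assumption that parity starts on the last disk and moves one disk to the left on each row, as in the table; real controllers may use other rotation orders, and `raid5_layout` is a hypothetical name.

```python
def raid5_layout(num_rows, num_disks):
    """Return the rows of a RAID 5 layout as lists of labels."""
    rows = []
    next_block = 1
    for r in range(num_rows):
        # Parity starts on the last disk and shifts one disk left per row.
        parity_disk = (num_disks - 1) - (r % num_disks)
        first, last = next_block, next_block + num_disks - 2
        row = []
        for d in range(num_disks):
            if d == parity_disk:
                row.append(f"P{first}-{last}")
            else:
                row.append(f"B{next_block}")
                next_block += 1
        rows.append(row)
    return rows


for row in raid5_layout(5, 4):
    print(" ".join(row))
```

Because the parity position changes on every row, parity updates are spread over all disks instead of landing on one dedicated disk.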

Advantages of RAID 5:

- Good read throughput: (N-1) times the single-disk throughput.
- Reasonably good failure tolerance (tolerates 1 of N failures). If the single-disk failure chance is 10%, then a 4-disk RAID 5 array is lost (two or more disks fail) with a chance of about 5%.
- Good overhead: N disks can hold (N-1) disks' worth of data.
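
The failure chance of a single-parity array depends on the number of disks N, since the array is lost only when two or more disks fail. The following sketch assumes independent failures (an assumption not stated in the notes) and uses an illustrative function name.

```python
def raid5_failure_probability(p_disk, num_disks):
    """Chance that two or more of num_disks disks fail.

    Assumes independent failures, each with probability p_disk.
    """
    q = 1 - p_disk
    p_none_fail = q ** num_disks
    p_exactly_one_fails = num_disks * p_disk * q ** (num_disks - 1)
    return 1 - p_none_fail - p_exactly_one_fails


for n in (2, 3, 4):
    print(n, f"{raid5_failure_probability(0.10, n):.2%}")
```

With a 10% single-disk failure chance this gives about 1% for two disks and about 5.2% for four disks, still far below the ~34% of a 4-disk striped array without parity.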

Disadvantages of RAID 5:

- Bad write throughput: a write requires reading other parts of the stripe, recomputing the parity, and
performing two writes (the data block and the parity block).
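
To see where the extra work comes from, the sketch below compares two ways of updating the parity for a single-block write. The first recomputes parity from the whole stripe, as described above. The second is a common alternative that is not covered in these notes: read the old data and old parity, then XOR out the old data and XOR in the new. All function names are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise exclusive or of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity_by_recompute(stripe, index, new_data):
    """Read the other data blocks of the stripe and recompute the parity."""
    blocks = stripe[:index] + [new_data] + stripe[index + 1:]
    parity = blocks[0]
    for block in blocks[1:]:
        parity = xor_bytes(parity, block)
    return parity


def parity_by_read_modify_write(old_data, old_parity, new_data):
    """Cancel the old data out of the parity, then add the new data."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


stripe = [b"AAAA", b"BBBB", b"CCCC"]
old_parity = parity_by_recompute(stripe, 0, stripe[0])  # parity of the unchanged stripe
new_block = b"ZZZZ"

p1 = parity_by_recompute(stripe, 1, new_block)
p2 = parity_by_read_modify_write(stripe[1], old_parity, new_block)
assert p1 == p2
```

Either way, one logical write turns into extra reads plus two physical writes, which is why small writes are slow on RAID 5.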

Reed-Solomon encoding (RAID 6)


As disks become larger, recovery takes longer and longer, increasing the probability that a second disk
fails before the first has been rebuilt. Two simultaneous failures will completely destroy a RAID 5 array.

Reed-Solomon codes generalize parity to allow two or more parity bits to be computed. With two bits of
parity, one can correct two known failures. This allows two simultaneous failures to be handled.

RAID 6 uses two striped parity blocks.
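
As a rough illustration of how two parity values can recover two known failures, the sketch below uses one common construction: a plain XOR parity P and a second parity Q computed with arithmetic in the finite field GF(2^8). The choice of field, the reducing polynomial 0x11d, the generator 2, and all function names are assumptions; the notes do not specify them. Each integer stands for one byte from one data disk.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) with reducing polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    """Inverse of a nonzero element: a^255 == 1, so a^254 is the inverse."""
    return gf_pow(a, 254)


def pq_parity(data):
    """P is the XOR of the bytes; Q weights byte i by 2^i in the field."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q


def recover_two(data, x, y, p, q):
    """Recover the bytes at known failed positions x and y (None in data)."""
    p_rest, q_rest = pq_parity([0 if d is None else d for d in data])
    p_xy = p ^ p_rest                  # equals dx + dy
    q_xy = q ^ q_rest                  # equals 2^x * dx + 2^y * dy
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(q_xy ^ gf_mul(gy, p_xy), gf_inv(gx ^ gy))
    dy = p_xy ^ dx
    return dx, dy


data = [0x12, 0x34, 0x56, 0x78]
p, q = pq_parity(data)
damaged = [0x12, None, 0x56, None]     # disks 1 and 3 have failed
assert recover_two(damaged, 1, 3, p, q) == (0x34, 0x78)
```

The two parities give two independent equations in the two unknown bytes, which is what makes the double failure solvable.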

Nested RAID
One can also "nest" different RAID levels by plugging RAID controllers, instead of individual disks, into
another RAID controller. For example, multiple RAID 0 controllers can be plugged into a single RAID 1
controller, providing some of the benefits of striping and some of the benefits of mirroring. This
arrangement is called RAID 0+1 or RAID 01.

Similarly, multiple RAID 1 arrays can be placed in a single RAID 0 array. This is called RAID 1+0 or RAID
10.

Advantages and disadvantages of these RAID levels are left as an exercise.
