16bit0005 VL2018195004509 Pe003 PDF
By
Ashish Tiwari -16BIT0005
Submitted To
PROF. SIVA RAMA KRISHNAN S
April, 2019
ABSTRACT
With the amount of data produced worldwide increasing every day, storing that data
has become a major issue. For secure systems such as online banking and other
government services, the need for highly reliable and available online storage has
grown. While most of these systems depend on the cloud for data storage, cloud
computing comes with disadvantages such as slow data access and the risk of cloud
server downtime. Maintaining data privacy is a further concern.
RAID, or Redundant Array of Independent Disks, provides benefits that enable users to
circumvent these issues. RAID combines drives in various patterns and operates them
as one logical unit instead of operating them individually. Three metrics govern the
choice of RAID level: performance, reliability, and feasibility. Apart from RAID 0, the
other RAID levels provide redundancy, improving data reliability and performance,
especially read speeds. The primary benefits of using RAID are performance
improvement, resiliency and low cost.
RAID 0:
RAID 0 involves striping data across two or more devices. Striping breaks the data up
into chunks, which are then written in turn to each disk in the array. By using multiple
disks, this level offers superior input/output performance, which can be increased
further by using multiple controllers.
Advantages:
Excellent read and write performance, with no parity overhead.
All of the drives' capacity is usable.
Simple to implement.
Disadvantage:
Not fault-tolerant. Since the data is split up, failure of a single device brings down
the whole array.
Ideal use:
RAID 0 is best for storing data that must be written at high speed, such as in video
editing or image retouching.
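The striping described above can be sketched in a few lines of Python (a toy in-memory model; the chunk size and helper names are illustrative, not a real controller API):

```python
# Toy RAID 0 striping: deal fixed-size chunks round-robin across the "disks",
# then reassemble them in the same order. In-memory sketch only.
CHUNK = 4  # bytes per stripe unit

def stripe(data: bytes, n_disks: int):
    """Split data into CHUNK-sized units and distribute them round-robin."""
    disks = [bytearray() for _ in range(n_disks)]
    for i in range(0, len(data), CHUNK):
        disks[(i // CHUNK) % n_disks].extend(data[i:i + CHUNK])
    return disks

def unstripe(disks, total_len: int) -> bytes:
    """Reassemble the original byte stream by reading the disks round-robin."""
    out = bytearray()
    offsets = [0] * len(disks)
    d = 0
    while len(out) < total_len:
        out.extend(disks[d][offsets[d]:offsets[d] + CHUNK])
        offsets[d] += CHUNK
        d = (d + 1) % len(disks)
    return bytes(out)

data = b"ABCDEFGHIJKLMNOP"
disks = stripe(data, 2)
assert unstripe(disks, len(data)) == data
```

With two disks, consecutive chunks alternate between the drives, which is what lets reads and writes proceed in parallel.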
RAID 1:
In RAID 1, data is stored twice (mirroring): writes go to a set of data drives and to a
corresponding mirror drive. In case of a drive failure, the controller uses the surviving
copy to continue operation and recover the data.
Advantages:
Excellent read speed, and write speed close to that of a single drive.
If a drive fails, data does not need to be rebuilt; it only has to be copied to the
replacement drive.
RAID 1 is best for mission critical storage and is suitable for small servers.
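Mirroring can be illustrated with a similarly small sketch (a hypothetical class using in-memory block maps, not a real driver interface):

```python
# Toy RAID 1 mirror: every write goes to both drives, and a read is served
# from whichever drive is still healthy. Illustrative class names only.
class Raid1:
    def __init__(self):
        self.drives = [dict(), dict()]   # block number -> data
        self.failed = [False, False]

    def write(self, block: int, data: bytes):
        for d in self.drives:
            d[block] = data              # mirror the write to both drives

    def read(self, block: int) -> bytes:
        for i, d in enumerate(self.drives):
            if not self.failed[i]:
                return d[block]          # either healthy copy will do
        raise IOError("both drives failed")

r = Raid1()
r.write(0, b"payroll")
r.failed[0] = True                       # simulate losing one drive
assert r.read(0) == b"payroll"           # data survives a single failure
```

This is why rebuilding a RAID 1 array after a failure amounts to a plain copy from the surviving drive.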
RAID 2:
RAID 2 is implemented by splitting data at the bit level and spreading it over a number
of data disks and a number of redundancy disks, which store an error-correcting
(Hamming) code. Because information is striped at the level of bits rather than blocks,
single-bit corruption can be detected and corrected efficiently. As a result, this RAID
level provides a very high data transfer rate.
Advantages:
On-the-fly single-bit error correction using the Hamming code.
Very high data transfer rates are possible.
RAID 3:
RAID 3 makes use of striping at the byte level and stores dedicated parity bits on a
separate disk drive. RAID 3 requires a special controller that allows for the synchronized
spinning of all disks. The bytes are striped across the different disk drives. This
configuration is used less commonly than other RAID levels.
Advantages:
Transaction rate equal to that of a single disk drive at best (if spindles
are synchronized).
Disadvantages:
Controller design is complex.
The configuration may be excessive if small transfers are the only requirement.
Disk failures may drastically lower throughput.
Very difficult and resource-intensive to implement as software RAID.
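The dedicated parity that RAID 3 (and RAID 4 below) stores is simply the bytewise XOR of the data chunks, which is what makes single-disk recovery possible. A minimal sketch, with illustrative data:

```python
# XOR parity as used by RAID 3/4 (and RAID 5): the parity chunk is the
# bytewise XOR of the data chunks, and any single lost chunk is recovered
# by XOR-ing the surviving chunks with the parity. Toy data, not a real layout.
from functools import reduce

def xor_chunks(chunks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), chunks))

data_disks = [b"\x01\x02", b"\x10\x20", b"\xff\x0f"]
parity = xor_chunks(data_disks)

# Lose disk 1, then rebuild it from the remaining disks plus the parity:
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_chunks(survivors)
assert rebuilt == data_disks[1]
```

Because XOR is its own inverse, the same routine computes the parity and performs the rebuild.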
RAID 4:
RAID 4 uses block-level data striping and a single dedicated disk for storing parity bits. It
does not require synchronized spinning, and each disk functions independently when single
data blocks are requested. In contrast to RAID 3, which stripes at the bit/byte level,
RAID 4 stripes at the block level. Like RAID 3, it does not distribute parity. This
configuration requires at least three disks.
Data or files may be distributed among multiple, independently operating drives, which
facilitates parallel input/output (I/O) request performance. However, because the
parity bits for every block of data are stored on a single drive, that drive can become a
system bottleneck; when this occurs, system performance depends on the parity drive's
performance.
Advantages:
Each disk operates independently, so multiple read requests can be serviced in parallel.
RAID 5:
RAID 5 is similar to RAID 4, except that the parity data is striped across all of the
drives instead of being written to a dedicated parity drive. It requires a minimum of
three disks and can sustain a single drive failure without data loss.
Advantages:
Fast read transactions, and no single parity-drive bottleneck, since parity is
distributed across all drives.
RAID 6:
RAID 6 is an extension of RAID 5 that implements fault tolerance by using a second
independent distributed parity scheme (dual parity). As in RAID 5, data is striped at the
block level across a set of drives, and a second set of parity is calculated and written
across all the drives. This RAID level provides data fault tolerance and can sustain two
simultaneous drive failures. It also gives protection against multiple bad-block failures.
Advantages:
Can survive the failure of any two drives, making it suitable for mission-critical storage.
Read performance comparable to RAID 5.
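The dual parity can be sketched as the common P+Q scheme: P is plain XOR, while Q weights each disk by a power of a generator in GF(2^8). This is a toy illustration of the idea, not any particular controller's algorithm:

```python
# Toy P+Q dual parity in the style of RAID 6. P is bytewise XOR; Q multiplies
# each disk's byte by successive powers of the generator 2 in GF(2^8).
# Illustrative only: 3 data "disks" of 2 bytes each.
def gf_mul(a, b):
    """Multiply in GF(2^8) with the reducing polynomial 0x11d."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
    return p

def parities(disks):
    """Compute the P (XOR) and Q (weighted) parity byte lists column by column."""
    p, q = [], []
    for col in zip(*disks):
        pb, qb, g = 0, 0, 1
        for byte in col:
            pb ^= byte
            qb ^= gf_mul(byte, g)
            g = gf_mul(g, 2)  # next power of the generator
        p.append(pb)
        q.append(qb)
    return p, q

disks = [[1, 2], [16, 32], [255, 15]]
p, q = parities(disks)

# Two simultaneous failures: data disk 1 and the Q drive are lost.
# Rebuild disk 1 from P, then Q could be recomputed from the restored data.
rebuilt = [p[j] ^ disks[0][j] ^ disks[2][j] for j in range(2)]
assert rebuilt == disks[1]
```

Recovering two lost data disks at once requires solving the P and Q equations together; the single-rebuild case shown here only needs the XOR parity.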
RAID 7:
In RAID 7, all input/output transfers are asynchronous, independently controlled and
cached, including host-interface transfers. All reads and writes are centrally cached via
the high-speed x-bus, and the dedicated parity drive can be on any channel. A fully
implemented, process-oriented, real-time operating system is resident on the embedded
array-control microprocessor. In RAID 7, parity generation is integrated into the cache.
Advantages:
Overall write performance is 25% to 90% better than single spindle performance
Host interfaces are scalable for connectivity or increased host transfer bandwidth
Small reads in multi user environment have very high cache hit rate resulting in
near zero access times
Write performance improves with an increase in the number of drives in the array
No extra data transfers required for parity manipulation
Disadvantages:
A one-vendor proprietary solution.
Very high cost per megabyte.
The cache requires power-supply backup to protect writes that have not yet reached disk.
Nested RAID levels:
Higher-order RAID configurations, also known as nested RAID configurations, combine
multiple RAID levels within a single array. In theory, any RAID level can be combined
with any other level; however, not all of these combinations meaningfully improve
system functionality. The comparison below covers the most common nested
configurations.
RAID 0, or striping, utilizes two physical hard drives. Alternate blocks are written to
each hard drive, which in theory could double the performance of a single hard drive.
However, this type of RAID increases the risk of data loss, since practically all the data
is lost if one hard drive fails. This is also a reason why it is not so widely used in homes,
and especially not in production use in companies.

RAID 1, or mirroring, utilizes two physical hard drives. Everything is written to both
hard drives, which significantly decreases the risk of data loss. However, the usable
capacity is only half of the actual capacity, which makes this type of RAID quite
expensive. RAID 1 is commonly used in home environments.

RAID 5 utilizes three or more physical hard drives. It lowers the risk of data loss, since
a three-disk RAID 5 array can sustain a single hard-disk failure without losing any data.
For example, in RAID 5 with six disks, the capacity is the sum of five disks, and one
disk's worth of space contains the parity needed to rebuild the data after a single
hard-disk failure. This setup is quite common in server environments. (Thompson &
Thompson, 2011, p. 142.)

RAID 10 is a stacked RAID, variously written RAID 0+1, RAID 1+0 or RAID 10. It uses
four hard drives arranged as two separate RAID 1 sets, which are merged into one
RAID 0 set, so it utilizes two different RAID layers. This setup is fairly common in
server environments.
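The capacity trade-offs described above can be summarized in a small helper (an illustrative function, assuming n identical drives of equal size):

```python
# Usable-capacity fraction for the RAID levels discussed above, assuming
# n identical drives. The function name and figures are textbook values,
# sketched here for illustration.
def usable_fraction(level: str, n: int) -> float:
    if level == "RAID0":
        return 1.0              # all capacity, no redundancy
    if level == "RAID1":
        return 1.0 / n          # every drive holds the same data
    if level == "RAID5":
        return (n - 1) / n      # one drive's worth of parity
    if level == "RAID6":
        return (n - 2) / n      # two drives' worth of parity
    if level == "RAID10":
        return 0.5              # mirrored pairs, then striped
    raise ValueError(level)

for level, n in [("RAID0", 2), ("RAID1", 2), ("RAID5", 6), ("RAID10", 4)]:
    print(level, "with", n, "drives ->", usable_fraction(level, n))
```

For the six-disk RAID 5 example in the text, this gives 5/6 of the raw capacity, matching the "sum of five disks" figure above.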
OBSERVATION:
With this kind of testing there are so many variables that it is difficult to
make any solid observations, but the results are interesting.
With all caching disabled, write performance is worse, and especially the
RAID levels with parity (RAID 5 and RAID 6) show a significant drop in
performance when it comes to random writes. RAID 6 write performance became so
low and erratic that I wonder if there is something wrong with the driver or the
setup. The I/O latency in particular is off the charts with RAID 6, so something
must be wrong.
RAID 6 needs six disk I/Os for every application-level write request (three reads
and three writes), so with 450 IOPS in total, divided by 6, we are left with
single-disk performance of 75 IOPS. If we average the line, we do get approximately
this performance, but the latency is so erratic that it would not be usable.
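The arithmetic above generalizes to the usual write-penalty rule of thumb (the penalty values below are the standard textbook figures, assumed here):

```python
# Effective random-write IOPS given the raw array IOPS and the per-level
# write penalty: each application write costs this many disk I/Os.
WRITE_PENALTY = {"RAID0": 1, "RAID1": 2, "RAID5": 4, "RAID6": 6}

def effective_write_iops(raw_iops: float, level: str) -> float:
    """Divide raw disk IOPS by the level's write amplification factor."""
    return raw_iops / WRITE_PENALTY[level]

print(effective_write_iops(450, "RAID6"))  # 75.0, matching the estimate above
```

The same calculation explains why the parity levels fall furthest behind on random writes: RAID 5 and RAID 6 must read and rewrite their parity on every small write.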
Conclusion
Overall, the results seem to indicate that the testing method itself is
realistic: we get figures that are in line with theoretical results.
The erratic RAID 6 write performance would need a thorough
explanation, one that I cannot give.
Based on the test results, it seems that random I/O performance for a
single test file is not affected by the chunk size or stripe size of a RAID
array.
The results show that this benchmarking method provides a solid
basis for further testing.
Removable drives cannot be used in such an array because (Windows software)
RAID requires dynamic disks, and a removable disk cannot be converted to a
dynamic disk; once the drive is removed, all of its volumes are lost.
Reliability comparison:
1. RAID 0
In RAID 0, data is distributed across the disk drives in equally sized chunks
(striping), so the drives are effectively in series: every drive must work for
the array to work. The reliability of a six-disk RAID 0 array is therefore the
product of the individual HDD reliabilities r:
R(RAID 0) = r^6
2. RAID 1
RAID 1 uses mirroring and requires at least two HDDs to implement. In this
configuration, one HDD in the pair can fail without loss of data, so the array
fails only if both drives fail:
R(RAID 1) = 1 - (1 - r)^2
3. RAID 0+1
Here, data is striped across one disk set and then mirrored to another disk set.
RAID 0+1 requires a minimum of four HDDs to implement. The array survives as
long as at least one complete striped set survives:
R(RAID 0+1) = 1 - (1 - r^2)^2
4. RAID 1+0
In this configuration, data is striped across mirrored sets of drives. RAID 1+0
also requires a minimum of four drives to implement. The array survives as long
as each mirrored pair keeps at least one working drive:
R(RAID 1+0) = (1 - (1 - r)^2)^2
5. RAID 3
The RAID 3 controller calculates parity information and stores it on a dedicated
parity HDD. This requires a minimum of three drives to implement. A three-disk
array survives if all drives work or exactly one of them fails:
R(RAID 3) = r^3 + 3r^2(1 - r)
6. RAID 4
RAID 4 is identical to RAID 3, except that it stripes larger chunks (blocks
instead of bytes). The reliability is therefore the same:
R(RAID 4) = r^3 + 3r^2(1 - r)
7. RAID 5
RAID 5 is similar to RAID 4, except that the parity data is striped across all
HDDs instead of being written to a dedicated HDD. The reliability is again the
same:
R(RAID 5) = r^3 + 3r^2(1 - r)
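The reliability expressions above can be evaluated numerically; the sketch below assumes independent drive failures with a common per-drive reliability r:

```python
# The reliability formulas from the comparison above, as plain functions of
# the per-drive reliability r (independent-failure model; illustrative only).
def raid0(r, n=6):
    return r ** n                       # all n drives must survive

def raid1(r):
    return 1 - (1 - r) ** 2             # fails only if both mirrors fail

def raid01(r):
    return 1 - (1 - r ** 2) ** 2        # at least one striped set survives

def raid10(r):
    return (1 - (1 - r) ** 2) ** 2      # each mirrored pair keeps one drive

def raid3(r, n=3):
    # Survives zero failures or exactly one failed drive out of n.
    return r ** n + n * r ** (n - 1) * (1 - r)

r = 0.9
for name, fn in [("RAID 0", raid0), ("RAID 1", raid1), ("RAID 0+1", raid01),
                 ("RAID 1+0", raid10), ("RAID 3/4/5", raid3)]:
    print(name, round(fn(r), 4))
```

With r = 0.9, mirroring (0.99) comfortably beats striping (0.53), and RAID 1+0 (0.9801) beats RAID 0+1 (0.9639), in line with the formulas above.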