RAID (redundant array of independent disks, originally redundant
array of inexpensive disks) is a storage technology that combines multiple
disk drive components into a logical unit. Data is distributed across the
drives in one of several ways called "RAID levels", depending on what
level of redundancy and performance (via parallel communication) is
required.
RAID
is now used as an umbrella term for computer data storage schemes that can
divide and replicate data among multiple physical drives. The physical drives
are said to be in a RAID array,[5] which is accessed by the operating system as
one single drive. The different schemes or architectures are named by the word
RAID followed by a number (e.g., RAID 0, RAID 1). Each scheme provides a
different balance between three key goals: resiliency, performance, and capacity.
Following is a brief
textual summary of the most commonly used RAID levels.
RAID
0 (block-level striping without parity or mirroring) has no (or zero)
redundancy. It provides improved performance and additional storage but no
fault tolerance. Hence simple stripe sets are normally referred to as RAID 0.
Any drive failure destroys the array, and the likelihood of failure increases
with more drives in the array (at a minimum, catastrophic data loss is almost
twice as likely compared to single drives without RAID). A single drive failure
destroys the entire array because when data is written to a RAID 0 volume, the
data is broken into fragments called blocks, whose size is determined by the
stripe size, a configuration parameter of the array. The blocks are written to
their respective drives simultaneously, at the same offset on each drive. This
allows smaller sections of the entire chunk of data to be read off each drive
in parallel, increasing bandwidth. RAID 0 does not implement error checking, so
any error is uncorrectable. More drives in the array mean higher bandwidth but
greater risk of data loss.
Pros: Better performance – data is striped across drives; no storage overhead,
as the drives are utilized 100%
Cons: Possibility of losing the entire data set on failure of a single disk
Minimum Disks Required – 2
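The striping described above can be sketched in a few lines of Python. This is an illustrative toy, not a real block driver; the stripe size and drive count are arbitrary choices for the example:

```python
# Minimal sketch of RAID 0 block striping (illustrative only).
STRIPE_SIZE = 4   # bytes per block ("stripe unit"); a real array uses KB-sized units
NUM_DRIVES = 3

def stripe_write(data: bytes, drives: list[bytearray]) -> None:
    """Split data into STRIPE_SIZE blocks and distribute them round-robin."""
    blocks = [data[i:i + STRIPE_SIZE] for i in range(0, len(data), STRIPE_SIZE)]
    for n, block in enumerate(blocks):
        drives[n % NUM_DRIVES].extend(block)

def stripe_read(drives: list[bytearray], length: int) -> bytes:
    """Reassemble the original byte stream from the striped drives."""
    out = bytearray()
    offsets = [0] * NUM_DRIVES
    n = 0
    while len(out) < length:
        d = n % NUM_DRIVES
        out.extend(drives[d][offsets[d]:offsets[d] + STRIPE_SIZE])
        offsets[d] += STRIPE_SIZE
        n += 1
    return bytes(out[:length])

drives = [bytearray() for _ in range(NUM_DRIVES)]
payload = b"ABCDEFGHIJKLMNOPQRSTUVWX"   # 24 bytes -> 6 blocks over 3 drives
stripe_write(payload, drives)
assert stripe_read(drives, len(payload)) == payload
# Each drive holds only every third block, so losing any one drive
# makes the stream unrecoverable: there is no redundancy.
```

Because each drive only ever sees every third block, reads of a large payload can proceed from all three drives in parallel, which is where RAID 0's bandwidth advantage comes from.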
In RAID 1 (mirroring without parity
or striping), data is written identically to two drives, thereby producing a
"mirrored set"; at least two drives are required to constitute such
an array. While more constituent drives may be employed, many implementations
deal with a maximum of only two; it may, however, be possible to use such a
limited level 1 RAID itself as a constituent of another level 1 RAID, effectively
masking the limitation. The array continues to operate as long
as at least one drive is functioning. With appropriate operating system
support, there can be increased read performance, and only a minimal write
performance reduction; implementing RAID 1 with a separate controller for each
drive in order to perform simultaneous reads (and writes) is sometimes called
multiplexing (or duplexing when there are only two drives). WARNING: RAID 1 is
not necessarily safe. In PC systems, most hard drives are shipped by
manufacturers with write caching turned on; this gives an illusion of higher
performance at the risk of data not actually being written. A failure (e.g.
loss of power) can leave the two disks in an inconsistent state and even make
them unrecoverable.
Pros – Guards against disk failure, as data is replicated across disk drives
Cons – Replication creates storage overhead, as the same data is copied across drives
Minimum Disks Required – 2
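A minimal mirroring sketch makes the trade-off concrete: every write goes to all drives, but any one surviving drive can serve reads. The class and method names here are hypothetical, chosen only for the example:

```python
# Minimal sketch of RAID 1 mirroring (illustrative only; names are hypothetical).
class Raid1:
    """Write identically to every drive; read from any surviving drive."""

    def __init__(self, num_drives: int = 2):
        self.drives: list[bytearray | None] = [bytearray() for _ in range(num_drives)]

    def write(self, data: bytes) -> None:
        # Every write must be committed to all drives -> write penalty.
        for drive in self.drives:
            if drive is not None:
                drive.extend(data)

    def fail(self, index: int) -> None:
        self.drives[index] = None   # simulate a drive failure

    def read(self) -> bytes:
        # Any one functioning drive holds the complete data set.
        for drive in self.drives:
            if drive is not None:
                return bytes(drive)
        raise IOError("all mirrors failed")

mirror = Raid1()
mirror.write(b"critical data")
mirror.fail(0)                      # array keeps operating
assert mirror.read() == b"critical data"
```

A real implementation could also dispatch reads to whichever mirror is least busy, which is the source of RAID 1's improved read performance.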
In RAID 2 (bit-level striping with
dedicated Hamming-code parity), all disk spindle rotation is synchronized, and
data is striped such that each sequential bit is on a different drive.
Hamming-code parity is calculated across corresponding bits and stored on at
least one parity drive.
In RAID 3 (byte-level striping with
dedicated parity), all disk spindle rotation is synchronized, and data is
striped so each sequential byte is on a different drive. Parity is calculated
across corresponding bytes and stored on a dedicated parity drive.
RAID 4 (block-level striping with
dedicated parity) is identical to RAID 5, but confines all parity
data to a single drive. In this setup, files may be distributed between
multiple drives. Each drive operates independently, allowing I/O requests to be
performed in parallel. However, the use of a dedicated parity drive could
create a performance bottleneck; because the parity data must be written to a
single, dedicated parity drive for each block of non-parity data, the overall
write performance may depend a great deal on the performance of this parity
drive.
Pros – Reduced storage overhead: only a single disk is needed to store parity.
For example, with 3 disks the parity can be stored on the third, so the storage
overhead is only 33%.
Cons – Write performance still suffers, because every write must also update the dedicated parity drive
Minimum Disks Required – 3 (2 would not make sense, and the more disks you have, the lower your storage overhead)
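The parity in RAID 3/4/5 is a plain XOR across the corresponding blocks of the data drives, which is why a single lost block can be rebuilt by XOR-ing the survivors with the parity. A toy sketch, with hypothetical block contents:

```python
# Sketch of RAID 4-style dedicated parity using XOR (illustrative only).
# Three data drives plus one parity drive; block contents are hypothetical.

def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

data = [b"AAAA", b"BBBB", b"CCCC"]      # one block per data drive
parity = xor_blocks(data)               # stored on the dedicated parity drive

# Simulate losing data drive 1 and rebuilding it from the survivors + parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == b"BBBB"
```

XOR works here because each term cancels itself: XOR-ing the parity with every surviving block leaves exactly the missing block.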
RAID 5 (block-level striping with
distributed parity) distributes parity along with the data and requires all
drives but one to be present to operate; the array is not destroyed by a single
drive failure. Upon drive failure, any subsequent reads can be calculated from
the distributed parity such that the drive failure is masked from the end user.
However, a single drive failure results in reduced performance of the entire
array until the failed drive has been replaced and the associated data rebuilt.
Additionally, there is the potentially disastrous RAID 5 write hole: if power
is lost after a data block has been written but before its parity has been
updated, the stripe is left inconsistent, and a later rebuild from that parity
can silently produce corrupt data. RAID 5 requires at least three disks.
Pros – Good Performance, Good failure protection
Cons – Not as good when your requirement is only performance or only failure protection (parity doesn’t come for free).
Minimum Disks Required – 3
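The "distributed" part of RAID 5 can be sketched with a toy layout function. The exact rotation varies by implementation; this left-symmetric-style placement is just one common convention, used here purely for illustration:

```python
# Sketch of RAID 5's rotating ("distributed") parity placement (illustrative).
# For stripe s on an n-drive array, this convention puts parity on drive
# n - 1 - (s % n); data blocks fill the remaining drives.

def parity_drive(stripe: int, n: int) -> int:
    return n - 1 - stripe % n

n = 4
layout = []
for stripe in range(4):
    p = parity_drive(stripe, n)
    row = ["D"] * n         # "D" = data block
    row[p] = "P"            # "P" = parity block for this stripe
    layout.append(row)

for row in layout:
    print(" ".join(row))
# Parity rotates one drive per stripe, so every drive holds some parity
# and no single drive becomes a write bottleneck (unlike RAID 4).
```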
RAID 6 (block-level striping with
double distributed parity) provides fault tolerance of two drive failures; the
array continues to operate with up to two failed drives. This makes larger RAID
groups more practical, especially for high-availability systems. This becomes
increasingly important as large-capacity drives lengthen the time needed to
recover from the failure of a single drive. Single-parity RAID levels are as
vulnerable to data loss as a RAID 0 array until the failed drive is replaced
and its data rebuilt; the larger the drive, the longer the rebuild takes.
Double parity gives additional time to rebuild the array without the data being
at risk if a single additional drive fails before the rebuild is complete. Like
RAID 5, a single drive failure results in reduced performance of the entire
array until the failed drive has been replaced and the associated data rebuilt.
RAID Comparison

| RAID level | Min disks | Available storage capacity (%) | Read performance | Write performance | Write penalty | Protection |
|---|---|---|---|---|---|---|
| 1 | 2 | 50 | Better than single disk | Slower than single disk, because every write must be committed to all disks | Moderate | Mirror |
| 1+0 | 4 | 50 | Good | Good | Moderate | Mirror |
| 3 | 3 | [(n-1)/n]*100 | Fair for random reads, good for sequential reads | Poor to fair for small random writes; fair for large, sequential writes | High | Parity (supports single disk failure) |
| 5 | 3 | [(n-1)/n]*100 | Good for random and sequential reads | Fair for random and sequential writes | High | Parity (supports single disk failure) |
| 6 | 4 | [(n-2)/n]*100 | Good for random and sequential reads | Poor to fair for random and sequential writes | Very high | Parity (supports two disk failures) |

where n = number of disks
RAID 0+1: striped sets in a mirrored set (minimum four
drives; even number of drives) provides fault tolerance and improved
performance but increases complexity.
The key difference from RAID 1+0 is that RAID 0+1 creates a
second striped set to mirror a primary striped set. The array continues to
operate with one or more drives failed in the same mirror set, but if drives
fail on both sides of the mirror the data on the RAID system is lost.
RAID 1+0: (a.k.a.
RAID 10) mirrored sets in a striped set (minimum four drives; even number of
drives) provides fault tolerance and improved performance but increases
complexity.
The key difference from RAID 0+1 is that RAID 1+0 creates a
striped set from a series of mirrored drives. The array can sustain multiple
drive losses so long as no mirror loses all its drives.
Pros: Best combination of performance and protection against drive failures
Cons: Costly in terms of storage overhead (only 50% of raw capacity is usable)
Minimum Disks Required – 4 (and for > 4 you must have an even number of disks)
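The RAID 1+0 failure rule above ("no mirror loses all its drives") is easy to check in code. A sketch with four hypothetical drives grouped into two mirror pairs:

```python
# Sketch of RAID 1+0 fault tolerance: mirror pairs first, then stripe across
# the pairs (illustrative only; drive numbering is hypothetical).

pairs = [(0, 1), (2, 3)]            # drives 0/1 mirror each other, as do 2/3

def array_survives(failed: set[int]) -> bool:
    """The array survives as long as no mirror pair loses both of its drives."""
    return all(not (a in failed and b in failed) for a, b in pairs)

assert array_survives({0})          # single failure: fine
assert array_survives({0, 2})       # one drive from each pair: still fine
assert not array_survives({0, 1})   # both halves of one mirror: data lost
```

In RAID 0+1 the grouping is inverted (two striped sets, mirrored), so losing one drive from each side of the mirror destroys both stripes; that is why 1+0 generally tolerates more multi-drive failure combinations.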
RAID 5+3: mirrored
striped set with distributed parity (some manufacturers label this as RAID 53).
Whether an array runs as RAID 0+1 or RAID 1+0 in practice is
often determined by the evolution of the storage system. A RAID controller
might support upgrading a RAID 1 array to a RAID 1+0 array on the fly, but
require a lengthy off-line rebuild to upgrade from RAID 1 to RAID 0+1. With
nested arrays, sometimes the path of least disruption prevails over achieving
the preferred configuration.
• In RAID 5,
every write (update) to a disk manifests as four I/O operations (2 disk reads
and 2 disk writes)
• In RAID 6,
every write (update) to a disk manifests as six I/O operations (3 disk reads
and 3 disk writes)
• In RAID 1,
every write manifests as two I/O operations (2 disk writes)
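The four I/Os of a RAID 5 small write come from its read-modify-write parity update: read the old data and old parity, compute new parity as old parity XOR old data XOR new data, then write the new data and new parity. A sketch with hypothetical block contents:

```python
# Why a RAID 5 small write costs 4 I/Os: read old data + old parity (2 reads),
# compute the new parity, write new data + new parity (2 writes). Illustrative.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

old_data, other_data = b"AAAA", b"CCCC"    # blocks in the same stripe
old_parity = xor(old_data, other_data)     # parity covers the whole stripe

new_data = b"ZZZZ"
# Read-modify-write: only the changed block and the parity are touched;
# the rest of the stripe (other_data) never needs to be read.
new_parity = xor(xor(old_parity, old_data), new_data)

# The shortcut yields the same parity as recomputing over the full stripe.
assert new_parity == xor(new_data, other_data)
```

The shortcut matters on wide arrays: without it, updating one block would mean reading every other block in the stripe to recompute parity.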
Source : http://en.wikipedia.org/wiki/RAID