RAID is an acronym for Redundant Array of Inexpensive (or Independent) Disks. A RAID array
is a collection of drives that act as a single storage system. At most RAID levels, the array can
tolerate the failure of a drive without losing data, and the drives can operate independently of
each other.
RAID can be implemented via software (by the operating system) or via hardware (by a RAID
controller card). Software RAID is less expensive but performs poorly on busy systems.
Hardware RAID is more expensive but performs very well. Software RAID is not
recommended.
The following are the different levels of RAID and their definitions:
RAID Level 0
RAID level 0 is not redundant and hence does not truly fit the "RAID" acronym. In level 0,
data is split across all of the drives, resulting in higher data throughput. Since there is no
redundancy, performance is good but the failure of one drive results in the loss of all data.
This is commonly known as striping.
At least two drives are required for striping. Because striping spreads the data across all
drives, access to the files is likewise spread across all drives, so each drive is less busy
and can handle requests faster. For the same total capacity, many smaller drives generally
perform better than fewer larger drives.
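The striping layout described above can be sketched as a mapping from a logical block number
to a drive and an offset on that drive. This is only an illustration in Python, assuming
round-robin placement with a stripe unit of one block; the function name locate_block is
hypothetical and does not come from any RAID product.

```python
def locate_block(logical_block: int, num_drives: int) -> tuple[int, int]:
    """Map a logical block to (drive index, block offset on that drive).

    Assumption: round-robin striping with a stripe unit of one block.
    """
    if num_drives < 2:
        raise ValueError("striping requires at least two drives")
    return logical_block % num_drives, logical_block // num_drives


# Eight consecutive logical blocks on a four-drive stripe set.
layout = [locate_block(block, 4) for block in range(8)]
print(layout)
```

Consecutive blocks land on different drives, which is why a large sequential transfer or many
concurrent requests keep all of the drives busy at once.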
RAID Level 1
RAID level 1 is commonly referred to as mirroring. It provides redundancy by duplicating
all data from one drive onto another "mirror image" drive. Disk drives need to be paired, so
at least two drives are required to implement mirroring. All data written to the RAID array
must be written to both drives of the pair. The RAID controller card performs the writes to
the pair of drives simultaneously, so write performance is not significantly affected by
mirroring. Mirroring can improve read performance, depending on the controller card chosen.
Some hardware allows applications to read two different blocks of data from the same drive
pair at once: one drive of the pair reads one block while the other drive of the pair reads
the other block.
RAID level 1 arrays can continue to operate after the failure of any one drive per mirrored
pair. For example, an array of four drives can survive two drive failures as long as they are
not in the same mirrored pair.
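The four-drive example can be checked with a small sketch. It assumes, purely for
illustration, that drives 0 and 1 form the first mirrored pair, drives 2 and 3 the second,
and so on; the function name array_survives is hypothetical.

```python
def array_survives(failed_drives: set[int], num_drives: int) -> bool:
    """Return True if every mirrored pair still has at least one working drive.

    Assumption: drives (0, 1) form the first pair, (2, 3) the second, and so on.
    """
    if num_drives < 2 or num_drives % 2 != 0:
        raise ValueError("mirroring needs an even number of drives, at least two")
    for first in range(0, num_drives, 2):
        if first in failed_drives and first + 1 in failed_drives:
            return False  # both copies of this pair's data are gone
    return True


print(array_survives({0, 2}, 4))  # failures in different pairs
print(array_survives({0, 1}, 4))  # both drives of one pair failed
```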
RAID Level 0-1
RAID level 0-1 is commonly referred to as RAID 10 or "striping and mirroring": data is
striped across mirrored pairs of drives. Although it is the most expensive option, RAID 10
performs best for both reads and writes. Again, many smaller drives in a RAID 10
configuration will perform better than fewer larger drives.
RAID Level 4
RAID level 4 requires at least three disk drives and reserves an entire drive for parity.
This parity drive stores the result of a byte-by-byte calculation (XOR) performed on the
other drives in the array. The calculation is such that if any one of the drives in the
array fails, its data can be recreated (recalculated) from the other drives in the array.
Writing a block of data to a drive requires having the original block of data and the
corresponding block from the parity drive in memory (or reading them), calculating the new
parity block, and then writing both blocks to the corresponding drives. As a result, writes
perform poorly because each one can involve reading two drives and always writes to two
drives. Extra time is also required between the read and the write to calculate the parity.
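The parity calculation, the rebuild of a failed drive, and the parity update on a write can
all be shown with plain XOR. The sketch below is a minimal illustration in Python with
made-up three-byte blocks; the helper name xor_blocks and the sample values are assumptions,
not part of any controller's interface.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-by-byte XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("blocks must all be the same size")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Two data drives plus one parity drive (the three-drive minimum).
data0 = b"\x0f\xa0\x33"
data1 = b"\xf0\x0a\x55"
parity = xor_blocks(data0, data1)

# Drive 1 fails: rebuild its block from the surviving data drive and the parity.
rebuilt = xor_blocks(data0, parity)
assert rebuilt == data1

# Write to drive 0: new parity = old parity XOR old data XOR new data.
# Only the old data block and the old parity block have to be read.
new_data0 = b"\x11\x22\x33"
new_parity = xor_blocks(parity, data0, new_data0)
assert new_parity == xor_blocks(new_data0, data1)
```

The last step is the read-modify-write cycle described above: two blocks read (old data, old
parity) and two blocks written (new data, new parity).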
RAID Level 5
RAID level 5 consists of striping with distributed parity. RAID 5 improves on RAID 4 by
spreading the parity information across the drives, much like striping. In a three-drive
array, one third of the parity information is on the first drive, the next third is on the
second drive and the last third is on the third drive. In a three-drive array, writing one
block of data results in two of the three drives being used: the drive holding the data and
the drive holding the corresponding parity. This results in roughly a 33% increase in disk
activity. Similarly, in a four-drive array, the write performance impact would be roughly
25%, and so on. RAID 5 performs better when many drives are in the array, because this
spreads the "write" impact across more drives.
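The way parity is spread across the drives can be sketched as follows. The rotation shown
(stripe 0 keeps its parity on the first drive, stripe 1 on the second, and so on) is an
assumption chosen to match the description above; real controllers use a variety of rotation
schemes, and the function names are hypothetical.

```python
def parity_drive(stripe: int, num_drives: int) -> int:
    """Drive holding the parity block for a given stripe.

    Assumption: simple rotation, one parity block per stripe.
    """
    if num_drives < 3:
        raise ValueError("RAID 5 requires at least three drives")
    return stripe % num_drives


def stripe_layout(stripe: int, num_drives: int) -> list[str]:
    """Label each drive's block in a stripe as 'P' (parity) or 'D' (data)."""
    parity = parity_drive(stripe, num_drives)
    return ["P" if drive == parity else "D" for drive in range(num_drives)]


# Three consecutive stripes on a three-drive array.
for stripe in range(3):
    print(stripe, stripe_layout(stripe, 3))
```

Because the parity block moves from stripe to stripe, no single drive has to absorb every
parity write, unlike the dedicated parity drive of RAID 4.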
A discussion of RAID 10 versus RAID 5
RAID 5 performs well when reading data from the array and offers good redundancy but
performs poorly when writing data. RAID 5 may require two reads, a parity calculation, and
two writes for each application write request. While these reads and writes are in progress,
the drives are not available to other applications, so the impact can be quite severe for
applications that write a lot of data. This is the case when Andar creates large
mailing lists or executes data mining operations. These functions tend to write a lot of
data and severely impact the performance of other online users of Andar.
RAID 10 performs better than RAID 5 overall but it is more expensive. When writing
data, only the two drives of one mirrored pair are involved. No extra reads or
calculations are required. The mirrored pair of drives behaves like a single drive, so the
fact that both drives of the pair need to be written to does not affect the other drives'
performance, nor does write performance differ significantly from that of an un-RAIDed disk
drive.
When writing, RAID 10 always outperforms RAID 5.
When reading, RAID 10 performs at least as well as RAID 5. If the
controller card supports concurrent reads from the mirrored pairs, RAID 10 also outperforms
RAID 5 while reading.
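The write comparison can be put into numbers with a rough sketch. It counts worst-case disk
operations per application write, using the figures given in this discussion (two reads and
two writes for RAID 5, two writes for RAID 10); it ignores caching and parity calculation
time, and the function names are hypothetical.

```python
def small_write_ios(level: str) -> dict[str, int]:
    """Worst-case disk operations for one application write request."""
    if level == "raid5":
        # Read old data and old parity, then write new data and new parity.
        return {"reads": 2, "writes": 2}
    if level == "raid10":
        # Write the block to both drives of one mirrored pair.
        return {"reads": 0, "writes": 2}
    raise ValueError(f"unsupported level: {level}")


def total_ios(level: str, requests: int) -> int:
    """Total disk operations generated by a number of write requests."""
    per_write = small_write_ios(level)
    return requests * (per_write["reads"] + per_write["writes"])


# Disk operations generated by 1000 application writes.
print(total_ios("raid5", 1000), total_ios("raid10", 1000))
```

Under these assumptions a write-heavy job generates up to twice as many disk operations on
RAID 5 as on RAID 10.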