The phrase "RAID" is an umbrella term for computer data storage schemes that can divide and replicate data among multiple hard disk drives. RAID's various designs all involve two key design goals: increased data reliability and increased input/output performance. When several physical disks are set up to use RAID technology, they are said to be in a RAID array. This array distributes data across several disks, but the array is seen by the computer user and operating system as one single disk. RAID can be set up to serve several different purposes, the most common of which are outlined below. A RAID distributes data across several physical disks which look to the operating system and the user like a single disk. Several different arrangements are possible. We assume here that all the disks are of the same capacity, as is usual. Some arrays are "redundant" in a way that writes extra data derived from the original data across the array organized so that the failure of one (sometimes more) disks in the array will not result in loss of data; the bad disk is replaced by a new one, and the data on it reconstructed from the remaining data and the extra data. A redundant array obviously allows less data to be stored; a 2-disk RAID 1 array loses half of its capacity, and a RAID 5 array with several disks loses the capacity of one disk. Other RAIDs are arranged in a way that makes them faster to write to and read from than a single disk. There are various combinations of these approaches giving different trade offs of protection against data loss, capacity, and speed. RAID levels 0, 1, and 5 are the most commonly found, and cover most requirements. RAID principles RAID combines two or more physical hard disks into a single logical unit by using either special hardware or software. Hardware solutions often are designed to present themselves to the attached system as a single hard drive, and the operating system is unaware of the technical workings. Software solutions are typically implemented in the operating system, and again would present the RAID drive as a single drive to applications. There are three key concepts in RAID: mirroring, the copying of data to more than one disk; striping, the splitting of data across more than one disk; and error correction, where redundant data is stored to allow problems to be detected and possibly fixed (known as fault tolerance). Different RAID levels use one or more of these techniques, depending on the system requirements. The main aims of using RAID are to improve reliability, important for protecting information that is critical to a business, for example a database of customer orders; or where speed is important, for example a system that delivers video on demand TV programs to many viewers. The configuration affects reliability and performance in different ways. The problem with using more disks is that it is more likely that one will go wrong, but by using error checking the total system can be made more reliable by being able to survive and repair the failure. Basic mirroring can speed up reading data as a system can read different data from both the disks, but it may be slow for writing if the configuration requires that both disks must confirm that the data is correctly written. Striping is often used for performance, where it allows sequences of data to be read from multiple disks at the same time. Error checking typically will slow the system down as data needs to be read from several places and compared. The design of RAID systems is therefore a compromise and understanding the requirements of a system is important. Modern disk arrays typically provide the facility to select the appropriate RAID configuration. Standard levels | Level | Description | Minimum # of disks | Image | | RAID 0 | Striped set without parity/[Non-Redundant Array]. Provides improved performance and additional storage but no fault tolerance. Any disk failure destroys the array, which becomes more likely with more disks in the array. A single disk failure destroys the entire array because when data is written to a RAID 0 drive, the data is broken into fragments. The number of fragments is dictated by the number of disks in the drive. The fragments are written to their respective disks simultaneously on the same sector. This allows smaller sections of the entire chunk of data to be read off the drive in parallel, giving this type of arrangement huge bandwidth. RAID 0 does not implement error checking so any error is unrecoverable. More disks in the array means higher bandwidth, but greater risk of data loss. SNIA definition. | 2 |  | | RAID 1 | Mirrored set without parity. Provides fault tolerance from disk errors and failure of all but one of the drives. Increased read performance occurs when using a multi-threaded operating system that supports split seeks, very small performance reduction when writing. Array continues to operate so long as at least one drive is functioning. SNIA definition. Using RAID 1 with a separate controller for each disk is sometimes called duplexing. | 2 |  | | RAID 2 | Redundancy through Hamming code. Disks are synchronised and striped in very small stripes, often in single bytes/words. Hamming codes error correction is calculated across corresponding bits on disks, and is stored on multiple parity disks. | 3 | | | RAID 3 | Striped set with dedicated parity/Bit interleaved parity. This mechanism provides an improved performance and fault tolerance similar to RAID 5, but with a dedicated parity disk rather than rotated parity stripes. The single parity disk is a bottle-neck for writing since every write requires updating the parity data. One minor benefit is the dedicated parity disk allows the parity drive to fail and operation will continue without parity or performance penalty. | 3 |  | | RAID 4 | Block level parity. Identical to RAID 3 but does block-level striping instead of byte-level striping. In this size of each stripe is large i.e. Block. Now a single file can be stored in a block, In this each disk operates independently and many different I/O request can be satisfied in parallel. But data transfer speed for each I/O request will be less. The error detection is done by block level parity and is stored in separate single disk unit. | 3 |  | | RAID 5 | Striped set with distributed parity. Distributed parity requires all drives but one to be present to operate; drive failure requires replacement, but the array is not destroyed by a single drive failure. Upon drive failure, any subsequent reads can be calculated from the distributed parity such that the drive failure is masked from the end user. The array will have data loss in the event of a second drive failure and is vulnerable until the data that was on the failed drive is rebuilt onto a replacement drive. | 3 |  | | RAID 6 | Striped set with dual distributed parity. Provides fault tolerance from two drive failures; array continues to operate with up to two failed drives. This makes larger RAID groups more practical, especially for high availability systems. This becomes increasingly important because large-capacity drives lengthen the time needed to recover from the failure of a single drive. Single parity RAID levels are vulnerable to data loss until the failed drive is rebuilt: the larger the drive, the longer the rebuild will take. Dual parity gives time to rebuild the array without the data being at risk if one drive, but no more, fails before the rebuild is complete. | 4 |  | Nested levels Many storage controllers allow RAID levels to be nested: the elements of a RAID may be either individual disks or RAIDs themselves. Nesting more than two deep is unusual. As there is no basic RAID level numbered larger than 9, nested RAIDs are usually unambiguously described by concatenating the numbers indicating the RAID levels, sometimes with a "+" in between. For example, RAID 10 (or RAID 1+0) consists of several level 1 arrays of physical drives, each of which is one of the "drives" of a level 0 array striped over the level 1 arrays. RAID 0+1 is not called RAID 01, to avoid confusion with RAID 1. When the top array is a RAID 0 (such as in RAID 10 and RAID 50) most vendors omit the "+", though RAID 5+0 is clearer. - RAID 0+1: striped sets in a mirrored set (minimum four disks; even number of disks) provides fault tolerance and improved performance but increases complexity. The key difference from RAID 1+0 is that RAID 0+1 creates a second striped set to mirror a primary striped set. The array continues to operate with one or more drives failed in the same mirror set, but if drives fail on both sides of the mirror the data on the RAID system is lost.
- RAID 1+0: mirrored sets in a striped set (minimum four disks; even number of disks) provides fault tolerance and improved performance but increases complexity. The key difference from RAID 0+1 is that RAID 1+0 creates a striped set from a series of mirrored drives. In a failed disk situation RAID 1+0 performs better because all the remaining disks continue to be used. The array can sustain multiple drive losses so long as no mirror loses both its drives.
- RAID 5+0: stripe across distributed parity RAID systems.
- RAID 5+1: mirror striped set with distributed parity (some manufacturers label this as RAID 53).
From Wikipedia, the free encyclopedia (click for full article)
|