ATA RAID
RAID is an acronym for redundant array of independent (or inexpensive) disks and was designed to improve the fault tolerance and performance of computer storage systems.
RAID was first developed at the University of California at Berkeley in 1987, and was designed so that a group of smaller, less expensive drives could be interconnected with special hardware and software to make them appear as a single larger drive to the system.
By using multiple drives to act as one drive, increases in fault tolerance and performance could be realized. Initially, RAID was conceived to simply enable all the individual drives in the array to work together as a single, larger drive with the combined storage space of all the individual drives added up.
However, this actually reduced reliability and didn't do much for performance, either. For example, if you had four drives connected in an array acting as one drive, you would be roughly four times as likely to experience a drive failure as you would with a single larger drive.
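The fourfold figure can be checked with a little probability. The sketch below assumes every drive fails independently with the same probability over some period, which is an idealization; the function name and the 3% figure are illustrative, not taken from any drive specification.

```python
def array_failure_probability(p_drive: float, n_drives: int) -> float:
    """Probability that at least one of n independent drives fails.

    With no redundancy, a single drive failure takes down the whole array.
    """
    return 1.0 - (1.0 - p_drive) ** n_drives


p = 0.03  # hypothetical per-drive failure probability over some period
single = array_failure_probability(p, 1)
four = array_failure_probability(p, 4)
print(f"single drive: {single:.4f}")
print(f"four drives:  {four:.4f} ({four / single:.2f} times as likely)")
```

For small per-drive probabilities the ratio comes out slightly under four, which is why "four times as likely" is a fair rule of thumb.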
To improve reliability and performance, the Berkeley researchers proposed five levels of RAID (numbered 1 through 5), each corresponding to a different method. These levels provide varying emphasis on either fault tolerance (reliability), storage capacity, performance, or a combination of the three.
An organization called the RAID Advisory Board (RAB) was formed in July 1992 to standardize, classify, and educate on the subject of RAID. The RAB has developed specifications for RAID, a conformance program for the various RAID levels, and a classification program for RAID hardware.
Currently, seven standard RAID levels are defined by the RAID Advisory Board, called RAID 0–6. RAID typically is implemented by a RAID controller board, although software-only implementations are possible (but not recommended). The levels are as follows:
- RAID Level 0: Striping. File data is written across multiple drives in the array in parallel, and the drives act as a single larger drive. Offers high read/write performance but very low reliability. Requires a minimum of two drives to implement.
- RAID Level 1: Mirroring. Data written to one drive is duplicated on another, providing excellent fault tolerance (if one drive fails, the other is used and no data is lost), but no real increase in performance as compared to a single drive. Requires a minimum of two drives to implement (with the usable capacity of one drive).
- RAID Level 2: Bit-level ECC. Data is split one bit at a time across multiple drives, and error correction codes (ECCs) are written to other drives. Intended for storage devices that do not incorporate ECC internally (all SCSI and ATA drives have internal ECC). Provides high data rates with good fault tolerance, but large numbers of drives are required, and I am not aware of any commercial RAID 2 controllers, or of any drives without internal ECC, on the market.
- RAID Level 3: Striped with parity. Combines RAID Level 0 striping with an additional drive used for parity information. This RAID level is really an adaptation of RAID Level 0 that sacrifices some capacity for the same number of drives. However, it also achieves a high level of data integrity or fault tolerance because data usually can be rebuilt if one drive fails. Requires a minimum of three drives to implement (two or more for data and one for parity).
- RAID Level 4: Blocked data with parity. Similar to RAID 3 except data is written in larger blocks to the independent drives, offering faster read performance with larger files. Requires a minimum of three drives to implement (two or more for data and one for parity).
- RAID Level 5: Blocked data with distributed parity. Similar to RAID 4 but offers improved performance by distributing the parity stripes across all the drives in the array instead of keeping them on one dedicated drive. Requires a minimum of three drives to implement (the equivalent of two or more for data and one for parity).
- RAID Level 6: Blocked data with double distributed parity. Similar to RAID 5 except parity information is written twice using two different parity schemes to provide even better fault tolerance in case of multiple drive failures. Requires a minimum of four drives to implement (the equivalent of two or more for data and two for parity).
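The parity used by Levels 3, 4, and 5 is commonly computed as a byte-by-byte XOR of the data blocks, which is what lets a single lost block be rebuilt from the survivors. The following is a minimal sketch of that idea only; the block contents and the helper name `xor_blocks` are made up for illustration, and real controllers work on whole sectors or stripes.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives each hold one block; a fourth block holds the parity.
data_blocks = [b"RAID", b"disk", b"data"]
parity = xor_blocks(data_blocks)

# Simulate losing the second data drive, then rebuild its block
# from the surviving data blocks plus the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])
```

The same XOR that produced the parity block recovers the missing one, which is why a single parity drive (or one drive's worth of distributed parity) can cover the loss of any one drive but not two.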
Additional RAID levels exist that are not supported by the RAID Advisory Board but which are instead custom implementations by specific companies. Note that a higher number doesn't necessarily mean increased performance or fault tolerance; the numbered order of the RAID levels was entirely arbitrary.
At one time virtually all RAID controllers were SCSI based, meaning they used SCSI drives. For a professional setup, SCSI RAID is definitely the best choice because it combines the advantages of RAID with the advantages of SCSI—an interface that already was designed to support multiple drives.
Now, however, ATA RAID controllers are available that allow for even less expensive RAID implementations. These ATA RAID controllers typically are used in single-user systems for performance rather than reliability increases.
Most ATA RAID implementations are much simpler than the professional SCSI RAID adapters used on network file servers. ATA RAID is designed more for the individual who is seeking performance or simple drive mirroring for redundancy.
When set up for performance, ATA RAID adapters run RAID Level 0, which incorporates data striping. Unfortunately, RAID 0 also sacrifices reliability such that if one drive fails, all data is lost. With RAID 0, performance scales up with the number of drives you add in the array.
If you use four drives, you won't necessarily have four times the performance of a single drive, but it can be close to that for sustained transfers. Some overhead is still involved in the controller performing the striping and issues still exist with latency—that is, how long it takes to find the data—but performance will be higher than any single drive can normally achieve.
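To make the striping idea concrete, here is a small sketch of how a logical block number might be mapped to a drive and a position on that drive. The round-robin layout and the `stripe_blocks` parameter are assumptions for illustration; actual controllers differ in stripe size and layout.

```python
def locate_block(logical_block: int, n_drives: int, stripe_blocks: int = 1):
    """Map a logical block number to (drive index, block number on that drive).

    Blocks are dealt out to the drives round-robin, stripe_blocks at a time.
    """
    stripe_index, within = divmod(logical_block, stripe_blocks)
    drive = stripe_index % n_drives
    block_on_drive = (stripe_index // n_drives) * stripe_blocks + within
    return drive, block_on_drive


# With four drives, consecutive blocks land on different drives,
# so a long sequential transfer keeps all four drives busy at once.
layout = [locate_block(b, n_drives=4) for b in range(8)]
for logical, (drive, block) in enumerate(layout):
    print(f"logical block {logical} -> drive {drive}, block {block}")
```

Because neighboring blocks sit on different drives, sustained transfers can approach the combined rate of all the drives, while a single small read still waits on one drive's seek and rotational latency.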
When set up for reliability, ATA RAID adapters generally run RAID Level 1, which is simple drive mirroring. All data written to one drive is written to the other. If one drive fails, the system can continue to work on the other drive.
Unfortunately, this does not increase performance at all, and it also means you get to use only half of the available drive capacity. In other words, you must install two drives, but you get to use only one (the other is the mirror). However, in an era of high capacities and low drive prices, this is not a significant issue.
For example, you can create an 80GB RAID Level 1 array (two 80GB drives) for about $200 ($100 per drive) if your motherboard already includes an ATA RAID adapter. If you need to purchase a separate ATA RAID adapter, you can do so for less than $100. If you want to eliminate a lot of bulky cabling, consider Serial ATA RAID, which uses the narrow Serial ATA cables.
Combining performance with fault tolerance requires using one of the other RAID levels, such as 3 or 5. For example, virtually all professional RAID controllers used in network file servers are designed to use RAID Level 5. Controllers that implement RAID Level 5 are more expensive, and at least three drives must be connected.
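The capacity trade-offs among the levels discussed so far come down to simple arithmetic. This sketch assumes identical drives and covers only Levels 0, 1, 3, and 5; the function name is illustrative.

```python
def usable_capacity_gb(level: int, n_drives: int, drive_gb: float) -> float:
    """Usable capacity of an array of identical drives, in GB.

    Level 0 uses every drive for data, Level 1 stores one drive's worth
    of data, and Levels 3 and 5 give up one drive's worth to parity.
    """
    if level == 0:
        minimum, data_drives = 2, n_drives
    elif level == 1:
        minimum, data_drives = 2, 1
    elif level in (3, 5):
        minimum, data_drives = 3, n_drives - 1
    else:
        raise ValueError(f"level {level} is not covered by this sketch")
    if n_drives < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} drives")
    return data_drives * drive_gb


for level, drives in [(0, 2), (1, 2), (5, 3), (5, 4)]:
    capacity = usable_capacity_gb(level, drives, 80)
    print(f"RAID {level} with {drives} x 80GB drives: {capacity:.0f}GB usable")
```

Note how the parity overhead of Level 5 shrinks as drives are added, whereas mirroring always costs half the installed capacity.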
To improve reliability, but at a lower cost, many of the ATA RAID controllers enable combinations of the RAID levels—such as 0 and 1 combined. This usually requires four drives, two of which are striped together in a RAID Level 0 arrangement, which is then redundantly written to a second set of two drives in a RAID Level 1 arrangement.
This enables you to have approximately double the performance of a single drive, and you have a backup set should the primary set fail. Today, you can get ATA RAID controllers from companies such as Arco Computer Products, Iwill, Promise Technology, HighPoint, and more.
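It can help to enumerate which drive failures the four-drive 0+1 arrangement described above survives. The sketch assumes drives 0 and 1 form one striped set and drives 2 and 3 form its mirror, and that a striped set is unusable as soon as either of its drives fails; the names are illustrative.

```python
from itertools import combinations

# Two striped pairs; set "B" is a mirror of set "A".
STRIPE_SETS = {"A": {0, 1}, "B": {2, 3}}


def array_survives(failed_drives: set) -> bool:
    """Data survives if at least one striped set has no failed drive."""
    return any(not (members & failed_drives) for members in STRIPE_SETS.values())


single_failures_ok = all(array_survives({d}) for d in range(4))
pairs = list(combinations(range(4), 2))
surviving_pairs = [p for p in pairs if array_survives(set(p))]

print("survives any single failure:", single_failures_ok)
print(f"survives {len(surviving_pairs)} of {len(pairs)} double failures:", surviving_pairs)
```

Under these assumptions the array rides out any single failure, but a second failure is survivable only when it hits the same striped set as the first.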
A typical example of a low-cost ATA RAID controller is the Promise FastTrak 100/TX2. This controller enables up to four drives to be attached, and you can run them in RAID Level 0, 1, or 0+1 mode. This card has only two channels, however, so performance isn't as good as it would be if each drive were on a separate cable.
This is because only one drive can transfer on the cable at a time, which can cut throughput roughly in half when both drives on a cable are active. Promise also made a four-channel ATA RAID card called the Promise FastTrak 100/TX4, but it has been discontinued in favor of a four-channel Serial ATA RAID card called the Promise SATA150/TX4.
Both cards use a separate ATA (SATA) data channel (cable) for each drive, allowing maximum performance. I recommend four-channel ATA or Serial ATA RAID cards for best performance. If you are looking for an ATA RAID controller (or a motherboard with an integrated ATA RAID controller), things to look for include:
- RAID levels supported (most support 0, 1, and 0+1 combined, although some ATA RAID 5 card products are now available)
- Two or four channels (I recommend four channels for best performance)
- Support for ATA/100 or ATA/133 speeds (if you don't use Serial ATA)
- Support for 33MHz or 66MHz PCI slots
- ATA or Serial ATA (Serial ATA is a faster interface, but Serial ATA drives are unlikely to be as common as ATA drives until sometime in 2004)
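One reason the PCI slot speed in the list above matters is that every drive on the card shares the slot's bandwidth. The back-of-the-envelope arithmetic below assumes a 32-bit slot (the list does not say), nominal clocks of 33.33MHz and 66.66MHz, and theoretical peak interface rates taken from the names ATA/100 and SATA150; real sustained drive throughput is considerably lower than these peaks.

```python
def pci_peak_mb_per_s(clock_mhz: float, bus_width_bits: int = 32) -> float:
    """Theoretical peak PCI transfer rate in MB/s (one transfer per clock)."""
    return clock_mhz * bus_width_bits / 8


def interface_peak_mb_per_s(channels: int, per_channel_mb_per_s: float) -> float:
    """Combined theoretical peak of all drive channels on the card."""
    return channels * per_channel_mb_per_s


for clock in (33.33, 66.66):
    print(f"PCI at {clock:.2f}MHz, 32-bit: {pci_peak_mb_per_s(clock):.0f}MB/s peak")

for name, rate in (("ATA/100", 100), ("SATA150", 150)):
    total = interface_peak_mb_per_s(4, rate)
    print(f"four {name} channels: {total:.0f}MB/s combined interface peak")
```

On paper, four channels can outrun a 33MHz slot, so a card and motherboard that support 66MHz PCI leave more headroom for a four-drive array.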
If you want to experiment with RAID inexpensively, you can implement RAID without a custom controller when using certain higher-end (often server-based) operating systems. For example, Windows NT, Windows 2000, Windows XP, and Windows Server 2003 provide a software implementation of RAID using both striping and mirroring.
In these operating systems, the Disk Administrator tool (called Disk Management in Windows 2000 and later) is used to set up and control the RAID functions, as well as to reconstruct the volume when a failure has occurred. Normally, though, if you are building a server in which the ultimate in performance and reliability is desired, you should look for ATA or SCSI RAID controllers that support RAID Level 3 or 5.