Tuesday, July 17, 2007

What is RAID and what are its different types?

Reply 1)

RAID: Redundant Array of Independent Disks

RAID stands for Redundant Array of Independent Disks. By definition, RAID is a disk subsystem. It is used to increase performance and to improve reliability. Its main purpose is to provide a fault-tolerant storage subsystem that adds efficiency and reliability to the overall system. RAID is commonly used in servers for these reasons. RAID can also be implemented in software rather than in dedicated hardware.

Because RAID is meant for fault-tolerant systems, its design is suited to that purpose. RAID technology is actually a set of standards that should be followed when developing a fault-tolerant storage system. Performance also matters a lot here. As mentioned in the paragraph above, RAID has also been implemented purely in software. The standards should be kept in mind before implementing RAID. An implementation needs at least two ordinary hard disks and a RAID controller.

RAID has its origins in the late 1980s. At that time it stood for Redundant Array of Inexpensive Disks, in comparison with the storage systems available then. Storage devices were quite expensive in those days, so a secure array built from RAID drives was an important enhancement in the field of storage systems. Today the prices of storage, whether secondary storage such as hard disks, floppy disks and compact discs or primary memory such as RAM, are falling steadily. For this reason the RAID Advisory Board changed the word "inexpensive" to "independent".

RAID drives also use the concepts of mirroring and parity. In fact, fault tolerance is achieved through mirroring and parity. Both are essential to providing a fault-tolerant system.

A RAID system may include an additional drive whose sole purpose is to replace a drive that has failed or crashed. Such a standby drive is called a hot spare. It is used in an emergency to fill the gap left by the crashed drive, so it must always be ready and waiting. Its physical condition is important, because it has to be available to back up the system at any time. The replacement should be carried out immediately. Once it is done, the whole system must be made aware that the hot spare is now in use, and provision should be made to fill the gap left by the hot spare, that is, to supply a new spare. This is necessary because other drives may fail too, and the hot spare itself can also fail. Even so, RAID continues to dominate the technology used for the implementation of secure storage systems.


The RAID levels covered here are RAID 0, RAID 1, RAID 3, RAID 5 and RAID 10.

RAID 0, STRIPING:
In this system, data is split into blocks that are written across all the drives in the array.
RAID 0 offers superior input/output performance, and the performance can be increased further by using multiple controllers. The advantage of RAID 0 is great performance for both read and write operations. The disadvantage of RAID 0 is that it is not fault tolerant.

For example, if one of the disks fails, all the data in the RAID 0 array is lost. RAID 0 is designed for non-critical storage of data that has to be read and written at high speed, for example on a Photoshop image retouching station.
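
To make the striping idea concrete, here is a minimal Python sketch. It assumes a simple round-robin placement of fixed-size blocks; the function names and the block size are made up for illustration and do not come from any real RAID implementation.

```python
def stripe(data: bytes, num_drives: int, block_size: int) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them round-robin across drives."""
    drives: list[list[bytes]] = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        drives[block_index % num_drives].append(data[offset:offset + block_size])
    return drives


def unstripe(drives: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading one block from each drive in turn."""
    blocks = []
    for row in range(max(len(drive) for drive in drives)):
        for drive in drives:
            if row < len(drive):
                blocks.append(drive[row])
    return b"".join(blocks)


striped = stripe(b"ABCDEFGHIJ", num_drives=3, block_size=2)
restored = unstripe(striped)
```

Every drive holds only part of each file, which is why losing any single drive makes the whole array unreadable.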

RAID 1, MIRRORING:
In RAID 1, data is stored twice: on a data disk and on a mirror disk. If one of the disks fails, the controller uses the surviving data drive or mirror drive for data recovery. The advantages of RAID 1 are excellent read speed and a write speed comparable to that of a single disk. If a disk fails, its data is simply copied to the replacement disk. RAID 1 is also a very simple technology. The disadvantage of RAID 1 is that the usable storage capacity is only half of the total disk capacity in the system, because all data is written twice. RAID 1 is ideally suited for mission-critical storage. It is also suitable for small servers.
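
The mirroring behaviour described above can be sketched as a toy Python model. The class and method names are hypothetical, and each disk is just a dictionary; a real controller works on physical sectors and rebuilds in the background.

```python
class MirroredPair:
    """Toy RAID 1 model: every write goes to both disks, reads use any healthy disk."""

    def __init__(self) -> None:
        self.disks: list[dict[int, bytes]] = [{}, {}]  # block number -> data
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def read(self, block: int) -> bytes:
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]
        raise IOError("both disks have failed; the data is lost")

    def fail(self, i: int) -> None:
        """Simulate a crash of disk i: its contents are gone."""
        self.failed[i] = True
        self.disks[i] = {}

    def replace(self, i: int) -> None:
        """Install a blank disk and copy every block from the surviving disk."""
        survivor = 1 - i
        if self.failed[survivor]:
            raise IOError("no surviving disk to copy from")
        self.disks[i] = dict(self.disks[survivor])
        self.failed[i] = False
```

Because both dictionaries always hold the same blocks, two disks give the usable capacity of one, which is the capacity cost mentioned above.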

RAID 3:
In RAID 3 systems, data blocks are subdivided (striped) and written in parallel on two or more drives. An additional drive stores parity information. Because of this parity, a RAID 3 stripe set can handle a single disk failure without losing data. The advantage of RAID 3 is high throughput for large data transfers. The disadvantages of RAID 3 are that the technology is complex and that performance is slower for small input/output operations.

RAID 5:
RAID 5 is the most commonly used RAID level.
It is somewhat similar to RAID 3, except that data is transferred to the disks by independent read and write operations. Like RAID 3, a RAID 5 array can withstand a single disk failure without losing data. Extra cache memory can be provided to improve write performance. The advantage of RAID 5 is that read data transactions are very fast. The disadvantages of RAID 5 are that a disk failure reduces throughput and that the technology is complex.

RAID 10, a mix of RAID 0 and RAID 1:
RAID 10 combines the advantages of RAID 0 and RAID 1 in a single system. It provides good security by mirroring all data on a secondary set of disks. The advantages of RAID 10 are that read data transactions are very fast and that it is a very simple technology. The disadvantage of RAID 10 is that its performance is slower for large transfers. RAID levels 2, 4, 6 and 7 also exist, including in prepress environments.

Reply 2)

RAID stands for Redundant Array of Independent (or Inexpensive) Disks.

There are a number of different RAID levels:
Level 0:
Level 0 is a 'striped' disk array without fault tolerance. It provides data striping (spreading out blocks of each file across multiple disk drives) but no redundancy.

Level 1:
Level 1 does 'mirroring' and 'duplexing'. It provides disk mirroring.

Level 2:
Level 2 does 'error-correcting coding'.

Level 3:
Level 3 is 'bit-interleaved parity'. It provides byte-level striping with a dedicated parity disk.

Level 4:
Level 4 is 'dedicated parity drive'. It is a commonly used implementation of RAID.

Level 5:
Level 5 is 'block interleaved distributed parity'. It provides data striping at the block level and also stripes the error correction information.

Reply 3)

The distribution of data across multiple drives can be managed either by dedicated hardware or by software. Additionally, there are hybrid RAIDs that are partially software and hardware-based solutions.

Software RAID

Software implementations are provided by most operating systems. A software layer sits above the (generally block based) disk device drivers and provides an abstraction layer between the logical drives (RAID arrays) and physical drives. Software RAID is typically limited to RAID 0 (striping across multiple drives for increased space and performance), RAID 1 (mirroring two drives) and RAID 5 (data striping with parity).

In a multi-threaded operating system (such as Linux, FreeBSD, Mac OS X, Windows NT/2000/XP/Vista and Novell NetWare) the operating system can perform overlapped I/O, allowing multiple read or write requests to be initiated without waiting for each request to complete. This capability makes RAID 0/1 possible in an operating system. However, most operating systems do not support striping or mirroring with parity, due to the substantial processing demands of calculating parity.

Software implementations require a very small amount of processing time, which is provided by the main CPU in the host system. Since SCSI, PATA, and SATA drives all support asynchronous read/write, any multi-threaded operating system can support non-parity RAID on multiple hard drives with only a one percent increase in CPU overhead.

Software implementations can exceed the performance levels of hardware-based RAID due to the high performance of modern CPUs. Since the software must run on a host server attached to the storage, the processor on that host must dedicate processing time to running the RAID software, as mentioned above. As with hardware-based RAID, if the server experiences a hardware failure, the attached storage could be inaccessible for a period.

Software implementations can allow RAID arrays to be created from partitions rather than entire physical drives.

Hardware RAID

A hardware implementation of RAID requires at a minimum a special-purpose RAID controller. On a desktop system, this may be a PCI expansion card, or might be a capability built in to the motherboard. In industrial applications the controller and drives are provided as a standalone enclosure. The drives may be IDE/ATA, SATA, SCSI, SSA, Fibre Channel, or any combination thereof. The using system can be directly attached to the controller or, more commonly, connected via a SAN. The controller hardware handles the management of the drives, and performs any parity calculations required by the chosen RAID level.

Most hardware implementations provide a non-volatile read/write cache which, depending on the I/O workload, will improve performance. Cached RAID controllers are most commonly used in industrial applications.

Hardware implementations provide guaranteed performance, add no overhead to the local CPU complex and can support many operating systems, as the controller simply presents a logical disk to the operating system.

Hardware implementations also typically support hot swapping, allowing failed drives to be replaced while the system is running.

Hybrid RAID

Hybrid RAID implementations have become very popular with the introduction of inexpensive RAID controllers. These are built on a standard disk controller, with the RAID logic implemented in the controller's BIOS extension (for early boot-up/real mode operation) and in the operating system driver (for after the system switches to protected mode). These controllers actually do all calculations in software on the host CPU. They are typically proprietary to a given RAID controller manufacturer and typically cannot span multiple controllers. The only advantages over software RAID are that the BIOS can boot from them, and that the tighter integration with the device driver may offer better error handling.

Both hardware and software implementations may support the use of hot spare drives. A hot spare is a pre-installed drive that is used to immediately (and almost always automatically) replace a drive that has failed. This reduces the mean time to repair, the period during which a second drive failure in the same RAID redundancy group can result in loss of data. It also helps prevent data loss when multiple drives fail in a short period, as is common when all drives in an array have undergone very similar use patterns and experience wear-out failures.

Reply 4)

Great posts, everyone. A few queries that come to mind:

1) What is parity?
2) What are the possible ways of connecting a RAID system to the server?
3) Is there a minimum and maximum “number of disk” limit?
4) What is the difference between Disk Mirroring and Disk Duplexing?

Reply 5)

To gain performance and/or additional redundancy, the standard RAID levels (level 0 to level 5) can be combined to create hybrid or nested RAID levels. Many storage controllers allow RAID levels to be nested. That is, one RAID can use another as its basic element, instead of using physical drives.

For example, RAID 10 (or RAID 1+0) consists of multiple level 1 arrays stored on physical drives with a level 0 array on top, striped over the level 1 arrays. RAID 0+1 is most often written with the plus sign, as RAID 0+1 rather than RAID 01, which avoids confusion with RAID 1.

Common nested RAID levels
RAID 0+1: Striped Set + Mirrored Set (4 disk minimum; Even number of disks) provides fault tolerance and improved performance but increases complexity. The key difference from RAID 1+0 is that RAID 0+1 creates a second striped set to mirror a primary striped set. The array continues to operate with one or more drives failed in the same mirror set, but if two or more drives fail on different sides of the mirroring, the data on the RAID system is lost.
RAID 1+0: Mirrored Set + Striped Set (4 disk minimum; Even number of disks) provides fault tolerance and improved performance but increases complexity. The key difference from RAID 0+1 is that RAID 1+0 creates a striped set from a series of mirrored drives. The array can sustain multiple drive losses as long as no two drives lost comprise a single pair of one mirror.
RAID 5+0: A stripe across distributed parity RAID systems.
RAID 5+1: A mirrored striped set with distributed parity.
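
The difference in failure tolerance between RAID 0+1 and RAID 1+0 can be checked with a short Python sketch. The disk numbering is an assumption made for illustration: for RAID 1+0 the mirror pairs are disks (0, 1), (2, 3) and so on, and for RAID 0+1 the first half of the disks forms one striped set and the second half forms its mirror.

```python
def raid10_survives(num_disks: int, failed: set[int]) -> bool:
    """RAID 1+0 survives as long as no mirror pair has lost both of its disks."""
    pairs = [(d, d + 1) for d in range(0, num_disks, 2)]
    return all(not (a in failed and b in failed) for a, b in pairs)


def raid01_survives(num_disks: int, failed: set[int]) -> bool:
    """RAID 0+1 survives as long as at least one striped set is fully intact."""
    half = num_disks // 2
    side_a = set(range(half))
    side_b = set(range(half, num_disks))
    return not (failed & side_a) or not (failed & side_b)


# Four disks, with disks 0 and 2 failed: one disk from each mirror pair,
# but also one disk on each side of the 0+1 mirror.
same_failures_10 = raid10_survives(4, {0, 2})
same_failures_01 = raid01_survives(4, {0, 2})
```

With the same two failed disks, the 1+0 layout keeps running while the 0+1 layout loses its data, which matches the descriptions in the list above.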

For more information on nested arrays, see the following link:
http://en.wikipedia.org/wiki/Nested_RAID_levels

Reply 6)

If the storage box is external, it can also have a Fibre Channel interface. Another option is to create virtual LUNs in a SAN and then use RAID on them.

Reply 7)

1) What is parity?

Parity: redundant information that is associated with a block of information and used to rebuild a disk that has failed.

- RAID 5 arrays distribute data and parity across a set of disks. Within each stripe, one disk holds parity data and the other disks hold normal data. Therefore, RAID 5 arrays require at least three disks to allow for this parity information. When a disk fails, the Array Manager software uses the parity information in those stripes, together with the data on the other disks, to re-create the data on the failed disk.
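
A small Python sketch may help to show how a failed disk is re-created from parity. It assumes XOR parity, which is the usual choice for single-parity RAID levels; the block contents and the helper name are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block computed from them.
data_blocks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data_blocks)

# Suppose the disk holding data_blocks[1] fails. XOR-ing the surviving
# data blocks with the parity block gives the missing block back.
rebuilt = xor_blocks([data_blocks[0], data_blocks[2], parity])
```

This works because XOR-ing a value with itself cancels out, so everything except the missing block disappears from the result.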


2) What are the possible ways of connecting a RAID system to the server?

One possible way of connecting a RAID system to the server is SCSI.
SCSI: acronym for small computer system interface, which is a type of interface between a system and devices such as hard drives, diskette drives, CD drives, printers, scanners, and other peripherals.

3) Is there a minimum and maximum “number of disk” limit?


4) What is the difference between Disk Mirroring and Disk Duplexing?

Disk duplexing is a variation of disk mirroring in which each of multiple storage disks has its own SCSI controller. Disk mirroring (also known as RAID-1) is the practice of duplicating data in separate volumes on two hard disks to make storage more fault-tolerant. Mirroring provides data protection in the case of disk failure, because data is constantly updated to both disks. However, since the separate disks rely upon a common controller, access to both copies of data is threatened if the controller fails. Disk duplexing overcomes this problem; the use of redundant controllers enables continued data access as long as one of the controllers continues to function.

This failover method helps to ensure that data access will continue transparently to the user and allows technicians to take the server down to replace the defective controller at a more opportune time, instead of at the moment of failure. The ability to choose when the server comes down can be very advantageous, because -- in accordance with Murphy's Laws of Information Technology (Law of Inconvenient Malfunction) -- a device is likely to fail at the least opportune possible moment. Nevertheless, some experts advocate other systems (such as higher level RAID configurations) that don't require taking the server down to replace defective hardware.

Another benefit of disk duplexing is increased throughput. Using a technique known as a split seek, whichever disk can deliver the requested data more quickly responds. Multiple requests may also be split between the disks for simultaneous processing.
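
The split seek idea can be sketched in Python as follows. This is only an illustration under a simplifying assumption: the time a disk needs to answer is taken to be proportional to the distance between its current head position and the requested block. The function names are hypothetical.

```python
def pick_disk(head_positions: list[int], target_block: int) -> int:
    """Return the index of the mirrored disk whose head is closest to the target."""
    return min(range(len(head_positions)),
               key=lambda i: abs(head_positions[i] - target_block))


def serve_reads(head_positions: list[int], requests: list[int]) -> list[int]:
    """Serve each read from the closest disk and move that disk's head there."""
    heads = list(head_positions)
    served_by = []
    for block in requests:
        disk = pick_disk(heads, block)
        heads[disk] = block
        served_by.append(disk)
    return served_by


# Two mirrored disks with heads at opposite ends of the platter.
assignment = serve_reads([0, 1000], [900, 100, 950, 50])
```

In this run the requests alternate between the two disks, so neither head has to travel across the whole disk.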

Reply 8)

I think everyone is already familiar with RAID.
Still, I would like to add some images that should help in understanding how RAID works.

RAID 0


RAID 1


Below is RAID 1+0, i.e. RAID 10. The exact description of this diagram is in anjum’s email.


RAID 4


RAID 5

RAID 5 divides the data and creates parity information in a similar way to RAID 4, but unlike RAID 4, the parity data is not kept on one dedicated disk. It is distributed across all the disks.
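
The difference in parity placement between RAID 4 and RAID 5 can be sketched in Python. The rotation used for RAID 5 here is just one possible scheme chosen for illustration; real implementations use various rotation orders.

```python
def parity_disk(stripe: int, num_disks: int, level: int) -> int:
    """Return the index of the disk that holds parity for the given stripe."""
    if level == 4:
        return num_disks - 1                         # always the same dedicated disk
    if level == 5:
        return (num_disks - 1 - stripe) % num_disks  # rotates from stripe to stripe
    raise ValueError("only RAID 4 and RAID 5 are modelled here")


def layout(num_stripes: int, num_disks: int, level: int) -> list[list[str]]:
    """Build a table of 'D' (data) and 'P' (parity) cells, one row per stripe."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_disk(stripe, num_disks, level)
        rows.append(["P" if disk == p else "D" for disk in range(num_disks)])
    return rows


raid4_layout = layout(3, 3, level=4)
raid5_layout = layout(3, 3, level=5)
```

In the RAID 4 table the last column is all parity, while in the RAID 5 table the parity cell moves to a different disk in every row.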


RAID 6

RAID 6 writes two parity records to different disk drives (double parity), so that the array can recover from two simultaneous disk drive failures in the same RAID group.
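
The cost of the second parity record can be shown with a little arithmetic in Python. The sketch assumes equally sized disks and counts one disk's worth of parity for RAID 5 and two for RAID 6; the names are made up for the example.

```python
PARITY_DISKS = {5: 1, 6: 2}  # disks' worth of capacity used for parity


def usable_capacity_gb(level: int, num_disks: int, disk_size_gb: int) -> int:
    """Usable capacity of a RAID 5 or RAID 6 array built from equal disks."""
    parity = PARITY_DISKS[level]
    if num_disks < parity + 2:
        raise ValueError(f"RAID {level} needs at least {parity + 2} disks")
    return (num_disks - parity) * disk_size_gb


def failures_tolerated(level: int) -> int:
    """Number of simultaneous disk failures the array can recover from."""
    return PARITY_DISKS[level]


raid5_capacity = usable_capacity_gb(5, num_disks=4, disk_size_gb=1000)
raid6_capacity = usable_capacity_gb(6, num_disks=4, disk_size_gb=1000)
```

With the same four disks, RAID 6 gives up one more disk of capacity than RAID 5 in exchange for surviving a second failure.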
