Selecting Linux RAID

My project last weekend was to build a Linux storage server for my network. Sunday, I discussed benchmarking SATA controllers under Linux, and my discovery that a cheap SATA controller actually performed a smidge better than the one built into my system board. The add-on controller gave me enough ports to proceed and build the RAID. Today, I discuss my considerations for building the RAID.

As I mentioned previously, I had a pair of server-grade Seagate ST3500631NS 500GB SATA disk drives to use. My goal was to build the storage server using the Linux software RAID capability.

My performance needs were modest. My home network is primarily wireless. Even if I do go wired, which is my eventual plan, it doesn't take a lot of disk performance to keep an Ethernet connection filled up.

Therefore, my priorities, in order, were: reliability, cost, and performance as a distant last.

As a quick review, Redundant Array of Inexpensive Disks (RAID) is a technique for arranging multiple hard disk drives to gain performance, reliability, or a combination of both.

The striped configuration (called RAID level 0) represents one extreme. It's the fastest method for writing data, but also the least reliable.

Striping is fast because the data are split across multiple disks, all working at once. It's like having several mail carriers for a neighborhood: splitting a given amount of mail among them gets it delivered faster.

Striping is the least reliable because every disk in the array has a bit of the data. If one disk fails, the entire filesystem is lost.
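
To make the idea concrete, here is a minimal Python sketch of striping. It is a toy model, not how the Linux md driver is implemented: the function names `stripe` and `unstripe`, the 4-byte chunk size, and the in-memory lists standing in for disks are all assumptions made for illustration.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 4) -> list:
    """Distribute fixed-size chunks of data round-robin across num_disks."""
    disks = [[] for _ in range(num_disks)]
    for start in range(0, len(data), chunk_size):
        chunk_index = start // chunk_size
        # Chunk 0 goes to disk 0, chunk 1 to disk 1, chunk 2 back to disk 0, ...
        disks[chunk_index % num_disks].append(data[start:start + chunk_size])
    return disks


def unstripe(disks: list) -> bytes:
    """Reassemble data by reading chunks back in the same round-robin order."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


data = b"ABCDEFGHIJKLMNOP"
disks = stripe(data, num_disks=2)
restored = unstripe(disks)

# Simulate the failure of disk 1 (hypothetical): only disk 0's chunks survive.
damaged = unstripe([disks[0], []])
```

With both disks present, `restored` equals the original data. With disk 1 gone, `damaged` contains only every other chunk, which is why a single failure ruins a striped filesystem.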

The mirrored configuration (called RAID level 1) represents the other extreme. It's the most reliable mechanism, but it's the slowest method for writing data.

On a mirrored array, a copy of the same data is written to every disk in the array. That means if one hard disk goes bad, you still have copies on the other disks in the array.

It also means that when data are being written, there are redundant copies being made, which slows things down.
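
The trade-off can be sketched in Python as follows. Again this is only a toy model under stated assumptions: `mirror_write` and `mirror_read` are hypothetical names, each disk is a `bytearray`, and a failed disk is represented by `None`.

```python
from typing import List, Optional


def mirror_write(disks: List[bytearray], offset: int, payload: bytes) -> int:
    """Write payload to every disk; return the number of physical writes issued."""
    writes = 0
    for disk in disks:
        disk[offset:offset + len(payload)] = payload
        writes += 1
    return writes


def mirror_read(disks: List[Optional[bytearray]], offset: int, length: int) -> bytes:
    """Read from the first disk that has not failed (None marks a failed disk)."""
    for disk in disks:
        if disk is not None:
            return bytes(disk[offset:offset + length])
    raise IOError("all mirrors have failed")


disks = [bytearray(8), bytearray(8)]
writes = mirror_write(disks, 0, b"RAID1")  # one logical write, two physical writes

# Simulate the failure of disk 0 (hypothetical): the data is still readable.
degraded = [None, disks[1]]
survivor_data = mirror_read(degraded, 0, 5)
```

One logical write costs one physical write per disk, which is the source of the slowdown, while a read can be satisfied by any surviving copy.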

There are other RAID levels and configurations which achieve better balances of performance and reliability. They were not suitable for my situation because they all require more than two disk drives.

Given my emphasis on reliability over performance, the mirrored configuration seemed like the best match for my needs. My concern was whether the performance penalty would be significant.
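
A rough back-of-the-envelope comparison shows how large the reliability gap is. The sketch below assumes that disks fail independently and uses a made-up per-disk failure probability of 5% over some period; neither the figure nor the independence assumption comes from measurements, and real arrays also depend on rebuild time and correlated failures.

```python
def raid0_loss_probability(p: float, n: int) -> float:
    """A striped array loses data if at least one of its n disks fails."""
    return 1 - (1 - p) ** n


def raid1_loss_probability(p: float, n: int) -> float:
    """A mirrored array loses data only if all n disks fail."""
    return p ** n


p = 0.05  # hypothetical per-disk failure probability, chosen for illustration
striped = raid0_loss_probability(p, 2)   # roughly 9.75%
mirrored = raid1_loss_probability(p, 2)  # 0.25%
print(f"RAID 0: {striped:.4f}  RAID 1: {mirrored:.4f}")
```

Under these assumptions a two-disk stripe is nearly twice as likely to lose data as a single disk, while a two-disk mirror is twenty times less likely.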

I ran some benchmark tests to answer that question. I'll write about those tests next time.

(This article is part two of a three-part series. Part three concludes here.)

Comments

You can get both mirroring and striping with 2 disks

With Linux MD RAID it is possible to get both mirroring and striping with 2 disks, via the raid10 far (f2) layout. For reads you get almost the full performance of your disks, as with RAID0. For writes you get almost the performance of a single disk, since the data must be written to both disks. The elevator algorithm, together with the filesystem, tends to optimise output here.

More on Linux RAID performance can be found at http://linux-raid.osdl.org/index.php/Performance