RAID5 is evil
Tom Metro
blu at vl.com
Thu Jan 25 18:26:48 EST 2007
Matthew Gillen wrote:
> I guess you didn't follow some of the links from Monday about why RAID5 is
> evil ;-)
> http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt
Nope, but I've posted in the past about my distaste for higher-order
RAID and LVM due to the way they complicate recovery.
Practically speaking, it comes down to the lesser of several evils. If
cost weren't an issue, I'd opt for RAID1. RAID5 is the next best
compromise: it gains some redundancy without eating up as much usable space.
The article you quote seems a bit dated, however (just consider the
reference to $1000 disk drives, for example):
    Now SCSI controllers reserve several hundred disk blocks to be
    remapped to replace fading sectors with unused ones, but if the
    drive is going these will not last very long and will run out and
    SCSI does NOT report correctable errors back to the OS! Therefore
    you will not know the drive is becoming unstable until it is too
    late and there are no more replacement sectors and the drive begins
    to return garbage.
    [Note that the recently popular IDE/ATA drives do not (TMK) include
    bad sector remapping in their hardware so garbage is returned that
    much sooner.]
Modern IDE/SATA drives that support SMART[1] monitoring will (provided
you are running the necessary monitoring daemon) notify you when they
are starting to fail, and, as the Wikipedia article implies, such drives
also automatically reallocate sectors.
1. http://en.wikipedia.org/wiki/Self-Monitoring%2C_Analysis%2C_and_Reporting_Technology
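As a rough illustration of what such monitoring looks at, here is a minimal Python sketch that pulls the reallocated sector count out of an attribute table in the layout printed by `smartctl -A` (from smartmontools). The function name and the sample numbers are made up for the example; a real setup would more likely rely on a daemon such as smartd than on a hand-rolled parser.

```python
def reallocated_sectors(smartctl_output):
    """Return the raw Reallocated_Sector_Ct value, or None if it is absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID; the raw value is the 10th column.
        if (len(fields) >= 10 and fields[0].isdigit()
                and fields[1] == "Reallocated_Sector_Ct"):
            return int(fields[9])
    return None


# Illustrative sample only: the layout follows `smartctl -A`, the values are invented.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   098   098   036    Pre-fail  Always       -       12
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4321
"""

count = reallocated_sectors(SAMPLE)
if count:
    print("drive has remapped", count, "sectors; keep an eye on it")
```

A raw count that keeps climbing is the sort of early warning the quoted article says you never get.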
    When a drive returns garbage, since RAID5 does not EVER
    check parity on read...when you write the garbage sector back
    garbage parity will be calculated and your RAID5 integrity is lost!
This seems to ignore the integrity checks performed by the file system
layer. Isn't it typical for modern file systems to have, at minimum,
sector checksums? The RAID layer may not be aware of the failures,
but your OS will be.
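To make the quoted parity claim concrete, here is a small Python sketch of XOR parity over one stripe. It is a toy model, not how any particular RAID implementation lays out data: the block contents, the three-data-plus-parity layout, and the helper name are all assumptions for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical array: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# With the original parity, block 1 can still be rebuilt from the others.
recovered_before = xor_blocks([data[0], data[2], parity])

# Block 1 is read back as garbage (nothing checks it against parity),
# and the write back recomputes parity from what was read.
garbage = b"\x00\xff\x00\xff"
stale = [data[0], garbage, data[2]]
parity_after = xor_blocks(stale)

# The new parity is consistent with the garbage, so a rebuild returns garbage.
recovered_after = xor_blocks([stale[0], stale[2], parity_after])

print(recovered_before == b"BBBB", recovered_after == garbage)
```

The sketch only shows that the RAID layer has no way to notice the bad block once parity has been recomputed; whether a higher layer catches it is a separate question.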
    What about that thing about losing a second drive? Well with RAID10
    there is no danger unless the one mirror that is recovering also
    fails and that's 80% or more less likely than that any other drive
    in a RAID5 array will fail!
As another article recently posted to the list points out, that 80%
figure isn't quite right. In practice the probability of a second
drive failure is higher, because the recovery process stresses the
remaining drive: the rebuild requires reading 100% of the data on it,
so any weak sectors will be uncovered.
The article noted that performing regular backups will similarly stress
the drives, and catch the problem before it becomes critical.
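One way to see where a figure like 80% could come from, and why it is optimistic, is a back-of-the-envelope model in Python. Everything here is an assumption for illustration: drives fail independently, each surviving drive has the same probability p of failing during the rebuild window, and the array has 6 drives. The quoted article does not state its model.

```python
def second_failure_risk(n_drives, p):
    """Chance of losing the array during a rebuild, given one drive already failed.

    p is an assumed, independent per-drive failure probability for the
    duration of the rebuild.
    """
    raid10 = p                              # only the failed drive's mirror partner matters
    raid5 = 1 - (1 - p) ** (n_drives - 1)   # any surviving drive failing is fatal
    return raid10, raid5


raid10, raid5 = second_failure_risk(n_drives=6, p=0.01)
print("RAID10 risk is about", round(100 * raid10 / raid5), "percent of the RAID5 risk")
```

Under these assumptions RAID10 comes out at roughly a fifth of the RAID5 risk for a 6-drive array. The independence assumption is exactly what rebuild stress undermines, since the drive being read end to end is the one whose failure matters, so its p is likely higher than average.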
    The original reason for the RAID2-5 specs was that the high cost of
    disks was making RAID1, mirroring, impractical. That is no longer
    the case!
Of course all things are relative. If you're storing a modest amount of
data and you have a corporate budget, sure, mirroring is plenty cheap.
If you're trying to store 1 TB on a home server, the cost of mirroring
is prohibitive. Besides, RAID should only be used for increased
reliability (decreased downtime), not data integrity, which still
depends on traditional backups.
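The cost difference is easy to put numbers on. The Python sketch below counts drives for a usable-capacity target under mirroring versus RAID5; the 250 GB drive size is a hypothetical figure chosen for the example, and the function name is made up.

```python
import math


def drives_needed(target_gb, drive_gb):
    """Drive counts for a usable capacity target: (mirrored, RAID5)."""
    data_drives = math.ceil(target_gb / drive_gb)
    mirrored = 2 * data_drives         # every data drive gets a mirror
    raid5 = max(data_drives + 1, 3)    # one drive's worth of parity, at least 3 drives
    return mirrored, raid5


# 1 TB usable from hypothetical 250 GB drives.
mirrored, raid5 = drives_needed(1000, 250)
print("mirrored:", mirrored, "drives; RAID5:", raid5, "drives")
```

With those assumed sizes, mirroring takes 8 drives where RAID5 takes 5, which is the gap that matters on a home budget.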
-Tom
--
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: http://tmetro.venturelogic.com/