[Techtalk] RAID data recovery w/ hardware controller

Rudy Zijlstra rudy at grumpydevil.homelinux.org
Wed Feb 13 18:19:21 UTC 2008



On Wed, 13 Feb 2008, Gina Feichtinger wrote:

> On 12.02.2008 22:22, Rudy Zijlstra wrote:
>> On Tue, 12 Feb 2008, Carla Schroder wrote:
> [...]
>>> I'm figuring out
>>> the pros and cons of Linux software RAID; with Linux RAID you can stuff
>>> your drives in any Linux box and reconstruct the array, assuming the drives
>>> are not damaged and the data is not corrupted. Very nice for moving to a new
>>> box, or recovering from some other component failure that doesn't affect the
>>> hard drives.
>>
>> I use both HW RAID and SW RAID. Generally speaking I'm using HW RAID on
>> boxes with high-availability requirements, and use it for the system
>> partition. Big storage RAID is then SW RAID. Reasons to use HW RAID:
>> -1- easy identification of the failed disk
>> -2- no problems booting from a RAID partition
>> -3- configuration / recovery can be done with no OS active
>>
>> Example: I have an IBM x-Series server with 4 SCSI HDDs. I tried to use SW
>> RAID on that, and in order to have a simple RAID config, used the partitioned
>> RAID driver. I subsequently ended up creating a specific boot CD, as no
>> disk-based bootloader supported this configuration... Both grub and lilo
>> barfed on it :(
>
> May I ask what you mean by "partitioned raid driver"? I've had an
> x-Series machine barf on me after install (an x3650 with RHEL AS to be
> precise) only because in that case grub wasn't properly written to the
> MBR...

Linux has two SW RAID drivers: one where you cannot partition the array, 
and one where you can partition the RAID array like any other disk. The 
latter is the more useful IMHO, but it is not supported by either grub or 
lilo... at least it wasn't half a year ago when I last tried.
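
Roughly, from memory (the device names below are only examples), the 
partitionable flavour is created by asking mdadm for an md_dX device:

    # create a partitionable mirror instead of a plain /dev/md0
    mdadm --create /dev/md_d0 --auto=part --level=1 --raid-devices=2 /dev/sda /dev/sdb
    # then partition the array itself like any single disk
    fdisk /dev/md_d0    # partitions appear as /dev/md_d0p1, /dev/md_d0p2, ...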

>
> A thing I found useful about SW RAID - switching to bigger disks in a
> RAID-1 setup is pretty easy. Split the mirror, swap the second disk,
> rebuild the RAID, swap the first disk, rebuild the RAID, done (OK OK, more
> or less). I've had to do that a few times already when the internal 36GB
> disks were getting too small.

No experience with this... It needs more work anyway, and it also depends 
on using a filesystem you can grow. HW RAID can actually do that as well. 
It's a bit more difficult as my arrays tend to be RAID5, but it's doable.
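
From what I understand, with mdadm and a growable filesystem that swap 
would look roughly like this (md0/sdb1 are just placeholders, and I'm 
assuming ext3 here):

    # replace the second disk with a bigger one
    mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
    # physically swap the drive, partition it, then re-add and let it resync
    mdadm /dev/md0 --add /dev/sdb1
    # repeat for the first disk, then grow the array and the filesystem
    mdadm --grow /dev/md0 --size=max
    resize2fs /dev/md0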

>
>> When I had the opportunity to get some modern SCSI-320 RAID controllers
>> for a sensible price I went for it... a much easier solution to maintain.
>>
>> Also, I've found the beeping that ensues on a disk failure to be a very
>> useful warning sign :)
>>
>> With SW RAID, identifying which HDD has failed can be an issue. I once
>> lost a 1.5TB array because we misidentified which HDD had failed... All
>> data on the array was lost.
>
> May I ask how this happened? Usually "cat /proc/mdstat" gives you a
> good overview of the state of the array.

True, very true... But then, which cable represents which SATA port? 
Sometimes the port numbering on SATA PCI cards is not exactly logical... 
which is what caused the above-mentioned mistake.

That card turned out to have the following numbering scheme: 1 3 4 2 (or 
something similar). A later version from the same manufacturer has a 
different scheme.
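
One thing that would probably have avoided it is checking the drive serial 
numbers before pulling anything, along these lines (sdc is just an example):

    cat /proc/mdstat                        # the failed member shows up with an (F) marker
    smartctl -i /dev/sdc | grep -i serial   # serial number of the suspect drive
    # or: hdparm -i /dev/sdc | grep -i serial
    # then compare with the sticker on the drive before removing it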

>
>> On the HW RAID, when I accidentally disconnect the wrong disk, I can force
>> that disk back online and incur no data loss. I've so far found no way to
>> do that with SW RAID... that could be my problem though, as mdadm still
>> has some secrets for me.
>
> mdadm is a bit, how should I say, cryptic at times. What I have come to
> like a lot is the monitoring feature, though. Whenever a disk or
> partition craps out I get a mail and can react pretty quickly. Since I
> don't have to visit the server room that often anymore, it could take way
> longer for me to see/hear a failed disk in a HW RAID, I think.

Since both are capable of sending out emails, I did not mention that. With 
the setup I have, the beeping is a very irritating noise which will be 
noticed :) I agree it has a lot to do with the physical setup, though.
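
For reference, on the SW RAID side that mail setup is just an address in 
mdadm.conf plus the monitor daemon, roughly (the address is obviously a 
placeholder):

    # in /etc/mdadm.conf (or /etc/mdadm/mdadm.conf on Debian)
    MAILADDR admin@example.org
    # most distros start this from an init script
    mdadm --monitor --scan --daemonise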

Cheers,

Rudy

