[Techtalk] raid array problems

Cynthia Kiser cnk at ugcs.caltech.edu
Thu Dec 29 16:51:42 EST 2005


OK, so the computer where the RAID is mounted boots from a local disk but
has a SCSI RAID card that seems to hold one large array, #0. The bad
sector errors on boot seem to indicate you have a bad disk, and bad
in a way that doesn't let the RAID cover for it. Do you have any idea
what the RAID configuration was meant to be? Was this one big RAID 5
or RAID 10 array? Both of those should survive a single disk
failure, so the fact that you are seeing errors at the OS level makes
me wonder if you lost 2 disks.
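
To make that reasoning concrete, here is a rough sketch of which disk
losses each layout can absorb. The function name and the RAID 10 pairing
(disks 0+1 mirrored, 2+3 mirrored, and so on) are assumptions for
illustration only; your controller may pair disks differently.

```python
def array_survives(level, n_disks, failed):
    """Return True if the array still holds all its data after losing
    the disks listed in `failed` (0-based disk indices).

    Hypothetical helper: models only plain RAID 5 and a RAID 10 built
    from adjacent mirrored pairs, with no hot or cold spares.
    """
    failed = set(failed)
    if level == "raid5":
        # Single parity: any one disk can be rebuilt, two cannot.
        return len(failed) <= 1
    if level == "raid10":
        # Assumed pairing: (0,1), (2,3), ... Data is lost only when
        # both halves of the same mirror are gone.
        pairs = [(i, i + 1) for i in range(0, n_disks, 2)]
        return not any(a in failed and b in failed for a, b in pairs)
    raise ValueError("unsupported RAID level: %r" % level)


if __name__ == "__main__":
    print(array_survives("raid5", 4, {2}))      # one disk lost
    print(array_survives("raid5", 4, {0, 2}))   # two disks lost
    print(array_survives("raid10", 4, {0, 2}))  # one disk from each mirror
    print(array_survives("raid10", 4, {0, 1}))  # both disks of one mirror
```

Either way, one dead disk alone should not surface as I/O errors to the
OS, which is why a second failure is my first suspicion.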

Poke around in the configuration utility. You should be able to
inspect the setup without changing it. See if it will tell you what
the configuration was and what disk or disks are showing errors. That
will tell you what you will need in terms of replacement disks. Who
knows, you might find it was configured with a cold spare already in
the machine. 

However, I would also start looking to see what backups you have of
everyone's home directories. It is possible that once you have
replaced the bad disks, you can rebuild the array without losing
data. But I fear that between the first problem, where the disk came
back up after a reboot, and your current set of problems, you have lost
the redundancy that would allow for that recovery. Still, post back what
you find from the configuration and we can see if there is a path from
where you are now to full home directory access that does not involve
rebuilding your array and then restoring the data from backup.


Quoting Maria McKinley <maria at shadlen.org>:
> It is a hardware array. When I boot up I can go into the Adaptec RAID 
> configuration utility, and it gives me an option to use the disk 
> utility. I am given one channel to select, #0. Should I try it? There
> are also options to use a SCSISelect Utility and an Array Configuration 
> Utility.
> 
> When I continue with boot, it tells me
> SCSI disk error : host 1 channel 0 id 0 lun 0 return code = 1
> I/O error: dev 08:00, sector 6
> read_callback: read failed, status = 5
> 
> It repeats this same error, with only the sector changing 
> (sectors 6, 0, 2 and 4 all have errors)
> 
> Then it says
> unable to read partition table
> fsck.ext3: No such device or address while trying to open /dev/sda1
> Possibly non-existent or swap device?
> /dev/hdc: clean, 181981/7340032 files, 11664506/14653918 blocks
> 
> fsck failed. Please repair manually.
> 
> I'm not sure that all of this has to do with the RAID array, but it seems 
> likely. It then gives me an opportunity to do maintenance or press CONTROL-D to 
> continue. Then the computer boots up normally, except the RAID array is not mounted.
> 
> -maria
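
For what it's worth, the "dev 08:00" in those messages can be decoded by
hand or with a small script. The sketch below assumes the kernel prints
the device as major:minor in hex, and relies on the standard Linux
numbering where major 8 is the first block of SCSI disks with 16 minors
per disk. The sample log is rebuilt from the messages quoted above (only
the "sector 6" line is verbatim), and the function names are made up for
illustration.

```python
import re

# Rebuilt from the quoted boot messages; sectors 0, 2 and 4 are taken
# from the prose description, not from verbatim log lines.
SAMPLE_LOG = """\
SCSI disk error : host 1 channel 0 id 0 lun 0 return code = 1
I/O error: dev 08:00, sector 6
I/O error: dev 08:00, sector 0
I/O error: dev 08:00, sector 2
I/O error: dev 08:00, sector 4
"""

SCSI_DISK_MAJOR = 8    # Linux major number for the first 16 sd disks
MINORS_PER_DISK = 16   # minor 0 is the whole disk, 1-15 are partitions


def decode_dev(dev):
    """Turn a hex 'major:minor' string such as '08:00' into a /dev/sd name."""
    major, minor = (int(part, 16) for part in dev.split(":"))
    if major != SCSI_DISK_MAJOR:
        raise ValueError("only SCSI disk major 8 is handled here")
    letter = chr(ord("a") + minor // MINORS_PER_DISK)
    partition = minor % MINORS_PER_DISK
    return "/dev/sd" + letter + (str(partition) if partition else "")


def failed_sectors(log):
    """Collect the failing sector numbers per device from the log text."""
    pattern = re.compile(
        r"I/O error: dev ([0-9a-fA-F]+:[0-9a-fA-F]+), sector (\d+)")
    found = {}
    for dev, sector in pattern.findall(log):
        found.setdefault(decode_dev(dev), set()).add(int(sector))
    return {name: sorted(sectors) for name, sectors in found.items()}


if __name__ == "__main__":
    print(failed_sectors(SAMPLE_LOG))
```

On the sample this reports sectors 0, 2, 4 and 6 on /dev/sda, the whole
disk the array presents to Linux. Sector 0 is where the partition table
lives, which fits with "unable to read partition table" and with fsck
not finding /dev/sda1.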

-- 
Cynthia N. Kiser
cnk at ugcs.caltech.edu

