[Techtalk] ide problem?

Maria Pinjanainen maria at tietonoita.fi
Tue Feb 3 14:01:25 UTC 2009


On Tue, 2009-02-03 at 13:22 +0000, Chris Wilson wrote:
> Hi Maria,
> 
> On Tue, 3 Feb 2009, Maria Pinjanainen wrote:
> 
> > The md4 array has these troubles. At least, that is what smartd reports.
> >
> > Device: /dev/hda, 2 Currently unreadable (pending) sectors
> > Device: /dev/hda, 2 Offline uncorrectable sectors
> >
> > That is from the syslog. Etch starts the cron job and it runs fine until
> > the last arrays. Then it ends in a kernel panic.
> >
> > The md4 device is only a data device. So why does it kill the whole
> > system? Or is there some other hardware or software trouble?
> 
> It should not actually panic and crash the whole system. I'd need to see 
> the details of the panic to be sure what is happening there.

I did not get to see the screen at that moment. It was my friend who saw
it.
She told me that there was a kernel panic, a call trace, a stack pointer
dump and a code dump.

> However, the disk is busy trying to access the bad sector. This means 
> that it cannot do anything else, e.g. read other parts of the disk that 
> might be required for normal operation, while this is happening. Some 
> disks will try for a very long time before giving up. Linux is probably 
> also retrying the reads which makes it worse.
> 
> What is /dev/md4 for? Is it a swap partition? That could cause the system 
> to crash if there are bugs in the RAID driver and it stops answering 
> requests. I'd recommend against using swap on RAID.

No, no, it is not swap. :-)
That device is only for data, with no system folders. It is HOME for
several users. But it was quiet at the time it happened. No one was
writing anything there.

> I would recommend replacing the hard disk, however it MIGHT be possible to 
> prolong its life by manually rewriting the bad sectors. To do this, remove 
> it from /dev/md4 and add it back again. Otherwise, I'd recommend that you 
> remove it until you can replace the hard disk.

Hmm... do you mean that I should remove it from the RAID array first and
then replace the drive?
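
To check that I understand the order of the steps, here is a rough sketch
of the sequence as I read it. It only builds and prints the mdadm commands
and runs nothing. The member name /dev/hda1 is my assumption, since smartd
only names the whole disk /dev/hda, and the function name is made up for
this example.

```python
def swap_member_commands(array, member):
    """Return the mdadm commands, in order, for taking `member` out of
    `array` and adding it (or its replacement) back. Nothing is executed."""
    return [
        # Mark the member as faulty so the array stops using it.
        ["mdadm", array, "--fail", member],
        # Detach the faulty member from the array.
        ["mdadm", array, "--remove", member],
        # Here the disk could be physically replaced and partitioned.
        # Adding a member back triggers a resync that rewrites its contents.
        ["mdadm", array, "--add", member],
    ]


if __name__ == "__main__":
    # /dev/hda1 is a guess; use the real member shown in /proc/mdstat.
    for cmd in swap_member_commands("/dev/md4", "/dev/hda1"):
        print(" ".join(cmd))
```

If the same disk is added back, the resync is what rewrites the bad
sectors. If a new disk goes in, the same last step applies to the new
member.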

> I would not disable checkarray, as it protects you from much more serious 
> problems where you have undetected or unrepaired bad or unreadable sectors 
> in both drives and therefore cannot recover the array after a disk 
> failure.
> 
> Cheers, Chris.
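
For anyone reading this later: as far as I understand, the check that
checkarray starts from cron can also be watched through the md files in
sysfs. Below is a minimal sketch, assuming the usual sync_action and
mismatch_cnt files under /sys/block/md4/md. The function names and the
sysfs_root parameter are only for this example. Writing to sync_action
needs root.

```python
from pathlib import Path


def md_status(name, sysfs_root="/sys/block"):
    """Read the current sync action and mismatch count of an md array."""
    md = Path(sysfs_root) / name / "md"
    return {
        "sync_action": (md / "sync_action").read_text().strip(),
        "mismatch_cnt": int((md / "mismatch_cnt").read_text().strip()),
    }


def request_check(name, sysfs_root="/sys/block"):
    """Ask the kernel to start a read-only consistency check of the array."""
    (Path(sysfs_root) / name / "md" / "sync_action").write_text("check\n")


if __name__ == "__main__":
    # Prints something like {'sync_action': 'idle', 'mismatch_cnt': 0}
    # on a machine that really has an md4 array.
    print(md_status("md4"))
```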

-m-


