[Techtalk] ide problem?
Maria Pinjanainen
maria at tietonoita.fi
Tue Feb 3 13:02:45 UTC 2009
Hi!
I have a problem. I could work around it with other hardware (one new
disk for the RAID array, or a totally new computer), but first I would
like to know what the problem actually is.
First, the computer: Debian Etch on 32-bit Intel/AMD hardware. Cron
starts running this:
/USR/SBIN/CRON[14992]: (root) CMD ([ -x /usr/share/mdadm/checkarray ] &&
[ $(date +%d) -le 7 ] && /usr/share/mdadm/checkarray --cron --all
--quiet)
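A note on that cron line: the test [ $(date +%d) -le 7 ] only lets
checkarray run when the day of the month is 7 or lower, so if the cron
entry itself fires once a week, the check runs once a month. A minimal
Python sketch of the same guard follows; the function name in_first_week
is made up for illustration and is not part of mdadm.

```python
from datetime import date

def in_first_week(day: date) -> bool:
    """Mirror the shell test [ $(date +%d) -le 7 ]: true on days 1..7."""
    return day.day <= 7

# The check in the syslog excerpt started on Feb 1, which passes the guard.
print(in_first_week(date(2009, 2, 1)))   # True
print(in_first_week(date(2009, 2, 8)))   # False
```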
There are these arrays:
Personalities : [raid1]
md3 : active raid1 sda9[0] sdb9[1]
      145155200 blocks [2/2] [UU]
md2 : active raid1 sda8[0] sdb8[1]
      393472 blocks [2/2] [UU]
md1 : active raid1 sda6[0] sdb6[1]
      2931712 blocks [2/2] [UU]
md0 : active raid1 sda5[0] sdb5[1]
      4883648 blocks [2/2] [UU]
md4 : active raid1 hda1[0] hdc1[1]
      488383936 blocks [2/2] [UU]
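The listing shows that md0 to md3 sit on sda/sdb, while md4 is the only
array on the IDE disks hda/hdc. As a rough sketch (not an mdadm tool),
the same information can be pulled out of /proc/mdstat-style text with a
few lines of Python. The sample text is copied from the listing above;
the function name parse_mdstat is hypothetical.

```python
import re

MDSTAT = """\
md3 : active raid1 sda9[0] sdb9[1]
145155200 blocks [2/2] [UU]
md4 : active raid1 hda1[0] hdc1[1]
488383936 blocks [2/2] [UU]
"""

def parse_mdstat(text: str) -> dict:
    """Map each md device to its member partitions and its [UU]-style status."""
    arrays = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        head = re.match(r"(md\d+) : active \w+ (.+)", line)
        if head:
            current = head.group(1)
            members = re.findall(r"(\w+)\[\d+\]", head.group(2))
            arrays[current] = {"members": members, "status": None}
            continue
        # In the status field, "U" is a working member and "_" a missing one.
        status = re.search(r"\[([U_]+)\]\s*$", line)
        if status and current:
            arrays[current]["status"] = status.group(1)
    return arrays

arrays = parse_mdstat(MDSTAT)
print(arrays["md4"])  # {'members': ['hda1', 'hdc1'], 'status': 'UU'}
```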
md4 is the array with the trouble.
At least, that is what smartd reports:
Device: /dev/hda, 2 Currently unreadable (pending) sectors
Device: /dev/hda, 2 Offline uncorrectable sectors
From the syslog.
Etch starts the cron job, and the check goes fine until the last array.
Then it ends in a kernel panic.
md4 is only a data device. So why does it bring down the whole
system?
Or is there some other hardware or software problem?
Feb 1 01:06:01 etch /USR/SBIN/CRON[14992]: (root) CMD
([ -x /usr/share/mdadm/checkarray ] && [ $(date +%d) -le 7 ]
&& /usr/share/mdadm/checkarray --cron --all --quiet)
Feb 1 01:06:01 etch kernel: md: syncing RAID array md0
Feb 1 01:06:01 etch kernel: md: minimum _guaranteed_ reconstruction
speed: 1000 KB/sec/disc.
Feb 1 01:06:01 etch kernel: md: using maximum available idle IO
bandwidth (but not more than 200000 KB/sec) for reconstruction.
Feb 1 01:06:01 etch kernel: md: using 128k window, over a total of
4883648 blocks.
Feb 1 01:06:01 etch kernel: md: delaying resync of md1 until md0 has
finished resync (they share one or more physical units)
Feb 1 01:06:01 etch kernel: md: delaying resync of md2 until md0 has
finished resync (they share one or more physical units)
Feb 1 01:06:01 etch kernel: md: delaying resync of md3 until md2 has
finished resync (they share one or more physical units)
Feb 1 01:06:01 etch kernel: md: delaying resync of md2 until md0 has
finished resync (they share one or more physical units)
Feb 1 01:06:01 etch kernel: md: delaying resync of md1 until md2 has
finished resync (they share one or more physical units)
Feb 1 01:06:01 etch kernel: md: syncing RAID array md4
Feb 1 01:06:01 etch kernel: md: minimum _guaranteed_ reconstruction
speed: 1000 KB/sec/disc.
Feb 1 01:06:01 etch kernel: md: using maximum available idle IO
bandwidth (but not more than 200000 KB/sec) for reconstruction.
Feb 1 01:06:01 etch kernel: md: using 128k window, over a total of
488383936 blocks.
Feb 1 01:06:02 etch mdadm: RebuildStarted event detected on md
device /dev/md0
Feb 1 01:06:02 etch mdadm: RebuildStarted event detected on md
device /dev/md4
Feb 1 01:07:01 etch mdadm: Rebuild40 event detected on md
device /dev/md0
Feb 1 01:07:44 etch kernel: md: md0: sync done.
Feb 1 01:07:44 etch kernel: RAID1 conf printout:
Feb 1 01:07:44 etch kernel: --- wd:2 rd:2
Feb 1 01:07:44 etch kernel: disk 0, wo:0, o:1, dev:sda5
Feb 1 01:07:44 etch kernel: disk 1, wo:0, o:1, dev:sdb5
Feb 1 01:07:44 etch kernel: md: delaying resync of md1 until md2 has
finished resync (they share one or more physical units)
Feb 1 01:07:44 etch kernel: md: syncing RAID array md2
Feb 1 01:07:44 etch kernel: md: minimum _guaranteed_ reconstruction
speed: 1000 KB/sec/disc.
Feb 1 01:07:44 etch kernel: md: using maximum available idle IO
bandwidth (but not more than 200000 KB/sec) for reconstruction.
Feb 1 01:07:44 etch kernel: md: using 128k window, over a total of
393472 blocks.
Feb 1 01:07:44 etch kernel: md: delaying resync of md3 until md2 has
finished resync (they share one or more physical units)
Feb 1 01:07:44 etch mdadm: RebuildStarted event detected on md
device /dev/md2
Feb 1 01:07:44 etch mdadm: RebuildFinished event detected on md
device /dev/md0
Feb 1 01:07:54 etch kernel: md: md2: sync done.
Feb 1 01:07:54 etch kernel: md: delaying resync of md3 until md1 has
finished resync (they share one or more physical units)
Feb 1 01:07:54 etch kernel: md: syncing RAID array md1
Feb 1 01:07:54 etch kernel: md: minimum _guaranteed_ reconstruction
speed: 1000 KB/sec/disc.
Feb 1 01:07:54 etch kernel: md: using maximum available idle IO
bandwidth (but not more than 200000 KB/sec) for reconstruction.
Feb 1 01:07:54 etch kernel: md: using 128k window, over a total of
2931712 blocks.
Feb 1 01:07:54 etch kernel: RAID1 conf printout:
Feb 1 01:07:54 etch kernel: --- wd:2 rd:2
Feb 1 01:07:54 etch kernel: disk 0, wo:0, o:1, dev:sda8
Feb 1 01:07:54 etch kernel: disk 1, wo:0, o:1, dev:sdb8
Feb 1 01:07:54 etch mdadm: RebuildFinished event detected on md
device /dev/md2
Feb 1 01:07:54 etch mdadm: RebuildStarted event detected on md
device /dev/md1
Feb 1 01:08:54 etch mdadm: Rebuild60 event detected on md
device /dev/md1
Feb 1 01:09:09 etch kernel: md: md1: sync done.
Feb 1 01:09:09 etch kernel: md: syncing RAID array md3
Feb 1 01:09:09 etch kernel: md: minimum _guaranteed_ reconstruction
speed: 1000 KB/sec/disc.
Feb 1 01:09:09 etch kernel: md: using maximum available idle IO
bandwidth (but not more than 200000 KB/sec) for reconstruction.
Feb 1 01:09:09 etch kernel: md: using 128k window, over a total of
145155200 blocks.
Feb 1 01:09:09 etch kernel: RAID1 conf printout:
Feb 1 01:09:09 etch kernel: --- wd:2 rd:2
Feb 1 01:09:09 etch kernel: disk 0, wo:0, o:1, dev:sda6
Feb 1 01:09:09 etch kernel: disk 1, wo:0, o:1, dev:sdb6
Feb 1 01:09:09 etch mdadm: RebuildStarted event detected on md
device /dev/md3
Feb 1 01:09:09 etch mdadm: RebuildFinished event detected on md
device /dev/md1
Feb 1 01:19:09 etch mdadm: Rebuild20 event detected on md
device /dev/md3
Feb 1 01:29:09 etch mdadm: Rebuild40 event detected on md
device /dev/md3
Feb 1 01:36:09 etch mdadm: Rebuild20 event detected on md
device /dev/md4
Feb 1 01:49:09 etch mdadm: Rebuild80 event detected on md
device /dev/md3
Feb 1 02:00:11 etch kernel: md: md3: sync done.
Feb 1 02:00:11 etch kernel: RAID1 conf printout:
Feb 1 02:00:11 etch kernel: --- wd:2 rd:2
Feb 1 02:00:11 etch kernel: disk 0, wo:0, o:1, dev:sda9
Feb 1 02:00:11 etch kernel: disk 1, wo:0, o:1, dev:sdb9
Feb 1 02:00:11 etch mdadm: RebuildFinished event detected on md
device /dev/md3
All the other arrays are done; only md4 is still to go.
Feb 1 02:03:11 etch mdadm: Rebuild40 event detected on md
device /dev/md4
Feb 1 02:27:11 etch mdadm: Rebuild60 event detected on md
device /dev/md4
Feb 1 02:54:11 etch mdadm: Rebuild80 event detected on md
device /dev/md4
Feb 1 03:28:09 etch kernel: hda: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Feb 1 03:28:09 etch kernel: hda: dma_intr: error=0x01
{ AddrMarkNotFound }, LBAsect=974763422, high=58, low=1684894,
sector=974762815
Feb 1 03:28:09 etch kernel: ide: failed opcode was: unknown
Feb 1 03:28:12 etch kernel: hda: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Feb 1 03:28:12 etch kernel: hda: dma_intr: error=0x01
{ AddrMarkNotFound }, LBAsect=974763422, high=58, low=1684894,
sector=974762815
Feb 1 03:28:12 etch kernel: ide: failed opcode was: unknown
Feb 1 03:28:15 etch kernel: hda: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Feb 1 03:28:15 etch kernel: hda: dma_intr: error=0x01
{ AddrMarkNotFound }, LBAsect=974763422, high=58, low=1684894,
sector=974762815
Feb 1 03:28:15 etch kernel: ide: failed opcode was: unknown
Feb 1 03:28:18 etch kernel: hda: dma_intr: status=0x51 { DriveReady
SeekComplete Error }
Feb 1 03:28:18 etch kernel: hda: dma_intr: error=0x01
{ AddrMarkNotFound }, LBAsect=974763422, high=58, low=1684894,
sector=974762815
Feb 1 03:28:18 etch kernel: ide: failed opcode was: unknown
Feb 1 03:28:18 etch kernel: hda: DMA disabled
Feb 1 03:28:18 etch kernel: hdb: DMA disabled
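For reading these lines: the names in braces are just the set bits of
the drive's status byte, spelled out by the IDE driver. A small Python
sketch of that decoding follows. The bit table is the standard ATA
status register layout as I understand it; the function name
decode_status is made up for illustration.

```python
# Bit masks of the ATA status register, with the names the log prints.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

def decode_status(value: int) -> list:
    """Return the names of all bits set in a status byte."""
    return [name for mask, name in STATUS_BITS if value & mask]

print(decode_status(0x51))  # ['DriveReady', 'SeekComplete', 'Error']
print(decode_status(0x59))  # ['DriveReady', 'SeekComplete', 'DataRequest', 'Error']
```

So 0x51 (DMA mode) and 0x59 (after DMA was disabled) report the same
read error; the later value only has the DataRequest bit set in
addition.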
The loop starts...
Feb 1 03:28:18 etch kernel: ide0: reset: success
Feb 1 03:28:21 etch kernel: hda: task_in_intr: status=0x59 { DriveReady
SeekComplete DataRequest Error }
Feb 1 03:28:21 etch kernel: hda: task_in_intr: error=0x01
{ AddrMarkNotFound }, LBAsect=974763422, high=58, low=1684894,
sector=974763422
Feb 1 03:28:21 etch kernel: ide: failed opcode was: unknown
Feb 1 03:28:24 etch kernel: hda: task_in_intr: status=0x59 { DriveReady
SeekComplete DataRequest Error }
Feb 1 03:28:24 etch kernel: hda: task_in_intr: error=0x01
{ AddrMarkNotFound }, LBAsect=974763422, high=58, low=1684894,
sector=974763422
Feb 1 03:28:24 etch kernel: ide: failed opcode was: unknown
Feb 1 03:28:27 etch kernel: hda: task_in_intr: status=0x59 { DriveReady
SeekComplete DataRequest Error }
Feb 1 03:28:27 etch kernel: hda: task_in_intr: error=0x01
{ AddrMarkNotFound }, LBAsect=974763422, high=58, low=1684894,
sector=974763422
Feb 1 03:28:27 etch kernel: ide: failed opcode was: unknown
Feb 1 03:28:30 etch kernel: hda: task_in_intr: status=0x59 { DriveReady
SeekComplete DataRequest Error }
Feb 1 03:28:30 etch kernel: hda: task_in_intr: error=0x01
{ AddrMarkNotFound }, LBAsect=974763422, high=58, low=1684894,
sector=974763422
Feb 1 03:28:30 etch kernel: ide: failed opcode was: unknown
Feb 1 03:28:30 etch kernel: ide0: reset: success
... and so on, about 10 times, until the kernel panics...
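Every one of these errors names the same sector. The high= and low=
values appear to be the upper and lower 24 bits of LBAsect, which the
arithmetic below confirms. The second part is only an estimate: it
assumes the md4 block count from the array listing is in 1 KiB units and
that hda1 covers nearly the whole disk from the start, neither of which
is shown in the log. The function name combine_lba is hypothetical.

```python
def combine_lba(high: int, low: int) -> int:
    """Join two 24-bit halves into one sector number."""
    return (high << 24) | low

lba = combine_lba(58, 1684894)
print(lba)  # 974763422, the LBAsect value in the log

# Assumption: 488383936 blocks of 1 KiB each, i.e. twice as many
# 512-byte sectors, and hda1 starting close to the beginning of hda.
md4_sectors = 488383936 * 2
fraction = lba / md4_sectors
print(f"{fraction:.1%}")
```

Under those assumptions the bad sector lies very close to the end of the
array, which would fit with the check passing the 80% mark at 02:54 and
failing at 03:28.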
For now, at least, I have removed checkarray from cron.
--m