[Techtalk] Is Linux 2.4.18 Really That Unstable?

Julie txjulie at austin.rr.com
Thu Oct 17 10:12:40 EST 2002


caitlynmaire at earthlink.net wrote:
> 
> Hi, Julie, and everyone else,
> 
> I've had very good results with both RH 8.0 (2.4.18-14 kernel) and RH 7.3
> (upgraded to 2.4.18-10 kernel) on our machines at work and mine at home.  We
> use ext3 across the board on various different hardware configurations, from
> well equipped servers to IBM Thinkpad laptops.
> 
> Here's a possibility I have seen and I have not seen discussed:  I have had
> problems in the past when I have heavily upgraded assorted packages including
> the kernel under older Red Hat distributions.  You mentioned you started with
> a 2.4.2-2 kernel, which sounds like RH 7.1.  You've done a number of kernel
> upgrades with non-RH kernels, added a new filesystem, etc... I've had very
> flaky results doing what you've done.

Thanks.  This sounds like the most likely cause.  Everything is
very peachy once I go back to 2.4.2-2smp (oh, did I mention this is
an SMP?), but gets flakier the further I get from 2.4.2-2.  I did
have a nice time running 2.4.7 (I think), but that was also before
ext3, so I'm inclined to blame it on ext3.

I have run memtest for a while, but as this thing has 1GB RAM
memtest is just =slow=.  And I can't leave it off-line for a day
to let it run (it takes half an hour to get 1/3rd of the way into
test #4) as it is the firewall, WINS server, DHCP server, SMB
server, NFS server etc. for the entire house.  It would take
less time to do an upgrade install to 7.3 or something more
recent.

> IME it is much better to upgrade the entire distro.  If you can, try
> rebuilding (not in-place upgrading) with RH 8.0.  I'd bet your problems
> disappear.

Explain.  This is a large machine (240GB disk) and I can't just
back it up and re-install all my apps and data.

One thing I've noticed, which might be related somehow, is that
CPU 0 stays about 100F and CPU 1 stays about 120F.  I have the
cutoff set at 160F and have never seen either CPU above 130F,
so I don't think it's heat, but that one CPU being warmer than
the other is suspicious.  I'm not overclocking it or anything.
Speaking of overclocking, when I set the bus to 100MHz instead
of 133MHz it still died.  Added wait states, etc., etc. and the
same thing -- crashes.  It does appear that Linux favours CPU 0
over 1, so perhaps all those NOPs are making it warmer 8-)

FWIW, I have built =many= kernels on this thing and run all
sorts of programs, including multiple VMWares at the same time.
The behavior is that 2.4.2-2 runs for weeks and months, while 2.4.18+
runs for days.  When it crashes it starts with a random program
dying from a segfault, then the entire machine hangs a short
while later.

Oh -- I also saw problems on an Athlon machine I built running
Red Hat 7.2 with ext3 that I didn't see on the same machine
running Win2K.  The same thing is true with this machine -- it
runs Win2K like a dream.

I have started switching to the console window (CTL-ALT-F1) to
see if I can see a kernel oops, but so far it hasn't crashed
when I've done that.
-- 
Julianne Frances Haugh             Life is either a daring adventure
txjulie at austin.rr.com                  or nothing at all.
					    -- Helen Keller


