[Techtalk] Some questions about bootup, fsck and sulogin.

Magni Onsoien magnio+lc-techtalk at pvv.ntnu.no
Mon Jan 19 12:03:50 EST 2004


I run some Linux-servers that from time to time crash. When they boot up
again, the file systems often need a manual fsck, so sulogin is invoked
from /etc/rc.local or some other init-script. This is inconvenient,
since the servers are located between 10 and 500 km from us, and we thus
have to send out a technician (or a non-techie...) to type in the root
password and then run fsck manually on the filesystem. Often the problem
was just that fsck detected an error that required a reboot (return code
2) so the new fsck-run returns 0 and everything is fine.
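
To make the return codes concrete, here is a minimal sketch that decodes
an fsck exit status into readable flags. It assumes the bit values
listed in fsck(8) (1 = errors corrected, 2 = reboot needed, 4 = errors
left uncorrected, 8 = operational error); the function name
describe_fsck_status is made up for this example.

```shell
#!/bin/sh
# Sketch: translate an fsck exit status into readable flags.
# The bit values are the ones listed in fsck(8); the function name
# describe_fsck_status is hypothetical.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "clean"
        return 0
    fi
    flags=""
    if [ $((status & 1)) -ne 0 ]; then flags="$flags errors-corrected"; fi
    if [ $((status & 2)) -ne 0 ]; then flags="$flags reboot-needed"; fi
    if [ $((status & 4)) -ne 0 ]; then flags="$flags errors-uncorrected"; fi
    if [ $((status & 8)) -ne 0 ]; then flags="$flags operational-error"; fi
    # Anything from 16 upwards is a usage, cancel or library error.
    if [ "$status" -ge 16 ]; then flags="$flags usage-or-other-error"; fi
    # Strip the leading space before printing.
    echo "${flags# }"
}
```

Something like "describe_fsck_status $?" right after an fsck run would
then print "reboot-needed" for the return code 2 case described above.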

My first problem is to get rid of the need for the root password when
this happens. In /etc/rc.sysinit (RedHat) and
/etc/init.d/(checkfs.sh,checkroot.sh) (Debian) sulogin is invoked. Could
I just substitute the sulogin-command with /bin/bash or would that have
consequences I haven't thought about? I don't have a dedicated test
server at the moment, so I'd rather not test this right now.

(I know it won't prompt for a password then, and that's the point, yes.
The servers are locked up, and if someone gets access to the console
they can do anything anyway.)

My second problem is the behaviour of fsck and the startup scripts.
Currently the bootup process drops into a shell if fsck returns a code
greater than 1, i.e. if it found an error that couldn't be fixed _or_
one that requires a reboot (return code 2). The behaviour I would like
is to check / and /var (if it's a separate file system), and if those
two are OK just continue bootup no matter what return codes fsck gave
for the other filesystems. If the return code for one of the other file
systems is > 1, it shouldn't be mounted and what occurred should be
logged to syslog (so I can see it when the server comes up).
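
The policy described above could be sketched roughly like this. It is
only an illustration under assumptions, not taken from any
distribution's init scripts: the names boot_action and check_one and
the "bootfsck" log tag are hypothetical, and it assumes syslogd is
already running when logger(1) is called.

```shell
#!/bin/sh
# Sketch only: boot_action and check_one are hypothetical helpers.

# Decide what to do for one filesystem, given its mount point and
# the exit status fsck returned for it.
boot_action() {
    mnt=$1
    status=$2
    case "$mnt" in
        /|/var)
            if [ "$status" -le 1 ]; then
                echo "continue"
            elif [ "$status" -lt 4 ]; then
                echo "reboot"    # status 2 or 3: fixed, reboot needed
            else
                echo "shell"     # uncorrected errors: manual fsck
            fi
            ;;
        *)
            if [ "$status" -le 1 ]; then
                echo "mount"
            else
                echo "skip"      # leave unmounted, log it instead
            fi
            ;;
    esac
}

# Check one non-critical filesystem and act on the decision.
# Assumes the mount point has an /etc/fstab entry and that syslogd
# is running, so that logger has somewhere to write.
check_one() {
    dev=$1
    mnt=$2
    fsck -a "$dev"
    status=$?
    if [ "$(boot_action "$mnt" "$status")" = "mount" ]; then
        mount "$mnt"
    else
        logger -t bootfsck -p daemon.err \
            "fsck of $dev ($mnt) returned $status, not mounting"
    fi
}
```

Keeping the decision in boot_action, separate from the code that
actually runs fsck and mount, means the policy can be tried out with
made-up return codes on a machine that is not being rebooted.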

The reason for this humble wish is of course the off-site location of
the servers, and that we often don't have competent staff there - and
I'd also prefer to fix the filesystems _myself_ rather than having another
person fix them. A solution I often use now is to help the person
on-site to log in (reading out the root password) and then help him/her
comment out all filesystems except / and /var in /etc/fstab and
then reboot again. (/var is needed for logfiles and statusfiles, but I
guess I could make the same directory structure on / for use in such
emergencies.)
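
The manual fstab edit could in principle be prepared in advance. A
minimal sketch, assuming standard six-field fstab lines: it comments
out every entry that fsck would check (last field non-zero) except /
and /var. The function name emergency_fstab and the "#EMERGENCY#"
marker are made up for this example.

```shell
#!/bin/sh
# Sketch: read an fstab on stdin and write a copy in which every
# fsck-checked filesystem except / and /var is commented out.
# The function name and the "#EMERGENCY#" marker are hypothetical.
emergency_fstab() {
    awk '
        /^[ \t]*#/ || NF < 6       { print; next }  # comments, blanks, no passno
        $2 == "/" || $2 == "/var"  { print; next }  # always keep these two
        $6 == 0                    { print; next }  # passno 0: not checked by fsck
        { print "#EMERGENCY# " $0 }
    '
}
```

It would be used as "emergency_fstab < /etc/fstab > /etc/fstab.emergency",
leaving the original file untouched. Entries with a zero in the last
field (typically proc and swap) are passed through unchanged, since
fsck never looks at them.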

Has anyone done anything like this before? What experiences do you have
with it? Any other tips for solving my problems? Thanks for any help!



Magni :)
-- 
sash is very good for you.