[techtalk] bind problem...and a sendmail one, too

Sun Nov 14 19:25:39 EST 1999

Excerpts from linuxchix: 14-Nov-99 [techtalk] bind problem...a.. by
Nicole Zimmerman at wsu.edu 
> I don't know what bind's problem is, but here's what I have on it:
>  
> Every one minute, cron runs a job in /tmp (/tmp/ns, the cron listing for
> this job is also in /tmp). Every one minute after cron is unsuccessful,
> it sends an e-mail to root saying: 
>         bind: Address already in use
> /tmp/ns is a binary executable, so I have no idea.

Have you tried running strings on it? Anything interesting in there?

> I have HUP-ed bind (named) to restart it, but that didn't fix the
> problem. I tried to get some debug info by sending a USR1, it worked the
> first time but after I tried to increase the debug level, it stopped
> writing to the named.run file (in /var/named/). 
>  
> I had it dump its stats, memstats, and database, but I can't seem to
> decipher WHAT address "is already in use". 

It's not necessarily bind (the name server).  The error is probably
coming from the ns file itself.

> I have never changed my named.conf or my named.local (they say they were
> written some day in June, which corresponds to when we moved to RH6.0
> from 5.2). It has been having problems since November 2nd. I upgraded
> eggdrop on October 30th, but I cannot think of any other software I
> installed/upgraded around that time.

When did that ns thing get put there? When did cron start running it?
When did it start erroring?

> I have no other cron-constant jobs, just this one... it's obviously bind
> (as named) and the "ns" file is probably cryptic for nameserver. I
> didn't even KNOW cron looked around to find other cron jobs... I thought
> they all had to be in the crontab (redirecting to cron-constant,
> cron-hourly, etc).

I don't think it's necessarily bind/named.  There is a function called
bind() which binds a socket to an address, and it looks like it's
failing because it's trying to bind an address that is already in use.
Generally, when I write a program, I put the function name in the error
message rather than the executable name, since I assume that you know
what program you're running.
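
To show what I mean, here is a minimal Python sketch (not what /tmp/ns
actually does, since I have no idea what is inside it). It binds two
sockets to the same address and formats the failure the way a C
program calling perror("bind") would. The helper name try_bind_twice
is made up for the example.

```python
import errno
import os
import socket


def try_bind_twice():
    """Bind two sockets to one address; report how the second bind fails."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        first.listen(1)
        address = first.getsockname()
        try:
            second.bind(address)  # same address and port as the first socket
        except OSError as err:
            # Mimic perror("bind"): the function name, then the errno text.
            return err.errno, "bind: " + os.strerror(err.errno)
        return None, "bind: succeeded"
    finally:
        first.close()
        second.close()


code, message = try_bind_twice()
print(message)
```

The message names the bind() call, not the named daemon, which is why
I suspect the mail from cron says nothing about your name server.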

> I wish to fix this problem, I don't want to have to delete 12,000 emails
> from root's mailbox again ;o)

For a temporary fix, you can probably mv or rm /tmp/ns and its cron
file, and kill any copy that is still running. What I would try is
playing with /tmp/ns to see what it does (strings, strace, looking at
who put it there and when, etc).  Personally, I'd be pretty suspicious of
it, especially if I had no idea how it got there, though running it one
more time probably wouldn't hurt if it's already been run 12,000 times...
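
If strings isn't handy, something like this rough Python sketch does a
similar job without ever executing the file. It pulls runs of printable
ASCII out of the raw bytes and reports the owner uid and timestamps.
The function names and the sample bytes are made up for illustration,
and the real strings tool has more options than this.

```python
import os
import re
import time


def printable_strings(data: bytes, min_len: int = 4):
    """Return runs of at least min_len printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def describe_file(path: str):
    """Report who owns a file and when it was last modified or changed."""
    st = os.stat(path)
    return {
        "uid": st.st_uid,
        "size": st.st_size,
        "modified": time.ctime(st.st_mtime),
        "changed": time.ctime(st.st_ctime),
    }


# Made-up bytes standing in for a binary; the real input would be /tmp/ns.
sample = b"\x7fELF\x00\x01bind\x00/bin/sh\x00ab\x00"
print(printable_strings(sample))  # ['bind', '/bin/sh']

# On the real box (reads the file only, never runs it):
#     with open("/tmp/ns", "rb") as f:
#         print(printable_strings(f.read()))
#     print(describe_file("/tmp/ns"))
```

Hostnames, paths, or port numbers in the output would be a good hint
about what the thing is trying to listen on.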

> sendmail's story:
> I need to HUP sendmail (linuxconf decided it'd be fun to write a new
> sendmail.conf thus ending virtual domains for mail), but the pid in
> /var/run/sendmail.pid  doesn't seem to work: 
>  
> [root at ghettoBOX run]# kill -HUP 540
> kill: (540) - No such pid
>  
> We were thinking maybe sendmail got restarted on its own, but I believe
> it would write a new PID to that file either way. This PID was written
> october 20th (when last I rebooted).
>  
> I'd like to fix sendmail, without it we are having email problems ;o)
>  
> I am thinking these problems may be fixed with a reboot, but that fix
> seems so windows-y... I'd rather FIX the problem than be perplexed as to
> what it was in the first place and just say "oh well".

You could try finding the real pid for sendmail with ps, then either HUP
it (and fix its pidfile) or kill and restart it. 
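
For what it's worth, the same check can be sketched in Python. This
assumes a Linux-style /proc where /proc/<pid>/stat carries the command
name in parentheses; the helper names are made up, and ps will tell
you the same thing with less typing.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Signal 0 delivers nothing; it only checks that the pid exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # the same condition as "No such pid" from kill
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pids_named(name: str, proc_root: str = "/proc"):
    """Scan a Linux-style /proc for processes with this command name."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as f:
                stat = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The command name is the parenthesised second field.
        comm = stat[stat.find("(") + 1 : stat.rfind(")")]
        if comm == name:
            found.append(int(entry))
    return sorted(found)


if os.path.isdir("/proc"):
    print(pids_named("sendmail"))
```

If pid_is_alive() says the number in /var/run/sendmail.pid is dead but
pids_named("sendmail") finds something, the pidfile is simply stale,
and os.kill(pid, signal.SIGHUP) on the real pid is the equivalent of
kill -HUP.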

************
techtalk at linuxchix.org   http://www.linuxchix.org