[Techtalk] [Fwd: Re: [NFS] rpc.mountd at 99% cpu]
Rudy Zijlstra
rudy at edsons.demon.nl
Sat Oct 1 22:56:41 EST 2005
Rudy Zijlstra wrote:
> Dear all,
>
> Reading this, I remembered the question on this list about debugging
> server slowness. Perhaps something to consider?
>
> Cheers,
>
>
> Rudy
Reposting in a different way; apparently the list manager removed the
relevant info....
Rudy
-----------
<<<<<<<<<<>>>>>>>>>>
NFS Server Performance Troubleshooting
--------------------------------------
#
# 2005.09.30 Brian Elliott Finley
#
Problem:
- Poor performance in general
- Sometimes home directories not mounted on clients
- Bursty performance -- sometimes good, sometimes bad
- Performance got so bad at one point that the NFS server service had to
be restarted.
Initial Observations:
rpc.mountd and/or nfsd sometimes taking 99% of one CPU
Initial Tuning:
/etc/default/nfs-kernel-server: s/RPCNFSDCOUNT=8/RPCNFSDCOUNT=32/
See: http://nfs.sourceforge.net/nfs-howto/performance.html
(section 5.6)
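(For reference, a sketch of the change; the file path is from above, and
the init script name is an assumption for this Debian/Ubuntu-style
system:)
    # /etc/default/nfs-kernel-server -- raise the nfsd thread count
    RPCNFSDCOUNT=32
    /etc/init.d/nfs-kernel-server restart   # pick up the new count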
Further Observations:
rpc.mountd, nfsd, and ypserv (sometimes) tend to spike when a mount
request happens.
Timeouts in auto.master are set to 10s for 3 maps, and 160s for one
map, so mount requests happen very often. The autofs default
timeout is 5m (300s).
Some entries in /etc/exports use hostnames, some use IP addresses
(red herring or not?)
client1 using these ext3 mount options: noatime,data=writeback. nfs01
not using these.
client1 using async nfs exports (nfs-utils-0.3.1-14.72). The default
export behavior for both NFS Version 2 and Version 3 protocols, as
used by exportfs in nfs-utils versions prior to nfs-utils-1.0.1,
is "asynchronous". See:
http://nfs.sourceforge.net/nfs-howto/performance.html (section
5.9). nfs01 using synchronous writes (nfs-kernel-server
1.0.6-3.1ubuntu1).
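(To make the sync/async distinction concrete, the two export styles
look like this in /etc/exports; the path and network are taken from the
entries discussed below:)
    /export/home  10.10.0.0/255.255.0.0(rw,async)  # pre-1.0.1 default: fast, risks data loss on a crash
    /export/home  10.10.0.0/255.255.0.0(rw,sync)   # NFS-compliant: server commits writes before replying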
Second Level Tuning:
1.) Modify ext3 mount options to include noatime, and mount -o remount
each file system. No casually noticeable improvement.
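(A sketch of the change, with a hypothetical device name:)
    # /etc/fstab -- noatime avoids an inode update on every read
    /dev/sda5  /export/home  ext3  defaults,noatime  0  2
    mount -o remount /export/home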
2.) Try s/sync/async/ nfs01:/etc/exports for /export/home filesystem,
then re-export. No casually noticeable improvement. Changed
back to sync. async is technically non-NFS compliant.
3.) Change all hostnames to IP addresses in /etc/exports, then
re-export. No casually noticeable improvement.
4.) Some hostnames, some IPs show up in /var/lib/nfs/etab, despite
having only IPs listed in /etc/exports. Tried this:
"exportfs -v -o \
rw,secure,sync,no_root_squash \
10.10.0.0/255.255.0.0:/export/home"
Then unexported everything else. No improvement.
5.) Used Ethereal to sniff the net -- no obvious issues there. There
was a re-transmit of a mount request, but that is likely due to the
slow response time of rpc.mountd rather than to a network problem.
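(For anyone reproducing this capture, something like the following
tcpdump invocation would show the same traffic; the interface name is
an assumption, and note that rpc.mountd binds a dynamic port,
discoverable with rpcinfo -p:)
    # portmap is on 111, nfsd on 2049; add the mountd port from rpcinfo -p
    tcpdump -i eth0 -s 0 port 111 or port 2049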
6.) rpcinfo -p nfs01-priv.net.anl.gov looks fine from both nfs01 and
from the client side.
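(Illustrative, abridged output of a healthy listing; the actual port
for mountd varies, since it binds a dynamic port:)
    $ rpcinfo -p nfs01-priv.net.anl.gov
       program vers proto   port
        100000    2   tcp    111  portmapper
        100005    3   udp    892  mountd
        100003    3   tcp   2049  nfs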
7.) No entries in /etc/hosts.{allow,deny} to get in the way.
8.) All client IP/Hostname combos in /etc/hosts on nfs01, so lookup
timeouts shouldn't be an issue, but tried commenting out nis on
the hosts line in /etc/nsswitch.conf just in case. No
improvement.
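(The change amounted to a one-line edit in /etc/nsswitch.conf; the
original ordering shown is an assumption:)
    # before:  hosts:  files nis dns
    hosts:  files dns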
9.) Installed a bind9 daemon in a caching-only config on nfs01, and
modified /etc/resolv.conf to point to itself first when
performing DNS resolution. No improvement.
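(i.e., something like this as the first nameserver line in
/etc/resolv.conf:)
    nameserver 127.0.0.1   # the local caching bind9 answers first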
10.) Modified the auto.master umount timeout, raising it from 10s to
just over 1h. This significantly decreased the frequency of the
disruptions, but didn't address the underlying problem.
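(An illustrative auto.master entry; the map path is hypothetical,
and 3660s stands in for "just over 1h":)
    /home  /etc/auto.home  --timeout=3660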
11.) Did this: "mount -t nfsd nfsd /proc/fs/nfsd" to try the "new
style" NFS for 2.6 series kernels. Without this, NFS behaves
like it's on a 2.4 series kernel. There was no specific reason
to expect this to cause an improvement, but it cut the following
ls command (which forces an automount of the three
filesystems in question) down from 30s to 7s.
sudo umount /home/{user1,user2,user3}
time ls -d /home/{user1,user2,user3}/.
12.) strace -p $(pidof rpc.mountd) didn't reveal anything particularly
odd, but it was done after /proc/fs/nfsd was mounted, and
rpc.mountd did try to read /proc/fs/nfsd/filehandle. I take this
as further indication that mounting /proc/fs/nfsd was key.
13.) A reboot of client1 was determined to be necessary. I took this
opportunity to do a complete stop and restart of the NFS daemons
and the portmapper. Immediately upon a clean restart of these two
(no changes had actually been made to anything that should affect
portmap, but it was restarted anyway just to get a perfectly clean
slate), "time ls -d /home/{user1,user2,user3}/." improved from 7s
down to 0.030s.
Bingo!
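(The restart amounted to something like the following; the init script
names are assumptions for this Ubuntu system:)
    /etc/init.d/nfs-kernel-server stop
    /etc/init.d/portmap restart
    /etc/init.d/nfs-kernel-server start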
14.) I modified /etc/fstab to mount /proc/fs/nfsd on future boots.
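(The added line, in standard fstab form:)
    nfsd  /proc/fs/nfsd  nfsd  defaults  0  0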
Conclusion:
Prior to Second Level Tuning step 13, the NFS daemons hadn't been
restarted since the Initial Tuning, which only increased the number
of nfsd threads. This means that Second Level Tuning steps 1 - 12 were
performed without a restart of the nfs daemons.
Which step was key? Was it a combination of them? I can't say with
100% certainty without backstepping and restarting the nfs daemons
each time, which would obviously cause a service outage. But step 11
was the only step that provoked any kind of improvement prior to
restarting the daemons, and it was the only change that was clearly
expected to alter behavior in a significant way (2.4 series kernel
style behavior vs. 2.6 series kernel style behavior). So my current
theory is this:
The mounting of /proc/fs/nfsd was key. And while the mounting
of it had an immediate effect for certain nfs daemon functions,
it didn't have an immediate effect for others. Once the nfs
daemons were completely flushed and restarted cleanly,
it had an effect on all relevant nfs daemon functions.