[Techtalk] Running servers too hot?

Sarah Newman newmans at sonic.net
Wed Oct 8 04:07:40 UTC 2008


If Maria's way doesn't work, you might try digging around in 
/proc/acpi/thermal_zone/THRM, but I am guessing that if one method 
doesn't work, the other won't either.
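
For what it's worth, here is a rough Python sketch of reading that file. 
It assumes the zone is really named THRM on your board (the name varies) 
and that the file holds a line like "temperature:   40 C", which is what 
the old /proc ACPI interface printed. Newer kernels expose 
/sys/class/thermal/thermal_zone*/temp in millidegrees instead, so treat 
the path and the function names here as assumptions, not gospel.

```python
import re
from pathlib import Path

# Assumed path, taken from the message above; the zone name (THRM) varies by machine.
ACPI_TEMP = Path("/proc/acpi/thermal_zone/THRM/temperature")


def parse_acpi_temperature(text):
    """Return degrees Celsius from a line such as 'temperature:   40 C', or None."""
    match = re.search(r"temperature:\s*(-?\d+)\s*C", text)
    return int(match.group(1)) if match else None


def read_temperature(path=ACPI_TEMP):
    """Return the zone temperature, or None if the file is missing or unreadable."""
    try:
        return parse_acpi_temperature(Path(path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    temp = read_temperature()
    print("no ACPI thermal zone found" if temp is None else f"{temp} C")
```

Running this from cron and appending the result to a log would give you 
a temperature history leading up to each hang.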

If neither works, you might try opening the case and measuring the 
temperature with an IR thermometer when it hangs, but opening the case 
will probably change the heat properties too much.

I think there are other things that can make your system hang. What 
have you looked at so far?  Maybe you could isolate heat as the cause 
by putting one server in an air-conditioned room, or by eliminating 
other potential sources of error, such as memory?  You can use 
memtest86+ for the latter.

Are you using ESD protection when you put the servers together?  In 
others' experience (not my own), a lack of ESD protection during 
assembly can cause mysterious hardware issues.  What percentage of your 
servers are hanging?

Do you have access to the console?  Is the kernel rebooting or hanging 
on kernel panics?  I just learned that can be configured. :)
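
In case it helps, the setting I have in mind is the kernel.panic sysctl, 
readable at /proc/sys/kernel/panic.  Zero means the kernel sits there 
forever after a panic, a positive value means it reboots after that 
many seconds, and a negative value means it reboots immediately.  A 
small Python sketch to check what a box is set to (the function names 
are just mine):

```python
from pathlib import Path

# Standard location of the kernel.panic sysctl on Linux.
PANIC_SYSCTL = Path("/proc/sys/kernel/panic")


def describe_panic_setting(raw):
    """Translate the raw kernel.panic value into a plain-English description."""
    seconds = int(raw.strip())
    if seconds == 0:
        return "hang on panic (no automatic reboot)"
    if seconds < 0:
        return "reboot immediately on panic"
    return f"reboot {seconds} s after a panic"


def current_panic_setting(path=PANIC_SYSCTL):
    """Describe this machine's setting, or return None if it cannot be read."""
    try:
        return describe_panic_setting(Path(path).read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print(current_panic_setting() or "kernel.panic is not readable here")
```

If it reports a hang, a silent freeze might just be a panic nobody saw.  
Putting a line like "kernel.panic = 10" in /etc/sysctl.conf makes the 
box reboot instead, though then you need the console or logs to catch 
the panic message.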

Kelly Jones wrote:
> How hot can 1U servers run, and where exactly do you take their temperature?
> 
> Can you measure the heat of their power supply exhaust or something?
> 
> I think we're running our servers too hot, but no one else believes
> me, so I need some sort of evidence/proof.
> 
> They hang frequently, but my cow-orkers blame demons and ghosts.
> 

