General This forum is for non-technical general discussion which can include both Linux and non-Linux topics. Have fun!


11-20-2012, 05:03 AM   #1
Senior Member
Registered: Apr 2004
Location: Potchefstroom, South Africa
Distribution: Fedora 17 - 3.3.4-5.fc17.x86_64
Posts: 1,508

Strange (but fatal) recurring rackmount problems

Hi Guys

I'd like some input... here's the situation.

I had a normal desktop box in a normal tower case, on which I set up a CentOS 6 DHCP, DNS and Samba PDC server.

We have a rack mount setup at work which already contains 12 other servers. All are built into rack mount trays.

I had the board, HDD and power supply built into a standard rack mount tray by an external provider, and they installed the tray into the rack for me. The tray is exactly the same as the trays that contain our 12 other servers.

So I started the machine, and all was fine - it was doing DHCP, DNS and PDC duties. I tested it for several hours, then went home at end of business.

Came in the next morning and it was dead. Pulled the tray, had it taken apart, and it had melted - you can actually see where the traces on the motherboard melted and flowed together. The power supply is fine; I had it measured and its output is as it should be. So...

Replaced the motherboard and put in another power supply, another CPU, and an HDD of a different model. Reinstalled CentOS and set up the DHCP, DNS and PDC servers again. Had it all installed in the same physical tray.

Went home.

Came back, and it had melted a SECOND time. Same symptoms: motherboard totally destroyed, CPU gone, and HDD dead.

The other 12 machines are fine and running 100%. All the network switches and routers mounted in the same rack are also fine.

The only common factor is the tray itself, and the rack. I went over it with a fine-tooth comb and there are no projections or irregularities. It is properly spaced, so it appears not to be a short-to-case or something similar.

The other thing is that it WORKS fine for about 12 to 14 hours, but leave it running for longer than 24 hours and the hardware in that tray is destroyed.

It is getting quite expensive... any ideas what I can try / do? All that is left to change is the tray itself, but if the tray is the culprit, why would the hardware fail after an indeterminate amount of time rather than immediately, as you would expect with a short or something similar?

It is a properly stabilized server room with an ambient temp of about 16 deg C and a stabilized, protected power supply with auto-start generator backup. There have been no electrical events anywhere nearby, no need to fall back to the generator, and no other salient events. All the other machines (even the one in the adjacent tray, about 20 centimeters lower, vertically) are fine and running 100%.

Any ideas or comments? What the flaming h...l could be going on that keeps smoking the hardware I try to add to the rack?

Last edited by rylan76; 11-20-2012 at 05:05 AM.
11-20-2012, 02:23 PM   #2
Registered: Aug 2008
Location: Nova Scotia, Canada
Distribution: Slackware, OpenBSD, others periodically
Posts: 512

This sounds like a simple problem of rather massive overheating.

Have you checked all the heatsink and case fans, as well as the ventilation path in the rack itself?

If this is the top unit in the rack, you need to ensure it's not being unduly heated by the units below it. Improper air circulation can turn a rack cabinet into a small blast furnace and the top unit gets the brunt of the hot air flow.
11-21-2012, 03:57 AM   #3
Senior Member
Registered: Apr 2004
Location: Potchefstroom, South Africa
Distribution: Fedora 17 - 3.3.4-5.fc17.x86_64
Posts: 1,508

Original Poster

Thanks for the reply. Yes, this unit is the top unit in the rack.

I've not put a max / min thermometer into the server room. The perceived ambient temp is quite cool (I've stood in there and it feels about 21 deg C on the skin).

Hmm - top unit - I'll mention that to my manager, though it doesn't seem to get too hot.

Anyway, thanks for the pointers and taking the time to respond!

Kind regards
11-21-2012, 07:24 AM   #4
Registered: Sep 2011
Location: Scotland
Distribution: Debian
Posts: 84

Whatever's going on, it's certainly impressive. Could you replace the motherboard and other components with a frozen pizza? If it cooks, you'll have ruled out any electrical short in the case, as well as finding a good use for the heat.
11-21-2012, 11:37 AM   #5
LQ Guru
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

It sounds unlikely. What kind of CPUs were installed? Most are able to throttle down, or at least cut power, before melting down. Can you identify the source of the meltdown? Maybe it wasn't the CPU...
11-21-2012, 04:52 PM   #6
Registered: Aug 2008
Location: Nova Scotia, Canada
Distribution: Slackware, OpenBSD, others periodically
Posts: 512

The ambient temperature in the server room may be irrelevant if the case is in a rack enclosure. In that situation the heat is contained within the rack.

I admit, though, that even in the hottest rack cases I have never seen one get hot enough to reflow the solder on a circuit board.



All times are GMT -5.
