Quote:
Originally Posted by computersavvy
It is also worth inspecting the caps the next time the system is down just in case. Certain brands and styles are known to have failures that take down the system in strange ways. I don't remember the details, but in the late 90s and early 2000s there were some brands of motherboards known for cap failures that would take out the system.
This is known as the 'cap plague' and was an upstream manufacturing issue at some large Taiwanese suppliers. I forget exactly what went wrong, but it came down to bad batches of electrolyte (so it only affects electrolytic capacitors). I don't remember whether that was just a chemistry mix-up or something environmental (e.g. being produced somewhere with different weather or humidity). It hit a massive range of hardware more or less indiscriminately, as long as the board used caps from any of the affected suppliers (I forget which suppliers were on the list, though I believe Teapo was one of them). This is also why 'Japanese capacitors' have become a marketing point (those factories were largely unaffected), despite the issue being largely resolved in the mid-2000s. Wikipedia has an article about it:
https://en.wikipedia.org/wiki/Capacitor_plague
Depending on the age of these servers, this is absolutely something to consider, but newer systems tend to have much less to worry about in terms of 'the plague.'
As for cleaning the contacts on the RAM: I've seen PCIe graphics cards that refuse to negotiate the full x16 link (and instead settle for x2 or x4 in an x16 slot), and after cleaning the card's contacts and blowing the dust out of the motherboard slot, everything worked again. FWIW, I'd give it a try if it isn't too tedious to do (I know some servers can have silly numbers of individual DIMMs to deal with). Something else to consider: if these are really 'big' servers with the RAM and/or CPUs on risers, those can come unseated or (presumably) need dust and debris blown out of their contacts too. I've seen a handful of Compaq ProLiant machines brought down just by risers that were slightly unseated after being moved.
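If these servers run Linux, a degraded PCIe link like the one described above can usually be spotted without opening the case. Here is a rough sketch that compares the `current_link_width` and `max_link_width` attributes the kernel exposes under `/sys/bus/pci/devices`. The function names (`read_attr`, `degraded_links`) are just made up for this example, and I'm assuming a reasonably recent kernel that provides those attributes; devices without them are skipped.

```python
from pathlib import Path


def read_attr(dev, name):
    """Return a sysfs attribute as a stripped string, or None if it is missing or unreadable."""
    try:
        return (dev / name).read_text().strip()
    except OSError:
        return None


def degraded_links(root="/sys/bus/pci/devices"):
    """Return (address, current_width, max_width) for devices linked below their maximum width."""
    results = []
    root = Path(root)
    if not root.is_dir():
        return results
    for dev in sorted(root.iterdir()):
        cur = read_attr(dev, "current_link_width")
        mx = read_attr(dev, "max_link_width")
        if cur is None or mx is None:
            continue  # not a PCIe device, or the kernel does not report link info
        try:
            cur_w, max_w = int(cur), int(mx)
        except ValueError:
            continue  # unexpected non-numeric value, skip it
        # A width of 0 means no link is reported, so only flag real but narrow links.
        if 0 < cur_w < max_w:
            results.append((dev.name, cur_w, max_w))
    return results


if __name__ == "__main__":
    for addr, cur, mx in degraded_links():
        print(f"{addr}: running at x{cur}, capable of x{mx}")
```

A hit here isn't automatically a dirty contact: `max_link_width` is what the device itself is capable of, so an x16 card sitting in a slot that is only wired for x8 will also show up. It's just a quick way to narrow down which cards are worth pulling and reseating.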
I also agree with jefro's suggestion and would add that I don't envy having to troubleshoot this.