Cause of server crash
Hi folks,
Here is the scenario. I got a new dedicated server from softlayer.com 3 months ago, and it had been running perfectly until now. Last week I decided to upgrade the memory from 2GB to 3GB. A couple of days later it crashed, and 3 days after that it crashed again. The first time, the server came back up after a remote reboot. The second time, it wouldn't come up. The tech said the memory was not seated correctly, and it booted up fine after he re-seated the sticks. Then today it crashed again. :confused:

The system specs:
Intel Xeon 3060
3GB ECC DDR memory
2x250GB SATAII HDD
CentOS 4.5 32-bit with Cpanel 10
Kernel 2.6.9-55.ELsmp #1 SMP Fri Apr 20 17:03:35 EDT 2007 i686 i686 i386 GNU/Linux

I suspect faulty memory. I noticed that swap usage is always greater than 0.5% before a crash, and 2GB is allocated to the swap partition. Could the crashes have anything to do with swap usage? How about the kernel version: is there anything in it that could cause a crash or hang? I also thought that 3GB of RAM is an odd amount, since most servers have an even number of GB. Maybe I should upgrade to the latest stable CentOS kernel.

I asked the tech to swap out the RAM and test the memory. If you guys have any ideas, please post them here. Thanks in advance. |
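On the swap question, it may help to put an exact number on it. The sketch below is only an illustration: it assumes the usual `SwapTotal` and `SwapFree` fields of `/proc/meminfo`, and the function name `swap_usage_percent` is made up for this example.

```python
def swap_usage_percent(meminfo_text):
    """Return swap in use as a percentage of total swap, or None if there is no swap."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])  # values are reported in kB
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    if total == 0:
        return None
    return 100.0 * (total - free) / total
```

On the server it would be called with the contents of `/proc/meminfo`, for example `swap_usage_percent(open("/proc/meminfo").read())`. For scale, 0.5% of a 2GB swap partition is only about 10MB.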
I'd try the simplest thing first: take out the new memory and see if that restores system stability.
|
You have ECC memory, but it sounds like you don't have ECC enabled in your BIOS (otherwise bit errors would be corrected). Does your motherboard have chipkill functionality (to remove an entire RAM chip in the event of multiple failures)? If so, is the memory compatible and the function enabled? Have you confirmed that the memory is in fact ECC capable?
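If the kernel exposes EDAC counters, they give a rough view of whether ECC is active and catching errors. The sketch below is a guess at how one might read them, not a tested procedure for this box: it assumes an EDAC driver for the chipset is loaded and that the counters sit under `/sys/devices/system/edac/mc`, which an older kernel such as the 2.6.9 one here may not provide at all. The function name `read_edac_counts` is made up for this example.

```python
import glob
import os


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from EDAC sysfs counters."""
    counts = {}
    for mc_dir in sorted(glob.glob(os.path.join(base, "mc*"))):
        try:
            with open(os.path.join(mc_dir, "ce_count")) as f:
                ce = int(f.read().strip())  # corrected errors
            with open(os.path.join(mc_dir, "ue_count")) as f:
                ue = int(f.read().strip())  # uncorrected errors
        except (IOError, OSError, ValueError):
            continue  # counter missing or unreadable (assumed layout not present)
        counts[os.path.basename(mc_dir)] = (ce, ue)
    return counts
```

An empty result only means the counters are not available. It says nothing about whether the memory is healthy.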
Boot memtest86 to check your RAM. |
The tech replaced all 4 sticks of memory. I hope it won't crash again. :( |