LinuxQuestions.org


Karim86 07-09-2015 12:03 PM

Kernel Panic on two systems with the same components.
 
Greetings

Please excuse me if I'm posting in the wrong forum.

I have two Supermicro servers that we built with the same specs. Both have 11 HDDs: 10 in a RAID 6 array plus one hot spare.
Supermicro chassis CSE-216E16-R1200lPB
Supermicro motherboard 10XDRI-0
2x Intel Haswell 4C E5-2623V 3Ghz
11x 2.5" 1TB Seagate ST100NX0313
8x 4GB DDR4-2133
LSI/3ware RAID 9750-4i with BBU

The way it usually works for this customer is that we build their servers and then use a kickstart image to load their custom CentOS, which I believe is CentOS 5.
I built the first one, and it got stuck while the installer was formatting the partitions. I restarted the process and it started giving me I/O errors. We replaced the drive we believed to be bad with a new one, and the OS loaded correctly. When I went to restart the server, I got the kernel panic. We replaced everything piece by piece, even the motherboard, except for the processors, and still got the kernel panic.
Yesterday we got another order for the exact same system. Everything went smoothly, but when it came time to restart the server to make sure everything was working, we got the kernel panic again.
I'm pretty much new to Linux, so I'm out of ideas.

I was trying to add a picture of the kernel panic, but I couldn't figure out how to attach or paste it.

jpollard 07-09-2015 12:25 PM

No one can help without the errors...

You might try using a serial console and recording the session that way.

Karim86 07-09-2015 12:47 PM

I have finally attached a picture of the kernel panic error. I'm hoping this helps.

jpollard 07-09-2015 07:39 PM

Have you run memtest on it?

Boot failures due to page fault errors are rather unusual, unless some bad memory happens to exist.

The other advantage of memtest is that it will also exercise the CPU and report any errors there too.

rokytnji 07-09-2015 08:04 PM

If it were me, I'd boot a live session and check with GParted whether the partition is full of writes.
That's something I have seen happen on Linux installs on flash drives that were improperly unmounted.

If it is all yellow in GParted, unmount the partition with a right click, then right click again to run a file system check and fix. When it's done, the yellow should have shrunk.

If it's not that, then a GRUB repair may be in order instead. A tool like the Super Grub Disk or Rescatux ISO may have to be pulled out of the toolbox.

Lastly, if the same ISO was used on both installs but was never md5sum checked, that would be another avenue of investigation I would take.
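
For the checksum idea, here is a minimal Python sketch of the same comparison md5sum does. It is not from anyone's actual setup: `file_md5` and `matches_published` are made-up helper names, and the image path and the published checksum are things you would have to supply yourself.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a multi-gigabyte image never sits in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_md5):
    """Compare the computed digest with a published one, ignoring case and stray whitespace."""
    return file_md5(path) == published_md5.strip().lower()
```

Calling `matches_published` with the path to the install image and the checksum string from whoever supplied it should return `True` for an intact copy. A `False` would point at a bad download or a bad copy rather than at the hardware.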

Just a Linux-using scooter tramp, so my guesses will be simple.

yooy 07-09-2015 08:10 PM

If you have older kernels on those machines (that is, if your machines were upgraded), you can boot an older kernel or wait for a new one. However, I'm only sure this works on Ubuntu.

Karim86 07-15-2015 09:09 AM

Thank you all for your help, and sorry for the late reply. I moved on to the next builds so we wouldn't get behind.
Anyway, during downtime I played around with those troublesome builds and found out a few things.
The custom version of CentOS that we get from the customer is CentOS 5.7 32-bit.
What I did was load a regular CentOS 5.7, but 64-bit, and the server boots up fine every time.
I loaded CentOS 6.5 64-bit and again it boots up fine. On 6.4 I get a warning of unsupported CPU found Intel Family, which is correct, but it still boots up fine.
I stress-tested the RAM and CPUs and it all passed.
The motherboard BIOS is the latest according to the Supermicro site.
I upgraded the firmware on the RAID card, just in case.
I underclocked the RAM in the BIOS and, of course, it didn't make any difference.
All this tells me the problem is the customer's custom CentOS, which they refuse to change to 64-bit or even to a newer version.
Any input on this would be greatly appreciated.

Thanks for your time


All times are GMT -5.