LinuxQuestions.org

rpg 06-22-2007 10:48 AM

Kernel panic in x86_64 with 6Gb memory
 
I just added 4Gb of memory to my x86_64 machine, running Mandriva 2007.0 x86_64, kernel 2.6.17-5mdv, ASUS motherboard, and was dismayed to find that it would not boot. Instead it gets a kernel panic while booting.

The panic message was "<0>Kernel panic: Aiee, killing interrupt handler! In interrupt handler - not syncing" and the backtrace was:

class_device_create+123
snprintf+68
map_vm_area+583
pci_conf1_write+204
agp_frontend_initialize+47
pci_device_probe+243
driver_probe_device+101
__driver_attach+0
__driver_attach+96
__driver_attach+0
bus_for_each_dev+73
bus_add_driver+136
__pci_register_driver+87
init+356

[I copied these by hand; my apologies for any garbling.]

I have tried suggestions for booting with acpi=off, pci=nommconf, pci=bios, noapic, and nolapic, with no success. I gather there might be a mem=6144M option or something like that, too....

I am able to boot with only 2Gb, the memory is recognized by the BIOS, and I don't see any trouble in the full 6Gb with memtest.

Any suggestions?

Many thanks,
Robert

Hern_28 06-22-2007 10:54 AM

Not sure with mandrake.
 
I have never used Mandrake, but I had to recompile my OS to support large amounts of RAM. You might need to check whether you have to enable that support, and then try adding the memory.

rpg 06-22-2007 11:00 AM

As I understand things, Mandriva/Mandrake used to have a 3GB and up special kernel (I may be misremembering, it may have been 4Gb and up), but they seem to have replaced their old large set of kernels with a single, general one.

If I had a reason to believe it might fix things, I'd be happy (well, ok, willing, if not happy!) to compile my own kernel. I used to do this all the time, but fell out of the habit.

What keeps me from doing this is the suspicion that there may be just some thing I should be doing that I'm too dumb to do; some simple configuration option I'm missing. I hate to spend hours and hours on kernel configuration and compilation if it's just a kernel option that would fix things...

It would help, of course, if it were easier to test --- having to power down, crack the case open, put in the sticks, restart, and then power down and remove the memory, every time I want to test will be very painful. So if there's some way to "turn off" the memory (kernel option?) so that I could run with the memory in but not messing things up, that would be very helpful.

HappyTux 06-22-2007 04:48 PM

Quote:

Originally Posted by rpg
So if there's some way to "turn off" the memory (kernel option?) so that I could run with the memory in but not messing things up, that would be very helpful.

mem=2048M appended to your boot line should do it, I believe. Another thing you might want to check is whether your BIOS has a 64-bit mode for memory, like mine does. You might also want to try booting something like the Debian Etch installer DVD to see whether it sees the whole 6GB. Then, of course, there is the always-fun check for a new BIOS, and installing it to see if that solves it. BTW, how are you getting the 6GB (2x1GB + 2x2GB)? Not having identical RAM modules in there could be causing problems; if that is the case, have you tried with just 4GB?

rpg 06-22-2007 05:44 PM

I'm not entirely sure I understand. Why would mem=2048M be right instead of mem=6144M ?

Yes, I think there's a new BIOS to flash for my ASUS board. I'm going to try that, too, and see if that helps.

Wish I could do this w/o having to open up the machine, plug in the memory, test, pull out the memory, etc. Wish there was some way to "hide" the extra 4G and boot with only 2G without having to physically unplug it. That would make the test cycle faster... Any way to accomplish this?

jay73 06-22-2007 06:03 PM

That's the whole point of using mem=2048M - it tells the kernel to use only 2GB.
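To illustrate how the size suffix on mem= matters, here is a minimal Python sketch. It is not the kernel's own parser, only an approximation of the documented mem=nn[KMG] form, and the function name parse_mem_param is made up for this example. The assumption worth checking against your kernel's documentation is that a bare number with no suffix is read as bytes.

```python
import re

# Multipliers for the optional size suffix; an empty suffix means plain bytes
# (assumption based on the documented mem=nn[KMG] form).
_SUFFIXES = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_mem_param(arg):
    """Return the byte count named by a simple 'mem=' boot parameter."""
    match = re.fullmatch(r"mem=(\d+)([KMG]?)", arg.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a simple mem= parameter: {arg!r}")
    number, suffix = match.groups()
    return int(number) * _SUFFIXES[suffix.upper()]


# Show how the same digits mean very different sizes with and without a suffix.
for arg in ("mem=2048M", "mem=2G", "mem=2048"):
    print(arg, "->", parse_mem_param(arg), "bytes")
```

Under that assumption, mem=2048M and mem=2G name the same limit, while mem=2048 without a suffix names only 2048 bytes, so it is safer to always write the M.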

Check the available packages for your system; I'm sure you'll find one that fits your needs. I don't know a single distro that offers as many different kernels as Mandriva (dozens).

I doubt whether flashing your BIOS will help - at worst, an outdated BIOS will prevent your machine from seeing all of the RAM but it shouldn't lead to kernel panics. Also: be very very careful flashing your BIOS - ASUS recently had a BIOS update that caused absolute mayhem on quite a few systems and there was NO way to undo it without RMAing!

HappyTux 06-22-2007 06:06 PM

Quote:

Originally Posted by rpg
I'm not entirely sure I understand. Why would mem=2048M be right instead of mem=6144M ?

You said you only wanted 2GB when testing, right? That should give it to you.
Quote:

Yes, I think there's a new BIOS to flash for my ASUS board. I'm going to try that, too, and see if that helps.
A good idea; that is usually the first thing anyone asks when you're looking for help.
Quote:

Wish I could do this w/o having to open up the machine, plug in the memory, test, pull out the memory, etc. Wish there was some way to "hide" the extra 4G and boot with only 2G without having to physically unplug it. That would make the test cycle faster... Any way to accomplish this?
See above. You might even want to try mem=4096M just for the hell of it. You still haven't mentioned how many sticks there are or what size they are.

rpg 06-22-2007 10:19 PM

Quote:

Originally Posted by HappyTux
You said you only wanted 2GB when testing, right? That should give it to you.

Ah! I see --- I didn't understand that this was the response to my question about how to test w/o moving the sticks in and out. The reason I was confused was that I had read something where someone had suggested telling the kernel this way about the full memory. Thank you very much for the advice.


Quote:

Originally Posted by HappyTux
See above. You might even want to try mem=4096M just for the hell of it. You still haven't mentioned how many sticks there are or what size they are.

There are two original 1G sticks and 2 new 2G sticks.

Some have suggested the mismatch was problematic, but in the past when I've had that problem, the problem manifested in the BIOS, not in the Linux kernel.

I have also had it suggested to me that the kernel might not be able to handle more than 4 GB. That seems odd for a 64-bit kernel, but what do I know....

Thanks for your help,

Best,
R

HappyTux 06-22-2007 11:14 PM

Quote:

Originally Posted by rpg
Ah! I see --- I didn't understand that this was the response to my question about how to test w/o moving the sticks in and out. The reason I was confused was that I had read something where someone had suggested telling the kernel this way about the full memory. Thank you very much for the advice.

It works both ways: to tell the kernel about memory it is not finding, or to cut back to the amount of memory you want to use.



Quote:

There are two original 1G sticks and 2 new 2G sticks.
Same speed and timings for both sets of RAM? If not, you may want to set them manually in your BIOS instead of leaving them on auto: use the higher of the two timings and the lower of the two speeds, i.e. the worst of either pair of sticks. You might want to do this anyway, even if they are identical. Oh, and sometimes RAM of different brands doesn't want to work together.

Quote:

Some have suggested the mismatch was problematic, but in the past when I've had that problem, the problem manifested in the BIOS, not in the Linux kernel.
Yeah, most times it won't see the new RAM or will refuse to POST.
Quote:

I have also had it suggested to me that the kernel might not be able to handle more than 4 GB. That seems odd for a 64-bit kernel, but what do I know....

Thanks for your help,

Best,
R
No, that would not be the problem; 64-bit can address way more than your machine will hold. You might want to try this with just the new 4GB in the machine to see if that helps. BTW, what is the hardware in the machine we are talking about here? If it is an AMD machine, those don't like having all four slots used; at least, in the three of them I owned, it never would work with four sticks in there. My Core 2 Duo, however, works just fine with 4 identical 1GB sticks in it.

HappyTux 06-22-2007 11:20 PM

You might also want to try a newer kernel if one is available; 2.6.17 is rather old at this point in time.

Lsatenstein 06-24-2007 12:00 AM

The PAE kernel is a version designed for Intel systems that can handle in excess of 4 gigs. Your motherboard must be able to do likewise.

The 64 bit kernel and i386 versions appear, because of chip hardware constraints, limited to 4 gigs.

jay73 06-24-2007 02:22 AM

I thought PAE was used for 32 bit kernels only?

rpg 06-24-2007 10:04 PM

Quote:

Originally Posted by HappyTux
No, that [i.e., the kernel not being able to address the full amount of memory...] would not be the problem; 64-bit can address way more than your machine will hold. You might want to try this with just the new 4GB in the machine to see if that helps. BTW, what is the hardware in the machine we are talking about here? If it is an AMD machine, those don't like having all four slots used; at least, in the three of them I owned, it never would work with four sticks in there. My Core 2 Duo, however, works just fine with 4 identical 1GB sticks in it.

It's a core 2 duo with an ASUS P5B deluxe motherboard. I pulled out the original two sticks and put in only the 2x 2Gb sticks. I got the same kernel panic.

Adding the mem=2048 didn't seem to change anything, but I confess that I'm not entirely sure how to pass arguments to the kernel. I have just been doing

failsafe mem=2048

and also

failsafe mem=2048,noapic,nolapic,pci=nommconf

I think this suggests that the problem is from one of two sources:
  1. This particular pair of sticks is the problem or
  2. The kernel is the problem

It's clearly not just a problem of having mismatched sticks (the mismatched sticks might be a problem, but they're clearly not the only problem).

I will see about building a newer kernel. I am going to try to stick to one of the Mandriva kernels, recompiled so that I control the options myself, at first, but will then try a newer one from kernel.org.

If anyone can suggest anything special I should do about building a kernel for this machine, I'd be very grateful. What's a PAE kernel?

HappyTux 06-25-2007 08:53 AM

Quote:

Originally Posted by rpg
It's a core 2 duo with an ASUS P5B deluxe motherboard. I pulled out the original two sticks and put in only the 2x 2Gb sticks. I got the same kernel panic.

Adding the mem=2048 didn't seem to change anything, but I confess that I'm not entirely sure how to pass arguments to the kernel. I have just been doing

failsafe mem=2048

and also

failsafe mem=2048,noapic,nolapic,pci=nommconf

I take it that would be after hitting "e" at the GRUB boot screen. If so, then that is how you do it, except that you want spaces instead of commas between the options in the second one.
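A small Python sketch of why the commas matter, assuming the boot line is split into separate parameters on whitespace only (the helper name split_boot_params is hypothetical, and this ignores quoting):

```python
def split_boot_params(cmdline):
    """Split a boot command line into individual parameters on whitespace."""
    return cmdline.split()


with_commas = "failsafe mem=2048,noapic,nolapic,pci=nommconf"
with_spaces = "failsafe mem=2048 noapic nolapic pci=nommconf"

# With commas, everything after the label is one long token that starts with
# "mem=", so the other options are never seen as separate parameters.
print(split_boot_params(with_commas))

# With spaces, each option arrives as its own parameter.
print(split_boot_params(with_spaces))
```

In the comma version the whole string after the label arrives as a single mem=... value, so noapic, nolapic and pci=nommconf would not take effect as options of their own.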
Quote:

I think this suggests that the problem is from one of two sources:
  1. This particular pair of sticks is the problem or
  2. The kernel is the problem

I doubt number 1: you say the RAM with 6GB passed memtest, so I can't see it. Number 2 is definitely the problem, I would say.

Quote:

It's clearly not just a problem of having mismatched sticks (the mismatched sticks might be a problem, but they're clearly not the only problem).

I will see about building a newer kernel. I am going to try to stick to one of the Mandriva kernels, recompiled so that I control the options myself, at first, but will then try a newer one from kernel.org.

If anyone can suggest anything special I should do about building a kernel for this machine, I'd be very grateful. What's a PAE kernel?
PAE is used by 32-bit kernels to get around the memory-mapping limitations that come with being 32-bit. It should not be a concern for your install, unless you have something like that turned on in your BIOS instead of the 64-bit memory option I mentioned earlier. The kernel upgrade is the way to go. 2.6.17 is old, and around that time, with 2.6.18, there were a lot of changes that added support for the family of chipsets used on that board (and on my board, for that matter). 2.6.18 is the bare minimum I need, or I can't install Linux on my machine, and it really needs 2.6.19/20 to get full support for everything on it. So try for .20 or better when getting your new kernel.
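The arithmetic behind the 32-bit limit and PAE can be sketched in a few lines of Python. The address widths used here are assumptions for illustration: 32 bits for a plain 32-bit kernel, 36 bits for classic PAE, and 48 bits as a typical x86_64 virtual address width. What a given machine can actually use also depends on the CPU, chipset and BIOS.

```python
GIB = 1024**3  # bytes in one GiB


def addressable_gib(address_bits):
    """Return how many GiB can be addressed with the given number of address bits."""
    return 2**address_bits // GIB


print(addressable_gib(32))  # 4       (plain 32-bit addressing)
print(addressable_gib(36))  # 64      (classic PAE, assumed 36-bit physical addresses)
print(addressable_gib(48))  # 262144  (assumed 48-bit x86_64 virtual addresses)
```

So a 64-bit kernel is not limited to 4GB by its address width, which is why PAE is only of interest on 32-bit installs.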

rpg 06-25-2007 09:50 AM

Progress report
 
Quote:

Originally Posted by HappyTux
I take it that would be after hitting "e" at the GRUB boot screen. If so, then that is how you do it, except that you want spaces instead of commas between the options in the second one.

Well, I have LILO instead of grub, but that sounds right...

By the way, when I have the two 1Gb sticks in the first pair of memory slots, I can now boot with mem=2048, which is a considerable relief to me! At least now I can work on this machine while getting it up to speed, and I don't need to be cracking the case open every time I need to test a possible fix!
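One way to confirm what a mem= limit actually did is to compare the boot line with the memory the running kernel reports. The following Python sketch assumes a Linux system exposing /proc/meminfo (with a MemTotal line in kB) and /proc/cmdline; the function names are made up for this example.

```python
def mem_total_mib(meminfo_text):
    """Return the MemTotal value from /proc/meminfo-style text, in MiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # second field is the size in kB
            return kib / 1024
    raise ValueError("no MemTotal line found")


def report_running_system():
    """Print the boot parameters and the memory total seen by the running kernel."""
    with open("/proc/cmdline") as f:
        print("booted with:", f.read().strip())
    with open("/proc/meminfo") as f:
        print(f"kernel sees about {mem_total_mib(f.read()):.0f} MiB")
```

Calling report_running_system() after a boot with a mem= limit should show a total somewhat below that limit, since the kernel reserves some memory for itself.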

Quote:

Originally Posted by HappyTux
I doubt number 1: you say the RAM with 6GB passed memtest, so I can't see it. Number 2 is definitely the problem, I would say.



PAE is used by 32-bit kernels to get around the memory-mapping limitations that come with being 32-bit. It should not be a concern for your install, unless you have something like that turned on in your BIOS instead of the 64-bit memory option I mentioned earlier. The kernel upgrade is the way to go. 2.6.17 is old, and around that time, with 2.6.18, there were a lot of changes that added support for the family of chipsets used on that board (and on my board, for that matter). 2.6.18 is the bare minimum I need, or I can't install Linux on my machine, and it really needs 2.6.19/20 to get full support for everything on it. So try for .20 or better when getting your new kernel.

That makes sense to me --- I believe that Mandriva had to patch this kernel to get it to work with the Core Duos (when I bought this machine, it was an attraction of Mandriva 2007.0 that it would handle the chips and mobo; other distros weren't able to at the time).

I will try to investigate means of getting a more modern kernel with this Mandriva install. If it's not too painful, I will move up to Mandriva 2007.1, but I will first try the kernel-tmb...


All times are GMT -5.