LinuxQuestions.org


sampoo55 08-16-2008 03:52 AM

Kernel Panic - not syncing: Aiee, killing interrupt handler
 
Hi folks,

For a few months now I've been facing kernel panics every few days on an Ubuntu Server 8.04 installation running on a remote server. After installing and trying several other kernel versions without success, I moved to Debian itself. Two days ago I reinstalled the whole system using the latest Debian Stable, but restored the /var and /home directories and based some of the /etc configuration on what I previously had. Everything ran smoothly until yesterday, when I got a kernel Oops message while logged in to the terminal. I more or less ignored it since the system still ran fine. This morning, though, the system seems to have frozen (I cannot log in, and neither the website nor the mail server responds), which is exactly what happened when it hit a kernel panic, so I'm guessing it has run into one again. I cannot reboot the system until Monday. (On the Ubuntu installation I set the kernel.panic variable in /etc/sysctl.conf, but it had no effect. Why is that?)

I've not been able to isolate the problem yet, but I managed to take two pictures of the screen (I know it's not much) from when the server ran Ubuntu. I'm guessing the same underlying problem is causing this trouble over and over again.

First Panic
Second Panic

I have no clue where to look or where to start in solving this very annoying problem (it's a webserver, so people depend on it).

Thanks in advance :)
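
On the kernel.panic question above: that setting only controls how many seconds the kernel waits before rebooting after a full panic, and values in /etc/sysctl.conf take effect only once they are loaded (at boot or via `sysctl -p`). An Oops on its own does not count as a panic unless kernel.panic_on_oops is set to 1. Whether either point explains the behaviour on this server is a guess; the thread does not say. A minimal Python sketch for checking the live values, assuming the standard /proc/sys layout (the helper names `read_sysctl` and `explain` are made up for illustration):

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return the live integer value of a sysctl such as 'kernel.panic'.

    Returns None if the file is missing, unreadable, or not an integer.
    """
    path = Path(root).joinpath(*name.split("."))
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def explain(panic, panic_on_oops):
    """Describe the two settings in plain language (illustrative helper)."""
    notes = []
    if panic is None:
        notes.append("kernel.panic could not be read")
    elif panic > 0:
        notes.append(f"kernel.panic={panic}: reboot {panic} s after a panic")
    elif panic == 0:
        notes.append("kernel.panic=0: stays halted after a panic, no auto-reboot")
    else:
        notes.append(f"kernel.panic={panic}: check the kernel docs for this value")

    if panic_on_oops is None:
        notes.append("kernel.panic_on_oops could not be read")
    elif panic_on_oops:
        notes.append("kernel.panic_on_oops=1: an Oops is escalated to a panic")
    else:
        notes.append("kernel.panic_on_oops=0: an Oops alone does not trigger a panic")
    return notes


if __name__ == "__main__":
    # Compare what the running kernel reports with what /etc/sysctl.conf asks for.
    for note in explain(read_sysctl("kernel.panic"),
                        read_sysctl("kernel.panic_on_oops")):
        print(note)
```

If the printed kernel.panic value differs from the one in /etc/sysctl.conf, the file was probably never loaded into the running kernel.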

vharishankar 08-16-2008 04:00 AM

From the error messages, this definitely looks like a hardware-related problem. Try to isolate the component that is giving you trouble. I'd start with the RAM.

You normally won't get kernel panics like this from ordinary software-related issues.

sampoo55 08-16-2008 04:02 AM

Just because I'm curious: how can you see that?

About a month ago I ran memtest86 for about half an hour and it didn't report any errors... What other tools could you suggest?

vharishankar 08-16-2008 04:09 AM

Quote:

Originally Posted by sampoo55 (Post 3249427)
Just because I'm curious: how can you see that?

About a month ago I ran memtest86 for about half an hour and it didn't report any errors... What other tools could you suggest?

I'm not sure about memtest86 as I've not used it myself. Also it might not necessarily be the memory. Since I don't have the hardware with me, I cannot say what the problem is.

In my experience, kernel panics have always come down to one of two things:

a. A badly corrupted installation of Linux (where the file system or essential system files are trashed)
b. A hardware-related error

sampoo55 08-16-2008 09:31 AM

I managed to get the new logs for the latest kernel panic. It's as follows:

Aug 15 16:39:51 vgkfgen1 kernel: Unable to handle kernel NULL pointer dereference at 0000000000000050 RIP:
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff802aa2b7>] __dec_zone_page_state+0x1b/0x6c
Aug 15 16:39:51 vgkfgen1 kernel: PGD 1c254067 PUD 29511067 PMD 0
Aug 15 16:39:51 vgkfgen1 kernel: Oops: 0000 [1] SMP
Aug 15 16:39:51 vgkfgen1 kernel: CPU 0
Aug 15 16:39:51 vgkfgen1 kernel: Modules linked in: it87 hwmon_vid i2c_isa eeprom i2c_dev tcp_diag inet_diag nfs nfsd exportfs lockd nfs_acl sunrpc ipv6 button ac battery dm_snapshot dm_mirror dm_mod loop serio_raw i2c_piix4 snd_hda_intel snd_hda_codec parport_pc parport i2c_core pcspkr snd_pcm snd_timer snd soundcore psmouse shpchp pci_hotplug snd_page_alloc evdev ext3 jbd mbcache ide_cd cdrom sd_mod atiixp ehci_hcd generic ide_core r8169 ahci libata scsi_mod ohci_hcd thermal processor fan
Aug 15 16:39:51 vgkfgen1 kernel: Pid: 177, comm: pdflush Not tainted 2.6.18-6-amd64 #1
Aug 15 16:39:51 vgkfgen1 kernel: RIP: 0010:[<ffffffff802aa2b7>] [<ffffffff802aa2b7>] __dec_zone_page_state+0x1b/0x6c
Aug 15 16:39:51 vgkfgen1 kernel: RSP: 0018:ffff810037b0bc38 EFLAGS: 00010016
Aug 15 16:39:51 vgkfgen1 kernel: RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000001
Aug 15 16:39:51 vgkfgen1 kernel: RDX: ffff8100370bc848 RSI: 0000000000000005 RDI: ffff8100006d84c8
Aug 15 16:39:51 vgkfgen1 kernel: RBP: ffff810037b0be70 R08: ffff810036520ae0 R09: 0000000000000000
Aug 15 16:39:51 vgkfgen1 kernel: R10: ffff8100223139d0 R11: 0000000000000001 R12: ffff8100370bc848
Aug 15 16:39:51 vgkfgen1 kernel: R13: 0000000000000002 R14: ffff810036bb47f0 R15: 0000000000000000
Aug 15 16:39:51 vgkfgen1 kernel: FS: 00002b361c863f60(0000) GS:ffffffff80520000(0000) knlGS:0000000000000000
Aug 15 16:39:51 vgkfgen1 kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Aug 15 16:39:51 vgkfgen1 kernel: CR2: 0000000000000050 CR3: 000000000a1da000 CR4: 00000000000006e0
Aug 15 16:39:51 vgkfgen1 kernel: Process pdflush (pid: 177, threadinfo ffff810037b0a000, task ffff810037ae6770)
Aug 15 16:39:51 vgkfgen1 kernel: Stack: ffffffff802aa4c4 ffff8100006d84c8 ffffffff8022982d ffff810036bb47f0
Aug 15 16:39:51 vgkfgen1 kernel: ffffffff8021ac6d 0000000000000000 0000000e00000000 0000000000000000
Aug 15 16:39:51 vgkfgen1 kernel: ffffffff880ed283 ffffffffffffffff 0000000000000002 000000000000000e
Aug 15 16:39:51 vgkfgen1 kernel: Call Trace:
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff802aa4c4>] dec_zone_page_state+0x9/0xd
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff8022982d>] clear_page_dirty_for_io+0x45/0x57
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff8021ac6d>] mpage_writepages+0x183/0x34d
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff880ed283>] :ext3:ext3_ordered_writepage+0x0/0x198
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff80256452>] do_writepages+0x29/0x2f
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff8022dc59>] __writeback_single_inode+0x1b4/0x38b
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff8021ede0>] sync_sb_inodes+0x1d1/0x2b5
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff8028f8fc>] keventd_create_kthread+0x0/0x61
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff8024c66e>] writeback_inodes+0x7d/0xd3
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff802a803c>] background_writeout+0x82/0xb5
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff802520cf>] pdflush+0x0/0x1ed
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff80252212>] pdflush+0x143/0x1ed
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff802a7fba>] background_writeout+0x0/0xb5
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff802305dc>] kthread+0xd4/0x107
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff80258aa0>] child_rip+0xa/0x12
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff8028f8fc>] keventd_create_kthread+0x0/0x61
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff8026ea6b>] physflat_send_IPI_mask+0x0/0x6a
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff80230508>] kthread+0x0/0x107
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff80258a96>] child_rip+0x0/0x12
Aug 15 16:39:51 vgkfgen1 kernel:
Aug 15 16:39:51 vgkfgen1 kernel:
Aug 15 16:39:51 vgkfgen1 kernel: Code: 49 8b 54 c1 50 4c 8d 44 32 41 41 8a 00 ff c8 41 88 00 8a 52
Aug 15 16:39:51 vgkfgen1 kernel: RIP [<ffffffff802aa2b7>] __dec_zone_page_state+0x1b/0x6c
Aug 15 16:39:51 vgkfgen1 kernel: RSP <ffff810037b0bc38>
Aug 15 16:39:51 vgkfgen1 kernel: CR2: 0000000000000050


Can you make out any more information from that? Thanks :)
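
For anyone reading a trace like this, the key fields can be pulled out mechanically: the faulting address, the process that was running, the function named on the RIP line, and the call chain. The Python sketch below is only an illustration that assumes the 2.6-era x86_64 Oops layout shown in this post; `parse_oops` and the regular expressions are my own and not part of any kernel tool. A fault address as small as 0x50 usually means a structure member at that offset was read through a NULL pointer, but that is a general rule of thumb, not a diagnosis of this machine.

```python
import re

# Strips a classic syslog prefix such as "Aug 15 16:39:51 vgkfgen1 kernel: ".
SYSLOG_PREFIX = re.compile(r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+kernel:\s?")
# Matches one symbolised frame: "[<address>] symbol+0xoffset/0xsize".
FRAME = re.compile(r"\[<([0-9a-f]+)>\]\s+(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+")

# A few lines copied from the log in this post.
SAMPLE = """\
Aug 15 16:39:51 vgkfgen1 kernel: Unable to handle kernel NULL pointer dereference at 0000000000000050 RIP:
Aug 15 16:39:51 vgkfgen1 kernel: Pid: 177, comm: pdflush Not tainted 2.6.18-6-amd64 #1
Aug 15 16:39:51 vgkfgen1 kernel: RIP: 0010:[<ffffffff802aa2b7>] [<ffffffff802aa2b7>] __dec_zone_page_state+0x1b/0x6c
Aug 15 16:39:51 vgkfgen1 kernel: Call Trace:
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff802aa4c4>] dec_zone_page_state+0x9/0xd
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff8022982d>] clear_page_dirty_for_io+0x45/0x57
Aug 15 16:39:51 vgkfgen1 kernel: [<ffffffff880ed283>] :ext3:ext3_ordered_writepage+0x0/0x198
Aug 15 16:39:51 vgkfgen1 kernel:
"""


def parse_oops(text):
    """Extract the main fields from a syslog-formatted kernel Oops."""
    info = {"fault_address": None, "pid": None, "comm": None,
            "kernel": None, "faulting_function": None, "call_trace": []}
    in_trace = False
    for raw in text.splitlines():
        line = SYSLOG_PREFIX.sub("", raw).strip()
        if in_trace:
            frame = FRAME.match(line)
            if frame:
                info["call_trace"].append(frame.group(2))
                continue
            in_trace = False  # the first non-frame line ends the trace
        if line == "Call Trace:":
            in_trace = True
            continue
        m = re.search(r"dereference at ([0-9a-f]+)", line)
        if m:
            info["fault_address"] = int(m.group(1), 16)
            continue
        m = re.match(r"Pid: (\d+), comm: (\S+).*?(\S+) #\d+", line)
        if m:
            info["pid"] = int(m.group(1))
            info["comm"] = m.group(2)
            info["kernel"] = m.group(3)
            continue
        if line.startswith("RIP"):
            m = FRAME.search(line)
            if m:
                info["faulting_function"] = m.group(2)
    return info


if __name__ == "__main__":
    for key, value in parse_oops(SAMPLE).items():
        print(f"{key}: {value}")
```

Fed the log above, this picks out pdflush (pid 177) faulting in __dec_zone_page_state, reached through the ext3 writeback path listed in the call trace.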

r3sistance 08-16-2008 10:15 PM

Everything in that log before the Code line is pretty much meaningless; what gets logged before the kernel panic is mostly things that happened successfully (ignoring the NULL pointer dereference, which I'm fairly sure is not related to the problem at hand). I would check both your hardware and BIOS. You could also see if you can get into single-user mode; if you can, you might be able to get at any logs the kernel may be producing just before it panics.


All times are GMT -5.