I am having a really bad time with these kernels; the system just locks up. This is the very first time I have seen anything like this in 9 years of using Linux.
I am wondering if anyone else has experienced this or has any answers.
Some background: the motherboard on my webserver failed after some 8 years of constant use. I replaced it with an MSI K9N6PGM2 and a single-core AMD Athlon 64 1640. I have 2 of these boards, both running the same kernel (2.6.26-2-amd64). The one in my personal box has been perfect with a dual core. The server has been a nightmare. The first crash is documented:
Quote:
Aug 11 12:17:25 webserver kernel: [362262.911207] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
Aug 11 12:17:25 webserver kernel: [362262.911207] IP: [<0000000000000000>]
Aug 11 12:17:25 webserver kernel: [362262.911207] PGD 39d30067 PUD 3a464067 PMD 0
Aug 11 12:17:25 webserver kernel: [362262.911207] Oops: 0010 [1] SMP
Aug 11 12:17:25 webserver kernel: [362262.911207] CPU 0
Aug 11 12:17:25 webserver kernel: [362262.911207] Modules linked in: xt_tcpudp ipt_ULOG ipv6 ipt_MASQUERADE iptable_nat dm_snapshot dm_mirror dm_log dm_mod f71882fg eeprom powernow_k8 freq_table i2c_dev usbhid hid ff_memless tulip nf_nat_snmp_basic xt_TCPMSS xt_owner ipt_ttl xt_mark sysv xt_state xt_multiport xt_limit iptable_mangle iptable_filter ipt_REJECT ipt_REDIRECT nf_nat nf_conntrack_ipv4 xt_MARK ipt_LOG ip_tables x_tables nf_conntrack_ftp nf_conntrack msdos fat nls_base minix binfmt_misc dummy serio_raw i2c_nforce2 snd_hda_intel button i2c_core psmouse snd_pcm snd_timer snd soundcore snd_page_alloc k8temp evdev floppy pcspkr ext3 jbd mbcache ide_disk ata_generic sata_nv libata scsi_mod dock forcedeth amd74xx ide_core sundance mii ehci_hcd ohci_hcd thermal processor fan thermal_sys
Aug 11 12:17:25 webserver kernel: [362262.911207] Pid: 3188, comm: ntpd Not tainted 2.6.26-2-amd64 #1
Aug 11 12:17:25 webserver kernel: [362262.911207] RIP: 0010:[<0000000000000000>] [<0000000000000000>]
Aug 11 12:17:25 webserver kernel: [362262.911207] RSP: 0018:ffff810039d3be50 EFLAGS: 00210246
Aug 11 12:17:25 webserver kernel: [362262.911207] RAX: ffff810039d33ff9 RBX: ffff810039fd9260 RCX: ffff8100399b52a0
Aug 11 12:17:25 webserver kernel: [362262.911207] RDX: f8e027544085ad69 RSI: 0000000000000050 RDI: ffff810039d3be74
Aug 11 12:17:25 webserver kernel: [362262.911207] RBP: ffff810039d3be58 R08: 0000000000000000 R09: 0000000000000063
Aug 11 12:17:25 webserver kernel: [362262.911207] R10: ffff810039d3a000 R11: ffffffff80224de4 R12: 00000000ffe51ebc
Aug 11 12:17:25 webserver kernel: [362262.911207] R13: 0000000000000000 R14: ffff810039d3bf58 R15: ffff810039d3bf34
Aug 11 12:17:25 webserver kernel: [362262.911207] FS: 0000000000000000(0003) GS:ffffffff8053b000(0063) knlGS:00000000f7c87920
Aug 11 12:17:25 webserver kernel: [362262.911207] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
Aug 11 12:17:25 webserver kernel: [362262.911207] CR2: 0000000000000000 CR3: 00000000399ab000 CR4: 00000000000006e0
Aug 11 12:17:25 webserver kernel: [362262.911207] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 11 12:17:25 webserver kernel: [362262.911207] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 11 12:17:25 webserver kernel: [362262.911207] Process ntpd (pid: 3188, threadinfo ffff810039d3a000, task ffff810039fd9260)
Aug 11 12:17:25 webserver kernel: [362262.911207] Stack: ffff810039d3beb0 80213299ffff037f 00000000ffffffff 0000000000000010
Aug 11 12:17:25 webserver kernel: [362262.911207] 000000000000002b 0000000000000000 0000000000000000 0000000000000000
Aug 11 12:17:25 webserver kernel: [362262.911207] f40c9af378000000 bb4248003fff8bce 00003fff8dc6c17b 0000000000000000
Aug 11 12:17:25 webserver kernel: [362262.911207] Call Trace:
Aug 11 12:17:25 webserver kernel: [362262.911207] [<ffffffff80226881>] ? ia32_restore_sigcontext+0x1a8/0x20f
Aug 11 12:17:25 webserver kernel: [362262.911207] [<ffffffff802270f3>] ? sys32_sigreturn+0xcd/0xf2
Aug 11 12:17:25 webserver kernel: [362262.911207] [<ffffffff802250f5>] ? ia32_ptregs_common+0x25/0x4c
Aug 11 12:17:25 webserver kernel: [362262.911207]
Aug 11 12:17:25 webserver kernel: [362262.911207]
Aug 11 12:17:25 webserver kernel: [362262.911207] Code: Bad RIP value.
Aug 11 12:17:25 webserver kernel: [362262.911207] RIP [<0000000000000000>]
Aug 11 12:17:25 webserver kernel: [362262.911207] RSP <ffff810039d3be50>
Aug 11 12:17:25 webserver kernel: [362262.911207] CR2: 0000000000000000
Aug 11 12:17:25 webserver kernel: [362262.922814] ---[ end trace a9bbc159d44ae3c5 ]---
Whether this is due to the 32-bit ntpd I have no idea. Since then the system has seldom stayed up for more than 6 hours. No further errors are logged; it just stops working. The only sign that the system has crashed is that it is madly pinging the ADSL router it is connected to.
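For anyone wanting to check which of their binaries are 32-bit on an amd64 kernel, here is a minimal sketch. It reads only the ELF identification bytes (byte 4 is the class: 1 for 32-bit, 2 for 64-bit). The function name `elf_class` is my own, and the paths you pass it (for example wherever ntpd or iptables live on your system) are up to you.

```python
import sys

ELF_MAGIC = b"\x7fELF"


def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if the file is not ELF."""
    with open(path, "rb") as f:
        ident = f.read(5)
    if len(ident) < 5 or ident[:4] != ELF_MAGIC:
        return None
    # Byte 4 of the ELF header (EI_CLASS): 1 = 32-bit, 2 = 64-bit.
    return {1: 32, 2: 64}.get(ident[4])


if __name__ == "__main__":
    # Usage: python elfclass.py /path/to/binary [more paths...]
    for path in sys.argv[1:]:
        print(path, elf_class(path))
```

A 32-bit result for ntpd or iptables would confirm that 32-bit userspace is running on the 64-bit kernel; it does not by itself prove that this is the cause of the lockups.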
The other peculiar thing is that the iptables limit module is reporting an error:
Quote:
Aug 15 22:15:25 webserver kernel: ip_tables: limit match: invalid size 40 != 28
According to one of the firewall authors, this is a (32-bit) error in the kernel module.
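The two numbers in that message are what you would get if a 32-bit iptables binary and a 64-bit kernel each compiled the same structure. Here is a rough sketch of the arithmetic. The field list is an assumption on my part, based on how I remember the limit match data (`struct xt_rateinfo`) in 2.6-era headers: two 32-bit values, an `unsigned long`, three more 32-bit values, and a pointer. `struct_size` and `RATEINFO` are hypothetical names for this illustration only.

```python
def struct_size(fields, word):
    """Size of a C struct under natural alignment.

    fields: list of field sizes in bytes; "W" stands for a machine word
            (unsigned long or pointer), whose size is `word` bytes.
    """
    offset = 0
    max_align = 1
    for field in fields:
        size = word if field == "W" else field
        align = size  # natural alignment: each field aligned to its own size
        offset = (offset + align - 1) // align * align  # pad before the field
        offset += size
        max_align = max(max_align, align)
    # Pad the tail so the total is a multiple of the strictest alignment.
    return (offset + max_align - 1) // max_align * max_align


# ASSUMED layout: avg, burst (u32); prev (unsigned long);
# credit, credit_cap, cost (u32); master (pointer).
RATEINFO = [4, 4, "W", 4, 4, 4, "W"]

print("64-bit:", struct_size(RATEINFO, 8))  # 40
print("32-bit:", struct_size(RATEINFO, 4))  # 28
```

If that layout is right, "invalid size 40 != 28" is simply the 64-bit kernel rejecting a structure handed over by 32-bit userspace, which fits what the firewall author said.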
I have now reverted to a 686 kernel, which has been running for 12 hours with no problems.
A Google search indicates that most people having problems are using Debian or Ubuntu. Is this a problem specifically related to our kernels?