LinuxQuestions.org
Old 08-08-2004, 07:34 PM   #1
Donboy
Member
 
Registered: Aug 2003
Location: Little Rock, Arkansas
Distribution: RH, Fedora, Suse, AIX
Posts: 736

Rep: Reputation: 31
kernel panic Aiee, killing interrupt handler


I'm having a problem doing rsync to another machine. Here's the setup...

I've got my first machine with all of my data. I've got a second machine with a 160gig drive mounted as a slave. I've got the 160gig mounted over NFS to the first machine. So what I'm doing is... every night via cron, I'm running rsync on the first machine and it's copying data to the 160gig drive that is mounted on the first via NFS.

When I run it normally during the week, there is not a lot of data being moved and usually everything works fine. However, on the weekends I'm doing a bigger backup and it's throwing kernel errors on the machine that holds the 160gig drive. When the error happens, it completely freezes the backup machine and I have to reboot to get everything back to normal. Thankfully the cron job just pauses while the backup machine is frozen, and when it comes back online the backup continues to run. So the backups are getting done, but I still hate rebooting and I'm sure the kernel errors need to be fixed.

When I boot up, I have been running fsck on the 160gig and the check seems to pass without any problems.

Here's the error...

Quote:
Aug 8 04:32:24 xwing kernel: Unable to handle kernel paging request at virtual address 6a63c0c4
Aug 8 04:32:24 xwing kernel: printing eip:
Aug 8 04:32:24 xwing kernel: c0145119
Aug 8 04:32:24 xwing kernel: *pde = 00000000
Aug 8 04:32:24 xwing kernel: Oops: 0002
Aug 8 04:32:24 xwing kernel: nfsd lockd sunrpc autofs via-rhine mii sg scsi_mod keybdev mousedev hid input usb-uhci usbcore ext3 jbd
Aug 8 04:32:24 xwing kernel: CPU: 0
Aug 8 04:32:24 xwing kernel: EIP: 0060:[<c0145119>] Not tainted
Aug 8 04:32:24 xwing kernel: EFLAGS: 00010206
Aug 8 04:32:24 xwing kernel:
Aug 8 04:32:24 xwing kernel: EIP is at get_unused_buffer_head [kernel] 0x49 (2.4.22-1.2197.nptl)
Aug 8 04:32:24 xwing kernel: eax: 6a63c0c0 ebx: 00000000 ecx: c3522000 edx: cfe25300
Aug 8 04:32:24 xwing kernel: esi: 00000000 edi: 00001000 ebp: 00000001 esp: dc64dccc
Aug 8 04:32:24 xwing kernel: ds: 0068 es: 0068 ss: 0068
Aug 8 04:32:24 xwing kernel: Process nfsd (pid: 2491, stackpage=dc64d000)
Aug 8 04:32:24 xwing kernel: Stack: c15abeac 000000f0 c01451b8 00000001 d800dd40 c15b3af4 00000341 c14b30b0
Aug 8 04:32:24 xwing kernel: dde0e840 dde0e840 c0145435 c14b30b0 00001000 00000001 c14b30b0 c14b30b0
Aug 8 04:32:24 xwing kernel: c01459b5 c14b30b0 00000341 00001000 0000001c 00000000 d9103000 dc64dd38
Aug 8 04:32:24 xwing kernel: Call Trace: [<c01451b8>] create_buffers [kernel] 0x28 (0xdc64dcd4)
Aug 8 04:32:24 xwing kernel: [<c0145435>] create_empty_buffers [kernel] 0x25 (0xdc64dcf4)
Aug 8 04:32:24 xwing kernel: [<c01459b5>] __block_prepare_write [kernel] 0x2d5 (0xdc64dd0c)
Aug 8 04:32:24 xwing kernel: [<de80d36b>] new_handle [jbd] 0x2b (0xdc64dd34)
Aug 8 04:32:24 xwing kernel: [<c0146169>] block_prepare_write [kernel] 0x39 (0xdc64dd50)
Aug 8 04:32:24 xwing kernel: [<de81f540>] ext3_get_block [ext3] 0x0 (0xdc64dd64)
Aug 8 04:32:24 xwing kernel: [<de81faf3>] ext3_prepare_write [ext3] 0xa3 (0xdc64dd70)
Aug 8 04:32:24 xwing kernel: [<de81f540>] ext3_get_block [ext3] 0x0 (0xdc64dd80)
Aug 8 04:32:24 xwing kernel: [<c0132985>] add_to_page_cache_unique [kernel] 0x45 (0xdc64dd8c)
Aug 8 04:32:24 xwing kernel: [<c0135a13>] do_generic_file_write [kernel] 0x223 (0xdc64dda0)
Aug 8 04:32:24 xwing kernel: [<c0135fe6>] generic_file_write [kernel] 0x136 (0xdc64ddf0)
Aug 8 04:32:24 xwing kernel: [<de81cfe9>] ext3_file_write [ext3] 0x39 (0xdc64de1c)
Aug 8 04:32:24 xwing kernel: [<de92dccf>] nfsd_write [nfsd] 0x14f (0xdc64de3c)
Aug 8 04:32:24 xwing kernel: [<c01184e0>] recalc_task_prio [kernel] 0x90 (0xdc64de84)
Aug 8 04:32:24 xwing kernel: [<de82be80>] ext3_file_operations [ext3] 0x0 (0xdc64dea4)
Aug 8 04:32:24 xwing kernel: [<de90b67f>] svc_sock_enqueue [sunrpc] 0x1bf (0xdc64df00)
Aug 8 04:32:24 xwing kernel: [<de933a28>] nfsd3_proc_write [nfsd] 0xa8 (0xdc64df14)
Aug 8 04:32:24 xwing kernel: [<de93bb3c>] nfsd_procedures3 [nfsd] 0xfc (0xdc64df40)
Aug 8 04:32:24 xwing kernel: [<de9295ce>] nfsd_dispatch [nfsd] 0xce (0xdc64df4c)
Aug 8 04:32:24 xwing kernel: [<de93b378>] nfsd_version3 [nfsd] 0x0 (0xdc64df60)
Aug 8 04:32:24 xwing kernel: [<de929500>] nfsd_dispatch [nfsd] 0x0 (0xdc64df64)
Aug 8 04:32:24 xwing kernel: [<de90b37f>] svc_process_R2466cc14 [sunrpc] 0x44f (0xdc64df68)
Aug 8 04:32:24 xwing kernel: [<de93bb3c>] nfsd_procedures3 [nfsd] 0xfc (0xdc64df88)
Aug 8 04:32:24 xwing kernel: [<de93b398>] nfsd_program [nfsd] 0x0 (0xdc64df8c)
Aug 8 04:32:24 xwing kernel: [<de9293a2>] nfsd [nfsd] 0x182 (0xdc64dfa8)
Aug 8 04:32:24 xwing kernel: [<de929220>] nfsd [nfsd] 0x0 (0xdc64dfe0)
Aug 8 04:32:24 xwing kernel: [<c010719d>] kernel_thread_helper [kernel] 0x5 (0xdc64dff0)
Aug 8 04:32:24 xwing kernel:
Aug 8 04:32:24 xwing kernel:
Aug 8 04:32:24 xwing kernel: Code: c7 40 04 ff ff ff ff c7 40 28 00 00 00 00 eb cd 8b 44 24 0c
This is all that shows up in /var/log/messages. The message is also echoed at the terminal, where it ends with...

Quote:
<0> kernel panic aiee killing interrupt handler. In interrupt handler - not syncing.
Note that I'm paraphrasing the above error because it didn't actually show in the logs, but it shows on the terminal.
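
For anyone reading the trace, a few of the numbers in the oops can be cross-checked by hand. What follows is only a rough sketch: it assumes the usual i386 meaning of the low three bits of the page-fault error code, it decodes just the first instruction on the "Code:" line, and the function names are made up for illustration.

```python
# Values copied from the oops above.
FAULT_ADDR = 0x6A63C0C4   # "virtual address 6a63c0c4"
EAX = 0x6A63C0C0          # "eax: 6a63c0c0"
ERROR_CODE = 0x0002       # "Oops: 0002"
CODE_BYTES = bytes.fromhex("c7 40 04 ff ff ff ff")  # start of the "Code:" line


def decode_error_code(code):
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


def decode_mov_imm32_disp8(code):
    """Decode only the form seen here: opcode C7 /0 with an 8-bit displacement.

    Returns (base register name, displacement, immediate value).
    """
    regs = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
    opcode, modrm = code[0], code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod == 1 means "register + disp8"; rm == 4 would mean a SIB byte follows.
    if opcode != 0xC7 or mod != 1 or reg != 0 or rm == 4:
        raise ValueError("not a 'movl $imm32, disp8(%reg)' instruction")
    disp = code[2] - 256 if code[2] > 127 else code[2]  # disp8 is signed
    imm = int.from_bytes(code[3:7], "little")
    return regs[rm], disp, imm


base, disp, imm = decode_mov_imm32_disp8(CODE_BYTES)
print(decode_error_code(ERROR_CODE))
print("store of %#x to %s+%d" % (imm, base, disp))
print("eax + disp matches fault address:", EAX + disp == FAULT_ADDR)
```

If those assumptions hold, this reads as a kernel-mode write to a page that is not present, at the address eax+4, so the value in eax was not a usable kernel pointer at that moment. It doesn't say how the bad value got there.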

My kernel is 2.4.22-1.2197 nptl and my OS is FC1.

Any ideas where to begin? Should I join the kernel mailing list and post it there? I hesitate to do that because I have done similar stuff in the past and unless you research your problem very well before posting, usually the list members will chew you a new arse, so I think I need to get my ducks in a row before I go posting on some list.

Thanks in advance.
 
Old 08-08-2004, 08:32 PM   #2
btmiller
Senior Member
 
Registered: May 2004
Location: In the DC 'burbs
Distribution: Arch, Scientific Linux, Debian, Ubuntu
Posts: 4,290

Rep: Reputation: 378
From the paging request error message it looks like the kernel might be having trouble finding enough memory to complete the copy operation. This is particularly interesting since create_buffers and create_empty_buffers are the top entries on your call trace. How much RAM do you have on the machine and how much swap? Maybe you could try running a big rsync load and watch free and vmstat to see if you're taxing your memory beyond what it can take.
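
If watching free and vmstat by eye is awkward during a long run, something like the following could log the same numbers to a file that survives a freeze. It's a minimal sketch, assuming Python 3 is available on the box and that /proc/meminfo contains lines shaped like "MemFree: 91096 kB"; lines in any other shape are skipped. The function names, interval and sample count are placeholders.

```python
import time


def parse_meminfo(text):
    """Return {field: kB} for lines shaped like 'MemFree:   91096 kB'."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only 'Name: <integer> kB'; ignore headers and other shapes.
        if sep and len(parts) == 2 and parts[0].isdigit() and parts[1] == "kB":
            fields[name.strip()] = int(parts[0])
    return fields


def watch(interval=5, samples=12, path="/proc/meminfo"):
    """Print one timestamped line per sample, flushed immediately."""
    for _ in range(samples):
        with open(path) as f:
            m = parse_meminfo(f.read())
        print(time.strftime("%H:%M:%S"),
              "free=%d buffers=%d cached=%d swapfree=%d" % (
                  m.get("MemFree", 0), m.get("Buffers", 0),
                  m.get("Cached", 0), m.get("SwapFree", 0)),
              flush=True)
        time.sleep(interval)
```

Calling watch() with its output redirected to a file on a different disk (or another machine) while the big rsync runs should leave a record of the last few samples before any lockup.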
 
Old 08-08-2004, 08:58 PM   #3
Donboy
Member
 
Registered: Aug 2003
Location: Little Rock, Arkansas
Distribution: RH, Fedora, Suse, AIX
Posts: 736

Original Poster
Rep: Reputation: 31
Hey! Excellent ideas! I will certainly try that.

Here's what I've got...

I have a single stick of 512MB PC-133 (32x8) 64X64. The item lists as "Generic Low Density". When I boot up, I get this in dmesg:

479MB LOWMEM available.
Memory: 481904k/491456k available (1482k kernel code, 9164k reserved, 1110k data, 136k init, 0k highmem)

Here's my "free" output...

Quote:
[root@xwing root]# free
             total       used       free     shared    buffers     cached
Mem:        482200     391104      91096          0      76456     220844
-/+ buffers/cache:      93804     388396
Swap:       979924          8     979916
As you can see, I've got a little under a gig of swap allocated and nearly none of it being used. I'm basically not doing much at all on the system. rsync is running on the sending machine and this machine is just doing the receiving, so it isn't working too hard. ;)
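
To make the "free" numbers easier to read: the "-/+ buffers/cache" row is just the Mem row with buffers and cache counted as free instead of used. A quick check of that arithmetic, using only the figures quoted above (the variable names are mine):

```python
# Numbers (in kB) copied from the "free" output above.
total, used, free = 482200, 391104, 91096
buffers, cached = 76456, 220844
swap_total, swap_used = 979924, 8

# Move reclaimable buffers/cache from the "used" column to the "free" column.
used_by_programs = used - buffers - cached
free_with_cache = free + buffers + cached

print("used by programs:", used_by_programs, "kB")
print("free incl. buffers/cache:", free_with_cache, "kB")
print("share of RAM used by programs: %.1f%%" % (100 * used_by_programs / total))
```

So at the moment of that snapshot only about a fifth of RAM was held by programs; the rest was free or cache. That says nothing about what memory looks like in the middle of the big weekend run, though.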

Here's my dmesg. Doesn't seem to be anything that looks bad (out of the ordinary) but maybe you'll see something I don't.

Quote:
Linux version 2.4.22-1.2115.nptl (bhcompile@bugs.devel.redhat.com) (gcc version 3.2.3 20030422 (Red Hat Linux 3.2.3-6)) #1 Wed Oct 29 15:31:21 EST 2003
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000cc000 - 00000000000d0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001dff0000 (usable)
BIOS-e820: 000000001dff0000 - 000000001dff8000 (ACPI data)
BIOS-e820: 000000001dff8000 - 000000001e000000 (ACPI NVS)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
0MB HIGHMEM available.
479MB LOWMEM available.
ACPI: have wakeup address 0xc0001000
On node 0 totalpages: 122864
zone(0): 4096 pages.
zone(1): 118768 pages.
zone(2): 0 pages.
ACPI: RSDP (v000 AMI ) @ 0x000fafe0
ACPI: RSDT (v001 AMIINT VIA_K7 0x00000010 MSFT 0x00000097) @ 0x1dff0000
ACPI: FADT (v001 AMIINT VIA_K7 0x00000011 MSFT 0x00000097) @ 0x1dff0030
ACPI: MADT (v001 AMIINT VIA_K7 0x00000009 MSFT 0x00000097) @ 0x1dff00c0
ACPI: DSDT (v001 VIA VIA_K7 0x00001000 MSFT 0x0100000d) @ 0x00000000
Kernel command line: ro root=LABEL=/
Initializing CPU#0
Detected 1470.042 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 2929.45 BogoMIPS
Memory: 481904k/491456k available (1482k kernel code, 9164k reserved, 1110k data, 136k init, 0k highmem)
Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
Inode cache hash table entries: 32768 (order: 6, 262144 bytes)
Mount cache hash table entries: 512 (order: 0, 4096 bytes)
Buffer cache hash table entries: 32768 (order: 5, 131072 bytes)
Page-cache hash table entries: 131072 (order: 7, 524288 bytes)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After generic, caps: 0383fbff c1cbfbff 00000000 00000000
CPU: Common caps: 0383fbff c1cbfbff 00000000 00000000
CPU: AMD Athlon(tm) XP 1700+ stepping 02
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.40 (20010327) Richard Gooch (rgooch@atnf.csiro.au)
mtrr: detected mtrr type: Intel
ACPI: Subsystem revision 20031002
ACPI: Interpreter disabled.
PCI: PCI BIOS revision 2.10 entry at 0xfdb41, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI: Using IRQ router VIA [1106/0686] at 00:07.0
Applying VIA southbridge workaround.
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
apm: BIOS not found.
Starting kswapd
VFS: Disk quotas vdquot_6.5.1
pty: 2048 Unix98 ptys configured
Serial driver version 5.05c (2001-07-08) with MANY_PORTS MULTIPORT SHARE_IRQ SERIAL_PCI ISAPNP enabled
Real Time Clock Driver v1.10e
NET4: Frame Diverter 0.46
RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00beta4-2.4
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 00:07.1
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci00:07.1
ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:pio, hdd:pio
hda: IC35L060AVV207-0, ATA DISK drive
hdb: WDC WD1600JB-00EVA0, ATA DISK drive
blk: queue c0408880, I/O limit 4095Mb (mask 0xffffffff)
blk: queue c04089c0, I/O limit 4095Mb (mask 0xffffffff)
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: attached ide-disk driver.
hda: host protected area => 1
hda: 120103200 sectors (61493 MB) w/1821KiB Cache, CHS=7476/255/63, UDMA(100)
hdb: attached ide-disk driver.
hdb: host protected area => 1
hdb: 312581808 sectors (160042 MB) w/8192KiB Cache, CHS=19457/255/63, UDMA(100)
Partition check:
hda: hda1 hda2 hda3 hda4 < hda5 >
hdb: hdb1
ide: late registration of driver.
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
Initializing Cryptographic API
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 4096 buckets, 32Kbytes
TCP: Hash tables configured (established 32768 bind 65536)
Linux IP multicast router 0.06 plus PIM-SM
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
RAMDISK: Compressed image found at block 0
Freeing initrd memory: 158k freed
VFS: Mounted root (ext2 filesystem).
Journalled Block Device driver loaded
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
Freeing unused kernel memory: 136k freed
usb.c: registered new driver usbdevfs
usb.c: registered new driver hub
usb-uhci.c: $Revision: 1.275 $ time 15:37:48 Oct 29 2003
usb-uhci.c: High bandwidth mode enabled
PCI: Found IRQ 10 for device 00:07.3
PCI: Sharing IRQ 10 with 00:07.2
usb-uhci.c: USB UHCI at I/O 0xec00, IRQ 10
usb-uhci.c: Detected 2 ports
usb.c: new USB bus registered, assigned bus number 1
hub.c: USB hub found
hub.c: 2 ports detected
PCI: Found IRQ 10 for device 00:07.2
PCI: Sharing IRQ 10 with 00:07.3
usb-uhci.c: USB UHCI at I/O 0xe800, IRQ 10
usb-uhci.c: Detected 2 ports
usb.c: new USB bus registered, assigned bus number 2
hub.c: USB hub found
hub.c: 2 ports detected
usb-uhci.c: v1.275:USB Universal Host Controller Interface driver
usb.c: registered new driver hiddev
usb.c: registered new driver hid
hid-core.c: v1.8.1 Andreas Gal, Vojtech Pavlik <vojtech@suse.cz>
hid-core.c: USB HID support drivers
mice: PS/2 mouse device common for all mice
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide0(3,2), internal journal
Adding Swap: 979924k swap-space (priority -1)
kjournald starting. Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide0(3,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide0(3,3), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide0(3,65), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Thanks for the ideas, and I will be happy to post anything you think may be of help.

Last edited by Donboy; 08-08-2004 at 09:00 PM.
 
Old 08-08-2004, 09:39 PM   #4
amosf
Senior Member
 
Registered: Jun 2004
Location: Australia
Distribution: Mandriva/Slack - KDE
Posts: 1,672

Rep: Reputation: 46
Sheesh that was confusing. I thought this was a listing from my own box for a minute and got totally confused... almost same VIA chipset (I have 686a) and my drives are:

hda: IC35L040AVVN07-0, ATA DISK drive
hdb: WDC WD800JB-00FMA0, ATA DISK drive

almost the same as well!!! Well, at a quick glance.

Very confusing for me. Sorry, nothing on the problem tho. Have you ever had any trouble with the IBM drive? I've never fully trusted mine and it makes some funny noises and seems to reset itself when it's working a lot.... Probably unrelated to this problem tho...
 
Old 08-08-2004, 10:10 PM   #5
Donboy
Member
 
Registered: Aug 2003
Location: Little Rock, Arkansas
Distribution: RH, Fedora, Suse, AIX
Posts: 736

Original Poster
Rep: Reputation: 31
Well, I don't know. You see, originally this drive came as a USB device, but I got sick of running like that and took the drive out of the case (rather easily too) and mounted it as a slave in one of my machines for a long time. During that time it was used as a backup drive and it's worked pretty well for that. However, now I have removed all the backups, formatted to ext3 and loaded an OS and now it's my master drive in another machine. So I really haven't had that much experience with this drive while running an OS on it, but I can tell you that mine has never made any noises, and in fact runs pretty quiet.

The 160gig I picked up at Sams for a decent price (I think, anyway) and it's doing fine also, but again, I'm using it as a backup drive, so I don't know how well it would perform with an OS on it.

My motherboard is a stanky old ASRock mobo that carried an Athlon chip. I say stanky because it doesn't have an AGP slot and has only 2 measly PCI slots, but thankfully there is a lot of stuff on the board itself like sound and video. For a backup server it's more than adequate and didn't cost me a lot. But anyway, I digress.

Amazing coincidence.
 
Old 08-08-2004, 10:36 PM   #6
amosf
Senior Member
 
Registered: Jun 2004
Location: Australia
Distribution: Mandriva/Slack - KDE
Posts: 1,672

Rep: Reputation: 46
I have the OS spread over both and use them heavily (tho the WD is new) and can't really complain. Just nerves maybe, as I got the IBM for free (sealed in a bag tho). It runs quiet too, except on rare occasions it goes CHWEEEAPWEEEAP-CLICK CHWEEEAPWEEEAP-CLICK ... then quiet again.

I hold my breath a moment, then continue on working.

Anyway, better get back to the real issue here...
 
Old 08-15-2004, 02:50 PM   #7
Donboy
Member
 
Registered: Aug 2003
Location: Little Rock, Arkansas
Distribution: RH, Fedora, Suse, AIX
Posts: 736

Original Poster
Rep: Reputation: 31
Just wanted to give an update on my problem here.

I rearranged my backups from "push" type to "pull" type. So now everything seems to be OK, as far as I can tell.

Now I am doing it so that the source computers (containing the data that needs to be backed up) are being exported through /etc/exports and the backup machine is pulling the data from the remote machines to the local machine.
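
In case it helps anyone with a similar setup, here is roughly what the pull job amounts to on the backup machine. This is a sketch only, written as a small Python 3 wrapper around rsync: the paths and function names are hypothetical, and "-a" is simply rsync's archive mode, not necessarily the exact options in use here.

```python
import os
import subprocess

# Hypothetical paths: /mnt/source is where the backup machine mounts a source
# machine's NFS export; /backup/source lives on the local 160gig drive.
SRC_MOUNT = "/mnt/source"
DEST_DIR = "/backup/source"


def build_rsync_cmd(src_mount, dest_dir):
    """Build the rsync argument list for a pull: NFS mount -> local disk.

    The trailing slash on the source makes rsync copy the directory's
    contents rather than the directory itself.
    """
    return ["rsync", "-a", src_mount.rstrip("/") + "/", dest_dir]


def pull_backup(src_mount=SRC_MOUNT, dest_dir=DEST_DIR):
    """Run the pull, refusing to start if the NFS export is not mounted."""
    if not os.path.ismount(src_mount):
        raise RuntimeError("%s is not mounted; skipping backup" % src_mount)
    # check=True raises CalledProcessError if rsync exits non-zero.
    subprocess.run(build_rsync_cmd(src_mount, dest_dir), check=True)
```

The point of the arrangement is that the backup machine now reads over NFS and writes to its own ext3 disk directly, so the writes no longer arrive through nfsd on that machine, which is the path that appears in the oops (nfsd_write calling ext3_file_write).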

During the runs, the destination machine runs at about 80% CPU and never uses more than the RAM I have installed. It uses all the RAM I have available but never taps into the swap space... swap usage always remains zero. On the source machines, there are about 6 or 8 NFS daemons that are using zero memory, and each of them uses less than 1% CPU.

No more kernel errors, but I'm going to give it another week and see what happens.
 
  


All times are GMT -5.
