I'm having a problem with my wife's machine. She recently lost a hard drive, which I replaced. I also replaced her mainboard/CPU/RAM in the process, with a board/chip that came out of my machine a week before. It's a 1.2 GHz Athlon with 768 MB of RAM. I've been using the same hardware (except the hard drive) in another box, and I have 10 other Debian unstable machines running in the house, so I'm not sure whether the problem is hardware or software, but I am leaning more toward hardware. The lspci output is:
Code:
0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 81)
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
0000:00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
0000:00:07.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:07.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1a)
0000:00:07.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1a)
0000:00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
0000:00:11.0 Multimedia audio controller: ESS Technology ES1978 Maestro 2E (rev 10)
0000:00:12.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
0000:00:14.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 02)
0000:00:14.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 02)
0000:01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX/MX 400] (rev a1)
Drives from dmesg:
Code:
hda: IBM-DTTA-371010, ATA DISK drive
hdb: Hewlett-Packard CD-Writer Plus 9100, ATAPI CD/DVD-ROM drive
hdc: ST328040A, ATA DISK drive
hdd: HITACHI GD-2000, ATAPI CD/DVD-ROM drive
Since I lost the drive, I reinstalled using the Debian etch beta netinst CD. The install went fine; I added packages and installed the kernel build that matches the processor:
Code:
Linux sutherland 2.6.16-1-k7 #2 Thu May 4 18:35:10 UTC 2006 i686 GNU/Linux
I installed the same package list that she was using before (Debian unstable). When I noticed that the system was having problems, I did a package upgrade.
In the week since doing this, the machine has consistently locked up or behaved strangely. A few examples:
- In Firefox, at times (like during text entry in search boxes), the text will change from uppercase to lowercase at random.
- She was sitting on a page in Firefox, and the browser executed a "back" without any input from her.
- I rebooted the other night, and when I logged in to her account (running KDE), the machine locked hard. I was unable to even <Ctrl><Alt><F1> to get to a console. I had to reboot again.
- The printer, a USB HP PSC 1315v, works sometimes, and other times it does not.
- Starting Kontact causes a crash message from KDE, and sometimes locks up the system.
I did manage to capture two kernel panics:
Code:
May 17 23:35:33 sutherland kernel: ------------[ cut here ]------------
May 17 23:35:33 sutherland kernel: kernel BUG at lib/prio_tree.c:149!
May 17 23:35:33 sutherland kernel: invalid opcode: 0000 [#1]
May
comm
otplug via_agp agpgart parport_pc parport ext3 jbd mbcache ide_cd cdrom ide_disk 3c59x mii uhci_hcd usbcore via82cxxx generic ide_core processor
May 17 23:35:33 sutherland kernel: CPU: 0
May 17 23:35:33 sutherland kernel: EIP: 0060:[prio_tree_replace+32/97] Tainted: P VLI
May 17 23:35:33 sutherland kernel: EFLAGS: 00010203 (2.6.16-1-k7 #2)
May 17 23:35:33 sutherland kernel: EIP is at prio_tree_replace+0x20/0x61
May 17 23:35:33 sutherland kernel: eax: b1968884 ebx: bd6002cc ecx: c9579514 edx: b1968884
May 17 23:35:33 sutherland kernel: esi: b196885c edi: c95794ec ebp: bd6002cc esp: bad0bdec
May 17 23:35:33 sutherland kernel: ds: 007b es: 007b ss: 0068
May 17 23:35:33 sutherland kernel: Process gaim (pid: 16611, threadinfo=bad0a000 task=ce793050)
May 17 23:35:33 sutherland kernel: Stack: <0>00000000 b0137761 bd6002cc b1968884 c9579514 b196885c b196885c b1968804
May 17 23:35:33 sutherland kernel: a68f7000 b013b2e6 b196885c bd6002cc c1de68b4 b0138ab8 b196885c b196885c
May 17 23:35:33 sutherland kernel: bad0be6c c1de64ec c42aa3e0 bad0bec0 b013b60c bad0be6c c1de64ec 00000000
May 17 23:35:33 sutherland kernel: Call Trace:
May 17 23:35:33 sutherland kernel: [vma_prio_tree_remove+129/192] vma_prio_tree_remove+0x81/0xc0
May 17 23:35:33 sutherland kernel: [__remove_shared_vm_struct+68/72] __remove_shared_vm_struct+0x44/0x48
May 17 23:35:33 sutherland kernel: [free_pgtables+34/108] free_pgtables+0x22/0x6c
May 17 23:35:33 sutherland kernel: [exit_mmap+102/179] exit_mmap+0x66/0xb3
May 17 23:35:33 sutherland kernel: [mmput+28/96] mmput+0x1c/0x60
May 17 23:35:33 sutherland kernel: [exit_mm+183/188] exit_mm+0xb7/0xbc
May 17 23:35:33 sutherland kernel: [do_exit+392/1597] do_exit+0x188/0x63d
May 17 23:35:33 sutherland kernel: [sys_exit_group+0/17] sys_exit_group+0x0/0x11
May 17 23:35:33 sutherland kernel: [get_signal_to_deliver+844/860] get_signal_to_deliver+0x34c/0x35c
May 17 23:35:33 sutherland kernel: [do_notify_resume+138/1475] do_notify_resume+0x8a/0x5c3
May 17 23:35:33 sutherland kernel: [sigprocmask+127/148] sigprocmask+0x7f/0x94
May 17 23:35:33 sutherland kernel: [sys_rt_sigprocmask+71/154] sys_rt_sigprocmask+0x47/0x9a
May 17 23:35:33 sutherland kernel: [work_notifysig+19/25] work_notifysig+0x13/0x19
May 17 23:35:33 sutherland kernel: Code: 80 8b 03 eb 02 31 c0 5b 5e 5f c3 53 8b 54 24 0c 8b 4c 24 10 8b 5c 24 08 89 49 08 8b 42 08 89 49 04 89 09 39 d0 75 13 39 13 74 08 <0f> 0b 95 00 1b db 27 b0 89 49 08 89 0b eb 11 89 41 08 8b 42 08
May 17 23:35:33 sutherland kernel: <1>Fixing recursive fault but reboot is needed!
Code:
May 18 01:49:03 sutherland kernel: Unable to handle kernel paging request at virtual address fffe0004
May 18 01:49:03 sutherland kernel: printing eip:
May 18 01:49:03 sutherland kernel: b0144438
May 18 01:49:03 sutherland kernel: *pde = 00002067
May 18 01:49:03 sutherland kernel: *pte = 00000000
May 18 01:49:03 sutherland kernel: Oops: 0002 [#2]
May
comm
otplug via_agp agpgart parport_pc parport ext3 jbd mbcache ide_cd cdrom ide_disk 3c59x mii uhci_hcd usbcore via82cxxx generic ide_core processor
May 18 01:49:03 sutherland kernel: CPU: 0
May 18 01:49:03 sutherland kernel: EIP: 0060:[cache_alloc_refill+305/1004] Tainted: P VLI
May 18 01:49:03 sutherland kernel: EFLAGS: 00010046 (2.6.16-1-k7 #2)
May 18 01:49:03 sutherland kernel: EIP is at cache_alloc_refill+0x131/0x3ec
May 18 01:49:03 sutherland kernel: eax: dfffdce0 ebx: ffffffff ecx: dffffc00 edx: fffe0000
May 18 01:49:03 sutherland kernel: esi: b03b7000 edi: dfffdce0 ebp: dfff9200 esp: dd9dde28
May 18 01:49:03 sutherland kernel: ds: 007b es: 007b ss: 0068
May 18 01:49:03 sutherland kernel: Process ud (pid: 6445, threadinfo=dd9dc000 task=df172ab0)
May 18 01:49:03 sutherland kernel: Stack: <0>00000022 00000050 dffffc00 ffffffff 00001478 b13de760 b13de760 ffffffff
May 18 01:49:03 sutherland kernel: 00000050 dffffc00 00000246 00000000 00001000 b01442fe b13de760 00000000
May 18 01:49:03 sutherland kernel: b0147dd5 dffffc00 00000050 b13de760 b014919b 00000050 b13de760 00001000
May 18 01:49:03 sutherland kernel: Call Trace:
May 18 01:49:03 sutherland kernel: [kmem_cache_alloc+44/53] kmem_cache_alloc+0x2c/0x35
May 18 01:49:03 sutherland kernel: [alloc_buffer_head+16/39] alloc_buffer_head+0x10/0x27
May 18 01:49:03 sutherland kernel: [alloc_page_buffers+24/162] alloc_page_buffers+0x18/0xa2
May 18 01:49:03 sutherland kernel: [__getblk+330/448] __getblk+0x14a/0x1c0
May 18 01:49:03 sutherland kernel: [pg0+813744024/1338717184] do_journal_end+0x404/0xac4 [reiserfs]
May 18 01:49:03 sutherland kernel: [pagevec_lookup_tag+30/37] pagevec_lookup_tag+0x1e/0x25
After the package upgrade, the lockups still occur, but the panics no longer make it into the logs. I was logged in remotely from my laptop today and saw the following in a wall message:
Code:
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: Oops: 0000 [#1]
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: CPU: 0
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: EIP is at __find_get_block_slow+0x6c/0xed
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: eax: 00000000 ebx: fffdfffe ecx: 00000001 edx: ffffffff
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: esi: 0000800b edi: 00000000 ebp: b1132c60 esp: c3e11a60
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: ds: 007b es: 007b ss: 0068
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: Process kontact (pid: 7972, threadinfo=c3e10000 task=c8ec3ab0)
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: Stack: <0>de418b24 b98eff40 cf1cb9c4 00108003 00000000 00000008 b0149fae df864c00
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: c3e11e38 cf4cde3c b0149fa3 00000000 00000000 00000000 00000000 00000020
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: 00001000 0000800b df864c00 b0149fdc de418ac0 0000800b 00000000 00001000
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: Call Trace:
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [__find_get_block+297/317] __find_get_block+0x129/0x13d
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [__find_get_block+286/317] __find_get_block+0x11e/0x13d
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [__getblk+26/448] __getblk+0x1a/0x1c0
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [pg0+813802999/1338717184] search_by_key+0x78/0xd78 [reiserfs]
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [pg0+813737607/1338717184] reiserfs_update_sd_size+0x64/0x25e [reiserfs]
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [pg0+813725351/1338717184] reiserfs_rename+0x757/0x878 [reiserfs]
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [ll_rw_block+127/142] ll_rw_block+0x7f/0x8e
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [pg0+813803282/1338717184] search_by_key+0x193/0xd78 [reiserfs]
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [mntput_no_expire+20/96] mntput_no_expire+0x14/0x60
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [vfs_rename+663/960] vfs_rename+0x297/0x3c0
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [sys_renameat+347/460] sys_renameat+0x15b/0x1cc
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [sys_faccessat+146/306] sys_faccessat+0x92/0x132
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [handle_IRQ_event+32/76] handle_IRQ_event+0x20/0x4c
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [__do_IRQ+101/145] __do_IRQ+0x65/0x91
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [sys_rename+17/21] sys_rename+0x11/0x15
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [syscall_call+7/11] syscall_call+0x7/0xb
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: Code: 9f 00 00 00 8b 00 f6 c4 08 0f 84 8b 00 00 00 8b 45 00 f6 c4 08 75 08 0f 0b 9d 01 42 7d 27 b0 8b 5d 0c b9 01 00 00 00 89 5c 24 04 <8b> 53 18 8b 43 14 39 fa 75 04 39 f0 74 5c 8b 03 8b 5b 04 a8 20
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: [pg0+813803171/1338717184] search_by_key+0x124/0xd78 [reiserfs]
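The wall output above is hard to read because syslogd repeats its "Message from syslogd@..." header before every kernel line. If it helps anyone reading the trace, here is a minimal Python sketch that drops those header lines and keeps only the kernel lines. The function name `strip_wall_headers` and the regular expression are my own choices for illustration, not part of any tool, and the sample is just the first few lines of the message above.

```python
import re

# Header line that the wall broadcast puts before every kernel line.
# Assumed shape: "Message from syslogd@<host> at <timestamp> ..."
WALL_HEADER = re.compile(r"^Message from syslogd@\S+ at .+ \.\.\.\s*$")


def strip_wall_headers(text: str) -> str:
    """Drop the repeated syslogd header lines and keep everything else in order."""
    kept = [line for line in text.splitlines() if not WALL_HEADER.match(line)]
    return "\n".join(kept)


# First three kernel lines of the wall message, copied from the post.
sample = """Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: Oops: 0000 [#1]
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: CPU: 0
Message from syslogd@sutherland at Sat May 20 15:33:35 2006 ...
sutherland kernel: EIP is at __find_get_block_slow+0x6c/0xed"""

print(strip_wall_headers(sample))
```

This only removes the headers; it does not reorder lines, so anything that arrived out of sequence in the wall message stays where it was.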
Can anyone help me track this down?
Thanks,
--Storm