LinuxQuestions.org


mike105105 07-19-2009 09:37 PM

Having some weird lockups...
 
I am wondering if anyone can point me in the right direction. Lately I have been having random system lockups. Many times when it happens I can't get anything to run; if I drop out to a terminal, it just sits waiting after I enter my password. I can usually use the Alt-SysRq codes to shut down, but sometimes the system has locked hard. It only started doing this about a month or two ago. I ran memtest and came up with no errors, reseated everything on the mobo, and the power supply is a fairly new ABS Tagan modular unit. Things like this have been popping up in my log recently:
Code:

[  304.318640] BUG: unable to handle kernel NULL pointer dereference at 0000000000000808
[  304.318644] IP: [<ffffffff802b04e7>] find_get_page+0x37/0x80
[  304.318651] PGD 1a3841067 PUD 1a3840067 PMD 0
[  304.318653] Oops: 0000 [#13] SMP
[  304.318655] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:05:00.0/resource
[  304.318657] CPU 2
[  304.318659] Modules linked in: binfmt_misc bridge stp bnep vboxnetflt vboxdrv vmnet ppdev parport_pc vmblock vmci vmmon video output dm_crypt f71882fg coretemp hwmon_vid lp parport saa7134_alsa mt20xx snd_hda_codec_realtek tea5767 snd_hda_intel tda9887 snd_hda_codec tda8290 snd_hwdep snd_pcm_oss snd_mixer_oss tuner snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device psmouse saa7134 snd ir_common serio_raw iTCO_wdt iTCO_vendor_support intel_agp v4l2_common videobuf_dma_sg videobuf_core tveeprom soundcore snd_page_alloc pcspkr nvidia(P) gspca_zc3xx gspca_main videodev v4l1_compat v4l2_compat_ioctl32 usbhid reiserfs r8169 mii floppy
[  304.318693] Pid: 6799, comm: gdmgreeter Tainted: P      D    2.6.29-02062906-generic #02062906 MS-7345
[  304.318695] RIP: 0010:[<ffffffff802b04e7>]  [<ffffffff802b04e7>] find_get_page+0x37/0x80
[  304.318697] RSP: 0000:ffff8801a2db1d08  EFLAGS: 00010203
[  304.318699] RAX: 00000000000007ff RBX: ffff8801ac206040 RCX: 0000000000000000
[  304.318700] RDX: 0000000000000800 RSI: 0000000000000800 RDI: 0000000000000800
[  304.318702] RBP: ffff8801a2db1d18 R08: ffff8801ac262670 R09: 0000000000000000
[  304.318703] R10: 000000000000003c R11: 00007ff064a615b0 R12: 0000000000000065
[  304.318704] R13: 0000000000000065 R14: ffff8801ac206038 R15: ffff8801a2cf1b28
[  304.318706] FS:  00007ff06b356780(0000) GS:ffff8801af802d00(0000) knlGS:0000000000000000
[  304.318708] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  304.318709] CR2: 0000000000000808 CR3: 00000001a390e000 CR4: 00000000000006a0
[  304.318711] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  304.318712] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  304.318714] Process gdmgreeter (pid: 6799, threadinfo ffff8801a2db0000, task ffff8801a5519620)
[  304.318715] Stack:
[  304.318717]  0000000000000000 ffff8801ac206038 ffff8801a2db1d48 ffffffff802b0735
[  304.318719]  ffff8801a2db1dd8 0000000000000000 ffff8801a4475240 ffff8801a2db1e18
[  304.318722]  ffff8801a2db1db8 ffffffff802b1cab ffff8801a2db1db8 ffffffff80a1caa0
[  304.318725] Call Trace:
[  304.318726]  [<ffffffff802b0735>] find_lock_page+0x25/0x70
[  304.318729]  [<ffffffff802b1cab>] filemap_fault+0x8b/0x330
[  304.318731]  [<ffffffff802c9f09>] __do_fault+0x59/0x580
[  304.318734]  [<ffffffff802ca50c>] do_linear_fault+0x3c/0x40
[  304.318736]  [<ffffffff802ca6a3>] handle_mm_fault+0x193/0x390
[  304.318738]  [<ffffffff8068bcb9>] do_page_fault+0x329/0x540
[  304.318741]  [<ffffffff806896d5>] page_fault+0x25/0x30
[  304.318744] Code: 5f 08 4c 89 e6 48 89 df e8 97 0b 16 00 48 85 c0 49 89 c0 74 4a 48 8b 38 40 f6 c7 01 75 e4 48 8d 47 ff 48 89 fe 48 83 f8 fd 77 d7 <8b> 4f 08 4c 8d 4f 08 85 c9 74 cc 8d 41 01 48 63 d1 4c 63 d0 48
[  304.318767] RIP  [<ffffffff802b04e7>] find_get_page+0x37/0x80
[  304.318769]  RSP <ffff8801a2db1d08>
[  304.318770] CR2: 0000000000000808
[  304.318772] ---[ end trace f21ec468aeba38f5 ]---

Any help would be awesome.

Thanks,
Mike

x_terminat_or_3 07-20-2009 01:25 AM

I'm not a kernel expert, but it does look like one of your PCI devices is either broken, or is having driver issues.

Kindly do a
Code:

lspci
and paste the output here so we can see what device exactly is having issues.

mike105105 07-20-2009 08:31 AM

Code:

00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller (rev 02)
00:01.0 PCI bridge: Intel Corporation 82G33/G31/P35/P31 Express PCI Express Root Port (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IR (ICH9R) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation G94 [GeForce 9600 GT] (rev a1)
03:00.0 IDE interface: Marvell Technology Group Ltd. 88SE6121 SATA II Controller (rev b2)
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
05:00.0 Multimedia controller: Philips Semiconductors SAA7131/SAA7133/SAA7135 Video Broadcast Decoder (rev 10)

Thanks,
Mike

x_terminat_or_3 07-20-2009 12:34 PM

From what I can see in your kernel oops, the process running at the time was gdmgreeter, which is part of GNOME's graphical login screen (GDM).

Tell me, when this happens, does it happen to occur immediately after it tries to start X?

Also, the PCI device mentioned in the kernel oops is your
Quote:

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
, which is part of your southbridge, IIRC.

Does your chipset have a fan on it, and is it running? Perhaps it overheated, or got otherwise damaged due to a power surge?
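
For what it's worth, the "last sysfs file" path in the oops lists every hop on the way to a device, so it can be split into PCI addresses and checked against the lspci output. Below is a rough Python sketch of that; the function name pci_hops is made up for illustration. Keep in mind that, as far as I know, this line only shows the sysfs file most recently touched before the oops, so it is a hint and not proof of which device is at fault.

```python
import re

# A full PCI address looks like domain:bus:device.function,
# e.g. 0000:05:00.0 (hexadecimal fields, function 0-7).
PCI_ADDR = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def pci_hops(sysfs_path):
    """Return the PCI addresses found in a /sys/devices/... path, in order.

    Hypothetical helper: it only splits the path string and never
    touches the real /sys tree.
    """
    return [part for part in sysfs_path.split("/") if PCI_ADDR.match(part)]


# Path copied from the "last sysfs file" line of the oops above.
path = "/sys/devices/pci0000:00/0000:00:1e.0/0000:05:00.0/resource"
hops = pci_hops(path)
print(hops)
```

The first address is the 00:1e.0 bridge and the last one is 05:00.0, which lspci lists as the Philips SAA7131/SAA7133/SAA7135 card sitting behind that bridge.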

mike105105 07-20-2009 01:34 PM

Well, that time it was happening in gdmgreeter; other times it was other apps. My mobo is an MSI P35 Neo2-FR. It has integrated heatpipes going to the northbridge and southbridge, and it is all in an Antec 900 case with the side fan installed and blowing at the mobo, so unfortunately I think my problem is deeper than a heat issue :-/



Thanks,
Mike

x_terminat_or_3 07-20-2009 03:10 PM

The usual way to find out whether it is a hardware issue is to pull your PCI cards one by one until your system is stable again. In the case of your PCI bridge, though, pulling it out won't be that easy...
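
To compare runs after pulling each card, it may help to summarize the oopses in the kernel log instead of eyeballing it. Here is a minimal Python sketch, assuming you have saved the dmesg output as text; summarize_oopses is a hypothetical name, and the two patterns are based only on the log format posted in this thread, so other kernel versions may print these lines differently.

```python
import re

# "Oops: 0000 [#13] SMP": the number in [#N] counts oopses since boot.
OOPS_RE = re.compile(r"Oops: \w+ \[#(\d+)\]")
# "IP: [<ffffffff802b04e7>] find_get_page+0x37/0x80": faulting function.
# The \b keeps this from matching the separate "RIP:" register line.
IP_RE = re.compile(r"\bIP: \[<[0-9a-f]+>\] (\w+)\+")


def summarize_oopses(log_text):
    """Return (highest oops number seen, list of faulting function names)."""
    highest = 0
    functions = []
    for line in log_text.splitlines():
        oops = OOPS_RE.search(line)
        if oops:
            highest = max(highest, int(oops.group(1)))
        ip = IP_RE.search(line)
        if ip:
            functions.append(ip.group(1))
    return highest, functions


# Two lines copied from the oops in the first post.
sample = (
    "[  304.318644] IP: [<ffffffff802b04e7>] find_get_page+0x37/0x80\n"
    "[  304.318655] Oops: 0000 [#13] SMP\n"
)
print(summarize_oopses(sample))
```

If the count stays at zero for a good while after a card has been removed, that card or its driver becomes the main suspect.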

