LinuxQuestions.org
Old 06-21-2010, 10:59 AM   #1
ternarybit
Member
 
Registered: Jun 2009
Distribution: Debian, Arch, Mint
Posts: 51

Rep: Reputation: 15
Debian: Gnome keeps locking -- what would you do in my shoes?


I have posted here a few times about my troubles with Gnome hard locking. I started on Ubuntu 9.04 and, with the advice of a helpful LQ expert, migrated to Debian Lenny.

Hardware summary:
  • Asus P5W DH Deluxe, recently replaced via Asus RMA
  • Intel Core2 Quad Q9550 @ stock 2.83GHz, cooled on Zalman CNPS9700
  • Radeon HD 4890 graphics
  • 2x Western Digital 250GB HDDs, both passed DLG full media scan
  • 4GB OCZ Reaper DDR2 @ PC2-5300
  • Logitech MX1000 laser mouse
  • Antec Neo HE 500W PSU

Problem symptoms and background:
Gnome hard locks randomly, regardless of what task I'm performing. It has locked when I was building ffmpeg from source, when I was converting jpegs with imagemagick, and a dozen other random times.

It does not lock on any fixed schedule (as in every N minutes/hours/days). The mean time between locks is about 2 weeks, but I went a full 5 weeks before my second-to-last lock.

I have a terminal bound to a key combination, and it will often still come up during a hard lock. If I can get to a shell, I can usually issue a /etc/init.d/gdm restart or reboot command successfully. Sometimes I can't, and I have to power off the machine manually, which almost always results in serious filesystem corruption.

Other circumstances:
I run 4 RAID1 arrays with mdadm across the two drives:
  • md0 is /
  • md1 is /boot
  • md2 is /opt
  • md3 is /home

These were set up by the Debian installer, and I have not modified the default settings.
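
Since the hard power-offs have been followed by filesystem corruption, it may be worth confirming that none of the mirrors is running degraded. A minimal sketch, assuming the usual /proc/mdstat layout where a healthy two-disk mirror shows [UU] and a missing member shows as an underscore; the function name degraded_arrays is made up for illustration:

```shell
# Hypothetical helper: print the name of every md array whose status
# field (e.g. [U_]) contains an underscore, meaning a member is missing.
# Reads /proc/mdstat by default; pass another file to check a saved copy.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/          { name = $1 }      # remember the current array
        /\[[U_]*_[U_]*\]/      { print name }     # status line with a missing member
    ' "${1:-/proc/mdstat}"
}

# Usage: degraded_arrays            (prints nothing if all mirrors are healthy)
```

For a fuller report on a single array, mdadm --detail /dev/md0 shows the state of each member.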

Troubleshooting I have done already:
  1. Migrated from Ubuntu 9.04 to Debian Stable (Lenny) 5
  2. Run memtest for 12+ hours with no errors
  3. Run full media scan from Western Digital Data Life Guard 5 on both drives with no errors
  4. Exchanged my motherboard through Asus RMA & reseated my CPU heatsink properly with Arctic Silver 5 TIM
  5. Tweaked my PC's environment to ensure adequate cooling (higher RPM fans, etc.)
  6. Reinstalled Debian probably 5 times as a result of filesystem corruption from needing to hard restart

My simple, humble question to you all is: what would you do if you were in my situation?

At this point I suspect a faltering PSU, or the outside possibility that a malfunctioning peripheral is causing some kind of interference, but those guesses are as wild as any. The only peripherals I run are a keyboard & mouse. How would you go about troubleshooting this situation? What's the next step?

As always, thank you very much for your time and expertise. Cheers!

-Austin
 
Old 06-21-2010, 12:04 PM   #2
pljvaldez
LQ Guru
 
Registered: Dec 2005
Location: Somewhere on the String
Distribution: Debian Wheezy (x86)
Posts: 6,094

Rep: Reputation: 281
I would try some software things first. First try installing another desktop environment (XFCE, LXDE, or KDE) and see if that improves your situation. Just log out and select the new desktop from the session menu. Perhaps some gnome applet is causing things to hang. By the way, is there anything in your syslog around the lockup times?
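
For the syslog check, a rough sketch of one way to pull out kernel error lines. The function name kernel_errors and the list of patterns are assumptions chosen for illustration, not a standard tool, and /var/log/syslog is assumed to be the default Debian log location:

```shell
# Hypothetical helper: show kernel log lines that usually accompany an
# oops or crash, with their line numbers. The pattern list is a guess at
# common markers and is not exhaustive.
kernel_errors() {
    grep -nE 'kernel: .*(general protection|invalid opcode|Call Trace|cut here|Oops|BUG)' \
        "${1:-/var/log/syslog}"
}

# Usage: kernel_errors                     (current syslog)
#        kernel_errors /var/log/syslog.1   (previous, rotated log)
```

To look at everything logged near a known lockup time instead, a plain grep on the timestamp prefix (for example grep 'Jun 21 10:5' /var/log/syslog) narrows the log to that ten-minute window.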

You could also try upgrading to Squeeze. It's due to become the stable release sometime around the end of the year or early next year, and it's already a pretty stable OS in itself.

Hardware-wise, I'd try the PSU if it's not too much trouble. Make sure to get one with enough capacity on both the 5VDC and 12VDC sides of the supply. My friend had a problem that turned out to be him drawing too much power on one side of the supply, even though he was below the supply's 300W total capacity.
 
Old 06-21-2010, 12:22 PM   #3
ternarybit
Member
 
Registered: Jun 2009
Distribution: Debian, Arch, Mint
Posts: 51

Original Poster
Rep: Reputation: 15
Hey, thanks for the reply!

I may just go give the new KDE a whirl and see if it has the same issue. I did not notice anything in the syslog last time it happened, but I will be sure to check next time. I know that several times when Gnome locked, the kernel threw an error and Debian asked if I wanted to submit a report to kernel.org. Gnome locked shortly thereafter.

Is there a way to update from Lenny to Squeeze in place? Or must it be done from format/reinstall?

My 500W PSU is definitely taxed to its limits with the HD4890 and 2 HDDs. That is currently my biggest suspicion. An adequate PC Power & Cooling PSU runs upwards of $150 USD so I'm not about to just jump into that without trying everything else first.

Thanks for the input, much appreciated!

-Austin
 
Old 06-21-2010, 03:12 PM   #4
pljvaldez
LQ Guru
 
Registered: Dec 2005
Location: Somewhere on the String
Distribution: Debian Wheezy (x86)
Posts: 6,094

Rep: Reputation: 281
You can upgrade pretty easily. Edit the file /etc/apt/sources.list and, everywhere it says "stable" or "lenny", change that word to "squeeze". Then run (as root):
Code:
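# refresh the package lists from the squeeze sources
aptitude update
# upgrade the package tools themselves first
aptitude install apt dpkg aptitude
# then upgrade everything else
aptitude full-upgrade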
Be aware that squeeze is in active development, so it is possible that the current state of some package will make the upgrade fail. In the past I've never had trouble doing an upgrade like this, but make sure you have good backups just in case. Dist-upgrades pretty much work flawlessly from oldstable to new stable releases, but squeeze won't become stable for another 6 months at least...
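
The sources.list edit can also be scripted. A minimal sketch, assuming the file only uses "lenny" or "stable" as whole words in the suite field (a mirror URL with "stable" as its own path component would be rewritten too, so check the result by eye); the function name retarget_sources and the .lenny.bak suffix are made up for illustration:

```shell
# Hypothetical helper: rewrite an APT sources file so that the suites
# "lenny" and "stable" become "squeeze". The original is kept next to it
# as <file>.lenny.bak so the change is easy to undo.
retarget_sources() {
    src="$1"
    cp "$src" "$src.lenny.bak" || return 1
    # Only match the words when bounded by whitespace, a slash, or a line
    # edge, so "lenny/updates" becomes "squeeze/updates" and "oldstable"
    # is left alone.
    sed -E 's#(^|[[:space:]/])(lenny|stable)([[:space:]/]|$)#\1squeeze\3#g' \
        "$src.lenny.bak" > "$src"
}

# Usage (as root): retarget_sources /etc/apt/sources.list
```

Running diff against the .lenny.bak copy before the aptitude steps shows exactly which lines changed.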

Last edited by pljvaldez; 06-21-2010 at 03:22 PM.
 
Old 06-23-2010, 11:56 AM   #5
ternarybit
Member
 
Registered: Jun 2009
Distribution: Debian, Arch, Mint
Posts: 51

Original Poster
Rep: Reputation: 15
Kernel Oops

OK, so my kernel just threw an "oops" but it didn't result in any hard locking or other problems as far as I can tell. The last thing I did before it threw the oops was install rsnapshot from the lenny repos. Here's the syslog:

Code:
Jun 23 09:47:21 lyssa kernel: [ 3584.509850] dpkg[5979] general protection ip:40afb0 sp:7fff71bc8968 error:0 in dpkg[400000+61000]
Jun 23 09:47:21 lyssa kernel: [ 3584.509909] CPU 2 
Jun 23 09:47:21 lyssa kernel: [ 3584.509911] Modules linked in: michael_mic arc4 ecb crypto_blkcipher fglrx(P) appletalk nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ppdev parport_pc lp parport ipv6 cpufreq_conservative cpufreq_userspace cpufreq_stats cpufreq_ondemand freq_table cpufreq_powersave fuse coretemp w83627ehf hwmon_vid vboxdrv sbp2 loop snd_hda_intel ieee80211_crypt_tkip snd_seq snd_seq_device wl(P) snd_pcm i2c_i801 i82975x_edac i2c_core rng_core button snd_timer serio_raw edac_core snd pcspkr joydev soundcore ieee80211_crypt psmouse snd_page_alloc evdev ext3 jbd mbcache raid1 md_mod sg sr_mod sd_mod cdrom ata_piix ata_generic usb_storage usbhid hid ff_memless ohci1394 floppy ieee1394 ide_pci_generic piix ahci ehci_hcd libata scsi_mod dock uhci_hcd jmicron ide_core thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
Jun 23 09:47:21 lyssa kernel: [ 3584.509946] Pid: 5979, comm: dpkg Tainted: P          2.6.26-2-amd64 #1
Jun 23 09:47:21 lyssa kernel: [ 3584.509947] RIP: 0010:[<ffffffff80287669>]  [<ffffffff80287669>] page_remove_rmap+0xff/0x11a
Jun 23 09:47:21 lyssa kernel: [ 3584.509952] RSP: 0000:ffff81009d415be8  EFLAGS: 00010246
Jun 23 09:47:21 lyssa kernel: [ 3584.509953] RAX: 0000000000000000 RBX: ffffe200029369d0 RCX: 000000000000d9e7
Jun 23 09:47:21 lyssa kernel: [ 3584.509955] RDX: ffff810080a60000 RSI: 0000000000000046 RDI: 0000000000000286
Jun 23 09:47:21 lyssa kernel: [ 3584.509957] RBP: ffff81011c54a4a8 R08: 0000000000674000 R09: ffff81009d415600
Jun 23 09:47:21 lyssa kernel: [ 3584.509958] R10: 0000000000000000 R11: 0000000000000010 R12: ffff81013cdd6c40
Jun 23 09:47:21 lyssa kernel: [ 3584.509960] R13: 0000000000674000 R14: ffffe200029369d0 R15: ffff810001037b80
Jun 23 09:47:21 lyssa kernel: [ 3584.509961] FS:  0000000000000000(0000) GS:ffff81013fa9f0c0(0000) knlGS:0000000000000000
Jun 23 09:47:21 lyssa kernel: [ 3584.509963] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 23 09:47:21 lyssa kernel: [ 3584.509964] CR2: 00000000031fa888 CR3: 000000009d052000 CR4: 00000000000006e0
Jun 23 09:47:21 lyssa kernel: [ 3584.509966] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 23 09:47:21 lyssa kernel: [ 3584.509967] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 23 09:47:21 lyssa kernel: [ 3584.509969] Process dpkg (pid: 5979, threadinfo ffff81009d414000, task ffff81013f1dc180)
Jun 23 09:47:21 lyssa kernel: [ 3584.509970] Stack:  00000000bc676065 00000000bc676065 ffff81009d4a83a0 ffffffff8027f6be
Jun 23 09:47:21 lyssa kernel: [ 3584.509974]  ffff81013cdd6f18 0000000000000000 ffff81009d415cf8 ffffffffffffffff
Jun 23 09:47:21 lyssa kernel: [ 3584.509976]  0000000000000000 ffff81011c54a4a8 ffff81009d415d00 00000000003b5fd4
Jun 23 09:47:21 lyssa kernel: [ 3584.509978] Call Trace:
Jun 23 09:47:21 lyssa kernel: [ 3584.509981]  [<ffffffff8027f6be>] ? unmap_vmas+0x4c9/0x885
Jun 23 09:47:21 lyssa kernel: [ 3584.509992]  [<ffffffff80283ae9>] ? exit_mmap+0x7c/0xf0
Jun 23 09:47:21 lyssa kernel: [ 3584.509996]  [<ffffffff80232674>] ? mmput+0x2c/0xa2
Jun 23 09:47:21 lyssa kernel: [ 3584.509999]  [<ffffffff802379e9>] ? do_exit+0x25a/0x6a6
Jun 23 09:47:21 lyssa kernel: [ 3584.510003]  [<ffffffff80237ea2>] ? do_group_exit+0x6d/0x9d
Jun 23 09:47:21 lyssa kernel: [ 3584.510006]  [<ffffffff80240203>] ? get_signal_to_deliver+0x302/0x324
Jun 23 09:47:21 lyssa kernel: [ 3584.510010]  [<ffffffff8020b2aa>] ? do_notify_resume+0xaf/0x7fc
Jun 23 09:47:21 lyssa kernel: [ 3584.510013]  [<ffffffff802354a7>] ? printk+0x4e/0x56
Jun 23 09:47:21 lyssa kernel: [ 3584.510016]  [<ffffffff8022bea4>] ? task_rq_lock+0x4d/0x7f
Jun 23 09:47:21 lyssa kernel: [ 3584.510022]  [<ffffffff8023e3fa>] ? signal_wake_up+0x21/0x30
Jun 23 09:47:21 lyssa kernel: [ 3584.510024]  [<ffffffff8023e8b0>] ? send_signal+0x1bf/0x1db
Jun 23 09:47:21 lyssa kernel: [ 3584.510030]  [<ffffffff8020c5b4>] ? retint_signal+0x50/0x9c
Jun 23 09:47:21 lyssa kernel: [ 3584.510037] 
Jun 23 09:47:21 lyssa kernel: [ 3584.510037] 
Jun 23 09:47:21 lyssa kernel: [ 3584.510039]  RSP <ffff81009d415be8>
Jun 23 09:47:21 lyssa kernel: [ 3584.510039] ---[ end trace 354ceeafb064bf8d ]---
I also had a shell prompt open, which spit this out:

Code:
Message from syslogd@lyssa at Jun 23 09:47:21 ...
 kernel:[ 3584.509879] Eeek! page_mapcount(page) went negative! (-1)

Message from syslogd@lyssa at Jun 23 09:47:21 ...
 kernel:[ 3584.509881]   page pfn = bc676

Message from syslogd@lyssa at Jun 23 09:47:21 ...
 kernel:[ 3584.509882]   page->flags = 10000000000083c

Message from syslogd@lyssa at Jun 23 09:47:21 ...
 kernel:[ 3584.509883]   page->count = 2

Message from syslogd@lyssa at Jun 23 09:47:21 ...
 kernel:[ 3584.509884]   page->mapping = ffff81011613ab50

Message from syslogd@lyssa at Jun 23 09:47:21 ...
 kernel:[ 3584.509898]   vma->vm_ops = 0x0

Message from syslogd@lyssa at Jun 23 09:47:21 ...
 kernel:[ 3584.509905] ------------[ cut here ]------------

Message from syslogd@lyssa at Jun 23 09:47:21 ...
 kernel:[ 3584.509908] invalid opcode: 0000 [1] SMP 

Message from syslogd@lyssa at Jun 23 09:47:21 ...
 kernel:[ 3584.510038] Code: 80 e8 4f e7 fc ff 48 8b 85 90 00 00 00 48 85 c0 74 19 48 8b 40 20 48 85 c0 74 10 48 8b 70 58 48 c7 c7 44 48 4b 80 e8 2a e7 fc ff <0f> 0b eb fe 8b 77 18 5a 5b 5d 83 e6 01 f7 de 83 c6 04 e9 95 54
Any thoughts?
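
One detail that can be read straight off the trace: the kernel reports "Tainted: P", and two entries in the "Modules linked in" line carry a (P) marker. A small sketch for listing those modules from a saved log, assuming the oops was saved to a file; the function name proprietary_modules is made up for illustration:

```shell
# Hypothetical helper: list the modules flagged (P), i.e. proprietary,
# in the "Modules linked in:" line of a saved kernel oops.
proprietary_modules() {
    grep 'Modules linked in:' "$1" | grep -oE '[A-Za-z0-9_]+\(P\)' | sort -u
}

# Usage: proprietary_modules /var/log/syslog
```

On the log above this should pick out fglrx(P) and wl(P). The running kernel's taint state can also be read from /proc/sys/kernel/tainted, where a non-zero value means the kernel is tainted and the lowest bit indicates a proprietary module.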
 
  

