LinuxQuestions.org
Old 12-09-2012, 07:46 PM   #1
chexmix
Member
 
Registered: Apr 2002
Location: Arlington, MA
Distribution: Slackware, Debian, OpenBSD
Posts: 207
Blog Entries: 15

Rep: Reputation: 17
How to diagnose system freeze


Hi all -

I've been enjoying Slackware 14.0 on my new ZaReason laptop ... but am experiencing random system freezes: no cursor movement, keyboard unresponsive, the works.

At first I thought it might be Firefox, since I'd read in a couple of places that people were having such an issue. But the machine froze when I had a combination of seamonkey and openoffice running -- and no Firefox.

I ran memtest and there were no failures. I've been looking at syslog to see whether there's anything strange there, and there are a number of entries re: NetworkManager --

Code:
Dec  9 20:16:44 catbutt dhcpcd[1884]: timed out
Dec  9 20:16:44 catbutt dhcpcd[1884]: allowing 8 seconds for IPv4LL timeout
Dec  9 20:16:52 catbutt dhcpcd[1884]: timed out
Dec  9 20:16:58 catbutt NetworkManager[2085]: <warn> Failed to open plugin directory /usr/lib64/NetworkManager: Error opening directory '/usr/lib64/NetworkManager': No such file or directory
Dec  9 20:16:58 catbutt NetworkManager[2085]: <warn> failed to allocate link cache: (-10) Operation not supported
Dec  9 20:16:58 catbutt NetworkManager[2085]: <warn> (wlan0): driver supports Access Point (AP) mode
Dec  9 20:16:59 catbutt NetworkManager[2085]: <warn> bluez error getting default adapter: The name org.bluez was not provided by any .service files
Dec  9 20:16:59 catbutt NetworkManager[2085]: <warn> Trying to remove a non-existant call id.
Dec  9 20:17:00 catbutt dhcpcd[2119]: wlan0: sendmsg: Cannot assign requested address
... but I have no idea whether that could cause my whole system to lock up. Since they are just warnings, I'm skeptical that that is the cause.

What or where else might I check?

Thanks,

Glenn
 
Old 12-10-2012, 02:10 AM   #2
H_TeXMeX_H
Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1266
Can you post the output of 'lspci -k' and 'lsmod'? This is mostly to see what hardware and drivers you have.

I wrote a hardware diagnostics wiki page; it may help:
http://docs.slackware.com/howtos:har...re_diagnostics

I'm not sure NetworkManager can cause such a hang. Maybe the errors were caused by the hang.
 
2 members found this post helpful.
Old 12-10-2012, 06:12 AM   #3
chexmix
Member
 
Registered: Apr 2002
Location: Arlington, MA
Distribution: Slackware, Debian, OpenBSD
Posts: 207
Blog Entries: 15

Original Poster
Rep: Reputation: 17
Here is lspci -k:

Code:
00:14.0 USB controller: Intel Corporation Panther Point USB xHCI Host Controller (rev 04)
	Subsystem: COMPAL Electronics Inc Device 0065
	Kernel driver in use: xhci_hcd
00:16.0 Communication controller: Intel Corporation Panther Point MEI Controller #1 (rev 04)
	Subsystem: COMPAL Electronics Inc Device 0065
	Kernel driver in use: mei
00:1a.0 USB controller: Intel Corporation Panther Point USB Enhanced Host Controller #2 (rev 04)
	Subsystem: COMPAL Electronics Inc Device 0065
	Kernel driver in use: ehci_hcd
00:1b.0 Audio device: Intel Corporation Panther Point High Definition Audio Controller (rev 04)
	Subsystem: COMPAL Electronics Inc Device 0065
	Kernel driver in use: snd_hda_intel
00:1c.0 PCI bridge: Intel Corporation Panther Point PCI Express Root Port 1 (rev c4)
	Kernel driver in use: pcieport
00:1c.1 PCI bridge: Intel Corporation Panther Point PCI Express Root Port 2 (rev c4)
	Kernel driver in use: pcieport
00:1d.0 USB controller: Intel Corporation Panther Point USB Enhanced Host Controller #1 (rev 04)
	Subsystem: COMPAL Electronics Inc Device 0065
	Kernel driver in use: ehci_hcd
00:1f.0 ISA bridge: Intel Corporation Panther Point LPC Controller (rev 04)
	Subsystem: COMPAL Electronics Inc Device 0065
00:1f.2 SATA controller: Intel Corporation Panther Point 6 port SATA Controller [AHCI mode] (rev 04)
	Subsystem: COMPAL Electronics Inc Device 0065
	Kernel driver in use: ahci
00:1f.3 SMBus: Intel Corporation Panther Point SMBus Controller (rev 04)
	Subsystem: COMPAL Electronics Inc Device 0065
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
	Subsystem: COMPAL Electronics Inc Device 0065
	Kernel driver in use: r8169
02:00.0 Network controller: Intel Corporation Centrino Wireless-N 1030 (rev 34)
	Subsystem: Intel Corporation Centrino Wireless-N 1030 BGN
	Kernel driver in use: iwlwifi
The only thing that sticks out for me is the last entry, for wireless: I thought I had a 3945.

And here is lsmod:

Code:
Module                  Size  Used by
snd_seq_dummy           1455  0 
snd_seq_oss            29048  0 
snd_seq_midi_event      5620  1 snd_seq_oss
snd_seq                51265  5 snd_seq_midi_event,snd_seq_oss,snd_seq_dummy
snd_seq_device          5228  3 snd_seq,snd_seq_oss,snd_seq_dummy
snd_pcm_oss            39183  0 
snd_mixer_oss          15404  2 snd_pcm_oss
ipv6                  279979  38 
pcmcia                 35720  0 
pcmcia_core            12061  1 pcmcia
cpufreq_ondemand        6252  4 
acpi_cpufreq            5773  1 
mperf                   1171  1 acpi_cpufreq
freq_table              2475  2 acpi_cpufreq,cpufreq_ondemand
lp                      9787  0 
ppdev                   5958  0 
parport_pc             19423  0 
parport                31427  3 parport_pc,ppdev,lp
fuse                   66626  3 
snd_hda_codec_hdmi     24057  1 
rts5139               342736  0 
usbhid                 35615  0 
hid                    82876  1 usbhid
snd_hda_codec_realtek   195474  1 
joydev                  9972  0 
uvcvideo               62784  0 
videodev               76679  1 uvcvideo
v4l2_compat_ioctl32     8660  1 videodev
iwlwifi               199185  0 
i915                  419107  2 
snd_hda_intel          23267  2 
r8169                  48922  0 
snd_hda_codec          81925  3 snd_hda_intel,snd_hda_codec_realtek,snd_hda_codec_hdmi
mac80211              227731  1 iwlwifi
snd_hwdep               6324  1 snd_hda_codec
snd_pcm                72864  4 snd_hda_codec,snd_hda_intel,snd_hda_codec_hdmi,snd_pcm_oss
snd_page_alloc          7081  2 snd_pcm,snd_hda_intel
snd_timer              18798  2 snd_pcm,snd_seq
snd                    57796  14 snd_timer,snd_pcm,snd_hwdep,snd_hda_codec,snd_hda_intel,snd_hda_codec_realtek,snd_hda_codec_hdmi,snd_mixer_oss,snd_pcm_oss,snd_seq_device,snd_seq,snd_seq_oss
intel_agp              10864  1 i915
drm_kms_helper         26133  1 i915
intel_gtt              13833  3 intel_agp,i915
drm                   187389  3 drm_kms_helper,i915
psmouse                61704  0 
i2c_algo_bit            5319  1 i915
btusb                  11676  0 
mii                     3987  1 r8169
cfg80211              169025  2 mac80211,iwlwifi
bluetooth             151679  1 btusb
processor              25592  5 acpi_cpufreq
thermal                 7983  0 
fan                     2418  0 
video                  11378  1 i915
rfkill                 15428  4 bluetooth,cfg80211
mei                    32534  0 
i2c_i801                8044  0 
thermal_sys            14578  4 video,fan,thermal,processor
i2c_core               19978  6 i2c_i801,i2c_algo_bit,drm,drm_kms_helper,i915,videodev
agpgart                27372  3 drm,intel_gtt,intel_agp
serio_raw               4389  0 
ac                      3331  0 
hwmon                   1329  1 thermal_sys
soundcore               5474  2 snd
battery                11171  0 
evdev                   9574  10 
button                  4529  1 i915
loop                   18192  0
Thanks for looking.

/Glenn
 
Old 12-10-2012, 06:28 AM   #4
H_TeXMeX_H
Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1266
Does it always freeze the same way ?

It could be hardware or it could be some driver. I remember there were some intel video driver issues on 13.37, but I don't think they apply to 14.0 and they look different:
http://www.linuxquestions.org/questi...ng-4175425214/
 
1 members found this post helpful.
Old 12-10-2012, 06:33 AM   #5
kooru
Senior Member
 
Registered: Sep 2012
Location: Italy
Distribution: Slackware, NetBSD
Posts: 1,097
Blog Entries: 3

Rep: Reputation: 237
Quote:
Originally Posted by chexmix View Post
but am experiencing random system freezes: no cursor movement, keyboard unresponsive, the works.
Same thing for me.
I resolved it by upgrading the kernel.
You can see here
 
2 members found this post helpful.
Old 12-10-2012, 06:33 AM   #6
onebuck
Moderator
 
Registered: Jan 2005
Location: Midwest USA, Central Illinois
Distribution: Slackware®
Posts: 10,882
Blog Entries: 1

Rep: Reputation: 1307
Member Response

Hi,

What about switching to another console, or using 'ssh' to get into the box, to see if the system is actually frozen?
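
One way to run that check from a second machine is sketched below. It is only an illustration: probe_host is a made-up helper name, the -W timeout flag assumes the iputils ping found on most Linux systems, and it tells you nothing unless the laptop was on the network before the freeze.

```shell
# Sketch: run this on a SECOND machine while the laptop looks frozen.
# probe_host and its messages are hypothetical, not part of any standard tool.
probe_host() {
    host="$1"
    # -c 3: send three echo requests; -W 2: wait up to 2 s for each reply
    if ping -c 3 -W 2 "$host" >/dev/null 2>&1; then
        echo "alive: kernel still answers ping, try ssh"
    else
        echo "silent: no ping reply, the kernel may be hung"
    fi
}
```

Call it as 'probe_host 192.168.1.50' (a placeholder address). If the box answers, an ssh login followed by 'dmesg | tail' may show what went wrong; that assumes sshd was already running on the laptop before the freeze.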
 
1 members found this post helpful.
Old 12-10-2012, 06:40 AM   #7
chexmix
Member
 
Registered: Apr 2002
Location: Arlington, MA
Distribution: Slackware, Debian, OpenBSD
Posts: 207
Blog Entries: 15

Original Poster
Rep: Reputation: 17
Quote:
Originally Posted by H_TeXMeX_H View Post
Does it always freeze the same way ?
Well, I typically notice it via the mouse cursor freezing in place ... there doesn't seem to be a common thread re: what kind of work I happen to be doing.

The machine locks hard: CTRL-ALT-DEL does nothing, for what that's worth.

/G
 
Old 12-10-2012, 07:01 AM   #8
H_TeXMeX_H
Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1266
Try Alt-SysRq REISUB:
http://en.wikipedia.org/wiki/Reisub
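
Before relying on REISUB it may be worth confirming that the SysRq key is enabled at all. The sketch below reads the kernel's sysrq setting; sysrq_state is a hypothetical helper name, and the optional file argument is there only so the function can be pointed somewhere other than /proc.

```shell
# Sketch: report how the magic SysRq key is configured.
# 0 = disabled, 1 = all functions enabled, anything else = bitmask of allowed functions.
sysrq_state() {
    file="${1:-/proc/sys/kernel/sysrq}"
    # If the file cannot be read, say so instead of guessing
    value=$(cat "$file" 2>/dev/null) || { echo "unknown"; return 1; }
    case "$value" in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "restricted (bitmask $value)" ;;
    esac
}
```

If it reports "disabled", running 'echo 1 > /proc/sys/kernel/sysrq' as root enables every function until the next reboot. Even when enabled, SysRq only works if the kernel is still servicing keyboard interrupts, so a hard hang can ignore it.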
 
2 members found this post helpful.
Old 12-10-2012, 08:41 AM   #9
chexmix
Member
 
Registered: Apr 2002
Location: Arlington, MA
Distribution: Slackware, Debian, OpenBSD
Posts: 207
Blog Entries: 15

Original Poster
Rep: Reputation: 17
Quote:
Originally Posted by H_TeXMeX_H View Post
Thanks -- I will try that.

Just noted before I left for work: there are some troubling lines re: ACPI in /var/log/messages (wish I'd had time to copy/paste them, but I was running late).

Also, when I unplugged the AC power cord, the battery monitor said my battery was at 89%. This thing's been plugged in a loooong time ... could this be a bad battery issue?

/Glenn
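
For pulling those ACPI lines out later, something like the sketch below should do. acpi_lines is a hypothetical helper name; the default path is the /var/log/messages file mentioned above, and the line count of 20 is an arbitrary choice.

```shell
# Sketch: show the most recent ACPI-related lines from a log file.
acpi_lines() {
    log="${1:-/var/log/messages}"
    count="${2:-20}"
    # -i matches both "ACPI" and "acpi"; tail keeps only the newest matches
    grep -i 'acpi' "$log" | tail -n "$count"
}
```

Running 'acpi_lines' right after a reboot that followed a freeze, then comparing the timestamps with the time of the freeze, helps separate boot-time noise from messages logged just before the hang.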
 
1 members found this post helpful.
Old 12-10-2012, 10:26 PM   #10
elyk
Member
 
Registered: Jun 2004
Distribution: Slackware
Posts: 160

Rep: Reputation: 23
I have a machine that would act similar to what you describe -- keyboard and mouse freeze at random, pressing numlock/capslock don't change the keyboard LEDs, Alt+SysRq doesn't seem to be recognized. But I could SSH in after it happens. I think adding 'nolapic' to the kernel parameters fixed it.
 
3 members found this post helpful.
Old 12-11-2012, 06:47 AM   #11
chexmix
Member
 
Registered: Apr 2002
Location: Arlington, MA
Distribution: Slackware, Debian, OpenBSD
Posts: 207
Blog Entries: 15

Original Poster
Rep: Reputation: 17
Quote:
Originally Posted by elyk View Post
I have a machine that would act similar to what you describe -- keyboard and mouse freeze at random, pressing numlock/capslock don't change the keyboard LEDs, Alt+SysRq doesn't seem to be recognized. But I could SSH in after it happens. I think adding 'nolapic' to the kernel parameters fixed it.
I'm still waiting for another freeze to try things out (thanks everyone!) ...

Did a little Googling on 'nolapic' ... doesn't this slow down performance? In one place I could swear I read that it essentially turned a multicore machine into single core.
 
Old 12-11-2012, 07:21 AM   #12
onebuck
Moderator
 
Registered: Jan 2005
Location: Midwest USA, Central Illinois
Distribution: Slackware®
Posts: 10,882
Blog Entries: 1

Rep: Reputation: 1307
Member Response

Hi,

Quote:
Originally Posted by chexmix View Post
I'm still waiting for another freeze to try things out (thanks everyone!) ...

Did a little Googling on 'nolapic' ... doesn't this slow down performance? In one place I could swear I read that it essentially turned a multicore machine into single core.
From http://www.kernel.org/doc/Documentat...parameters.txt
Quote:
noapic [SMP,APIC] Tells the kernel to not make use of any IOAPICs that may be present in the system.
Quote:
nolapic [X86-32,APIC] Do not enable or use the local APIC.
Please note the bracketed qualifiers in the quotes above (underlined in the original post). You can use 'noapic' on 'SMP' systems and 'nolapic' on 32-bit ones; both fall under the APIC classifier.

HTH!
 
1 members found this post helpful.
Old 12-11-2012, 08:47 AM   #13
H_TeXMeX_H
Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1266
So then nolapic would have no effect on 64-bit systems ?
 
1 members found this post helpful.
Old 12-11-2012, 09:07 AM   #14
chexmix
Member
 
Registered: Apr 2002
Location: Arlington, MA
Distribution: Slackware, Debian, OpenBSD
Posts: 207
Blog Entries: 15

Original Poster
Rep: Reputation: 17
Quote:
Originally Posted by H_TeXMeX_H View Post
So then nolapic would have no effect on 64-bit systems ?
I'd like to know this as well. This box is Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz. I'm running Slack64 on it.

Would I use nolapic or noapic for a dual core machine?

Update: just had a freeze. Machine did not respond to the magic SysRq sequences. So I'm ready to try the no[l]apic thing. Where in lilo.conf do I add it (I can Google this, but I thought I'd ask while I was here)?

EDIT: I assume it goes under "# Append any additional kernel parameters:", and looks like this:

append = "noapic"

(or nolapic)
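
Whichever parameter goes in, lilo only picks up changes to lilo.conf when /sbin/lilo is rerun as root, so the edit has no effect until that is done and the machine is rebooted. After the reboot, a check along these lines confirms the parameter actually reached the kernel. has_kernel_param is a hypothetical helper name, and the optional second argument exists only so a different file can be checked.

```shell
# Sketch: succeed (exit 0) if a parameter is on the running kernel's command line.
has_kernel_param() {
    param="$1"
    file="${2:-/proc/cmdline}"
    # Split the command line on whitespace and look for an exact match
    for word in $(cat "$file"); do
        [ "$word" = "$param" ] && return 0
    done
    return 1
}
```

For example: 'has_kernel_param noapic && echo "noapic is active"'.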

Thanks,

Glenn

Last edited by chexmix; 12-11-2012 at 09:11 AM.
 
Old 12-11-2012, 11:15 AM   #15
onebuck
Moderator
 
Registered: Jan 2005
Location: Midwest USA, Central Illinois
Distribution: Slackware®
Posts: 10,882
Blog Entries: 1

Rep: Reputation: 1307
Member Response

Hi,

I would use 'addappend=' in the image stanza; if you want the change to apply globally, use 'append=' in the global section instead:
Quote:
From 'man lilo.conf':
addappend=<string>
(22.6) The kernel parameters from the specified string, are concatenated to the parameter(s) from an append= specification (see below). The
string must be enclosed within double quotes. Usually, the previous append= will specify parameters common to all kernels by appearing in the
top, or global, section of the configuration file and addappend= will be used to add local parameter(s) to an individual image. Addappend= may
be used only once per "image=" section.

append=<string>
Appends the options specified to the parameter line passed to the kernel. This is typically used to specify hardware parameters that can't be
entirely auto-detected or for which probing may be dangerous. Multiple kernel parameters are separated by a blank space, and the string must be
enclosed in double quotes. A local append= appearing within an image= section overrides any global append= appearing in the top section of the
configuration file. Append= may be used only once per "image=" section. To concatenate parameter strings, use "addappend=". Example:

append="mem=96M hd=576,64,32 console=ttyS1,9600"
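
Put together, a lilo.conf using both options might look like the fragment below. This is only a sketch: the global append= value stands in for whatever the file already contains, and the image path, root device and labels are placeholders to be replaced with the values from the existing stanza. Adding a second stanza for the same kernel keeps the unmodified entry bootable while the parameter is being tested.

```
# Global section (top of lilo.conf): parameters passed to every image
append = "vt.default_utf8=0"

# Existing entry, left untouched (path, root and label are placeholders)
image = /boot/vmlinuz
  root = /dev/sda1
  label = Linux
  read-only

# Test entry: same kernel, with one extra parameter added to the global append=
image = /boot/vmlinuz
  root = /dev/sda1
  label = LinuxNoapic
  read-only
  addappend = "noapic"
```

After saving, 'lilo -t -v' run as root does a test pass without writing anything, and plain 'lilo' then installs the new configuration.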
A little old but applicable;
Quote:
From http://osdev.berlios.de/pic.html

1. Introduction
There are basically two things here to consider.
  1. Built into all recent x86 CPU chips (Pentium Pro and up) is a thing called a Local APIC. It is addressed at physical addresses FEE00xxx. Actually, that is the default; it can be moved by programming the MSR that holds its base address.
    It has many fun things in it. The big thing is that you can interrupt other CPU's in a multiprocessor system. But if you just have a uniprocessor, there are useful things for it, too.
    The Local APIC is described in Chapter 7 of Volume 3 of the Intel processor books.
  2. Some motherboards have an IO APIC on them. This is usually only found on multiprocessor boards. Functionally, it replaces the 8259's. You must essentially shut off the 8259's and turn on the IO APIC to use it.
    The IO APIC is typically located at physical address FEC00000, but may be moved by programming the north/southbridge chipset.
    The Intel chip number is 82093 and you can get the doc for it off of the Intel website.
2. What the Local APIC Is
As stated above, the Local APIC (LAPIC) is a circuit that is part of the CPU chip. It contains these basic elements:
  1. A mechanism for generating interrupts
  2. A mechanism for accepting interrupts
  3. A timer
If you have a multiprocessor system, the APIC's are wired together so they can communicate. So the LAPIC on CPU 0 can communicate with the LAPIC on CPU 1, etc.

3. What the IO APIC Is
This is a separate chip that is wired to the Local APIC's so it can forward interrupts on to the CPU chips. It is programmed similar to the 8259's but has more flexibility.
It is wired to the same bus as the Local APIC's so it can communicate with them.

4. Fun things to do with a Local APIC in a Uniprocessor (this stuff also applies to multiprocessors, too)
One thing the LAPIC can help with is the following problem:
An IRQ-type interrupt routine wishes to wake a sleeping thread, but this IRQ interrupt may be nested several levels inside other IRQ interrupts, so it cannot simply switch stacks as those outer interrupt routines would not complete until the old thread is re-woken.
So we have to somehow switch out of the current thread and switch into the thread to be woken. A way the LAPIC can help us is to tell it to interrupt this same CPU, but only when there are no IRQ-type interrupt handlers active.
I call this a 'software' interrupt because the operating system software initiated the interrupt. It is programmed into the LAPIC to be at a priority lower than any IRQ-type interrupt.
So now if some IRQ-type routine wants to wake a thread, it makes the necessary changes to the data structures, then triggers a software interrupt to itself. Then, when all IRQ-type interrupt handlers have returned out, the LAPIC is now able to interrupt. It interrupts out of the currently executing thread and switches to the thread that was just woken. Very neat.
Without the LAPIC, your interrupt routine has to set a flag in memory somewhere that each IRET has to check for. So each IRET checks this flag and checks to see if it is the 'last' IRET. It is more efficient to let the LAPIC do this testing for you.
So now we have to make this software LAPIC interrupt have a lower priority than IRQ interrupts. We do this by studying how the LAPIC assigns priority to interrupts. This is a bit lame but it works ok. The priority is based on the vector number we choose for the interrupt. Interrupt vectors are numbered 0x00 through 0xFF in Intel CPUs. The LAPIC assigns a priority based on the first of the two hex digits and ignores the second digit. Thus, any interrupts using vectors 0x50 through 0x5F have the same priority. So if you block something at priority 0x52, you block all interrupts in the range 0x50 through 0x5F.
Now the CPU itself uses vectors in the range 0x00..0x1F for exceptions, so we don't want to use those for LAPIC interrupts. This means we can use a vector numbered 0x20 or 0x2F or somewhere in that range. We will have to redirect the IRQ interrupts to vectors 0x30..0x3F or something even higher if necessary, by re-programming the 8259's. Now we can block software interrupts without blocking IRQ interrupts.
The LAPIC's priority can be set by writing the LAPIC's TSKPRI (task priority) register. So if you want to block all interrupts through level 0x2F, just write a 0x20 (or 0x2B, etc) into the TSKPRI and you have blocked those interrupts.
Now the LAPIC is not really connected to the 8259's. You cannot block 8259 generated interrupts with the LAPIC. Likewise, being in an IRQ-type interrupt handler does not block any LAPIC interrupts. So we have to manually block/unblock the softints at the beginning of our IRQ handler. Just push the LAPIC's TSKPRI register, set it to 0x20 and handle your IRQ interrupt as usual. When done, pop the saved LAPIC's TSKPRI then IRET.
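
The vector-to-priority rule in the quoted text can be checked with a little shell arithmetic. This is only an illustration of the arithmetic, not code that touches the LAPIC; priority_class and is_blocked are hypothetical helper names.

```shell
# Sketch: the LAPIC priority class is the high hex digit of the 8-bit vector.
priority_class() {
    echo $(( ($1 >> 4) & 0xF ))
}

# Succeeds (exit 0) when a vector is held back by a given TSKPRI value,
# i.e. when its class is not higher than the class written to TSKPRI.
is_blocked() {
    [ "$(priority_class "$1")" -le "$(priority_class "$2")" ]
}
```

For instance, 'priority_class 0x52' and 'priority_class 0x5F' both give 5. With TSKPRI set to 0x20, 'is_blocked 0x2F 0x20' succeeds while 'is_blocked 0x30 0x20' fails, which matches the text: the software interrupt is blocked and the redirected IRQ vectors are not.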
SMP should help you understand multiprocessor or multi-core processor.

HTH!
 
2 members found this post helpful.
  


All times are GMT -5.