LinuxQuestions.org
Old 05-28-2005, 08:12 PM   #1
lukebeales
Member
 
Registered: Oct 2003
Location: Australia
Distribution: Slackware/LFS/Ubuntu
Posts: 89

Rep: Reputation: 15
pppd causing kernel crash


Hi all,
First off, an intro: I have a computer running a custom CD distro that connects to the internet through a USB ISDN modem (it uses the cdc-acm module) with pppd. It worked fine until the well-known cdc-acm troubles that have been in the kernel since 2.6.8. I have now migrated it to 2.6.11.9. The computer dials up fine, but if it disconnects, it comes up with this when it tries to reconnect:

Code:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c02048fa
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: bsd_comp ppp_deflate ppp_async ppp_generic slhc cdc_acm cifs smbfs parport_pc parport usblp 8139too tvaudio tuner bttv video_buf firmware_class i2c_algo_bit btcx_risc tveeprom i2c_core lirc_serial lirc_dev sermouse psmouse atkbd libps2 serport i8042 serio mousedev evdev usbhid usbserial uhci_hcd ohci_hcd ehci_hcd usbcore snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_mpu401_uart snd_rawmidi snd_seq_device snd_intel8x0 snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd soundcore pcspkr
CPU:    0
EIP:    0060:[<c02048fa>]    Tainted: GF     VLI
EFLAGS: 00010286   (2.6.11.9)
eax: 00000000   ebx: 00000001   ecx: ffffffff   edx: cee920b8
esi: c8db7b18   edi: 00000000   ebp: cee920b8   esp: c0041d58
ds: 007b   es: 007b   ss: 0068
Process pppd (pid: 2003, threadinfo=c0041000 task=c0ae9a80)
Stack: c8db7b00 cee92094 c0204966 cee920b8 000003ad c8db7b00 c8db7b18 cee92094
       000003ad c022ca26 cee920b8 000000d0 00000000 c8db790c 00000000 00000000
       c8db7b00 c478a037 c478a053 c478a000 c02052cd c03b9080 c8db7914 c8db7b18
Call Trace:
 [<c0204966>]
 [<c022ca26>]
 [<c02052cd>]
 [<c0204bff>]
 [<c022cd5b>]
 [<c022cd7d>]
 [<d0a1f3a4>]
 [<c021bbf2>]
 [<c01250f8>]
 [<c0146f5e>]
 [<c013f1b4>]
 [<c013fad9>]
 [<c021c1a2>]
 [<c013bb27>]
 [<c0134cc8>]
 [<c0134c02>]
 [<c0134e8e>]
 [<c0101e57>]
Code: 56 e8 93 ff ff ff 89 c3 58 85 db 74 07 56 e8 af 75 f5 ff 5e 89 d8 5b 5e c3 57 53 8b 54 24 0c bb 01 00 00 00 8b 3a 31 c0 83 c9 ff <f2> ae f7 d1 49 8b 52 24 8d 5c 0b 01 85 d2 75 e9 89 d8 5b 5f c3
This used to mention PREEMPT, so I removed preemption and it now shows the message you see above. Is this a pppd incompatibility with the new kernel? Or is it something wrong in the kernel itself (given it's the only thing I have changed)?
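
One way to read the Code: line in the oops above: the kernel marks the byte at EIP with angle brackets. Here is a minimal Python sketch (the function name parse_oops_code is made up for illustration) that pulls out the byte values and the position of the marked byte:

```python
import re


def parse_oops_code(line):
    """Split an oops "Code:" line into byte values and the faulting index.

    The byte at EIP is wrapped in angle brackets, e.g. "<f2>".
    Returns (list_of_ints, fault_index); fault_index is None when no
    marker is present. Tokens that are not two hex digits (such as an
    elided "...") raise ValueError.
    """
    payload = line.partition("Code:")[2]
    code = []
    fault_index = None
    for token in payload.split():
        marked = re.fullmatch(r"<([0-9a-fA-F]{2})>", token)
        if marked:
            fault_index = len(code)
            token = marked.group(1)
        code.append(int(token, 16))
    return code, fault_index
```

For the dump above, the marked byte is f2 and the one after it is ae. On x86 that pair encodes repne scasb, a string scan that reads from the address in edi, and edi is 00000000 in the register dump, which would be consistent with the reported NULL pointer dereference. This is only a reading of the posted bytes, not a confirmed diagnosis.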

Here's a list of the modules loaded BEFORE the crash happens; I will update this with another one after it happens again.

Code:
Module                  Size  Used by
bsd_comp                4224  0
ppp_deflate             4096  0
ppp_async               7680  1
ppp_generic            15380  7 bsd_comp,ppp_deflate,ppp_async
slhc                    4864  1 ppp_generic
cdc_acm                 8224  2
cifs                  150904  0
smbfs                  45304  0
parport_pc             26436  0
parport                21320  1 parport_pc
usblp                   8832  0
8139too                16128  0
tvaudio                14884  0
tuner                  14756  0
bttv                  117584  0
video_buf              11012  1 bttv
firmware_class          5760  1 bttv
i2c_algo_bit            6536  1 bttv
btcx_risc               2568  1 bttv
tveeprom                8600  1 bttv
i2c_core               11408  5 tvaudio,tuner,bttv,i2c_algo_bit,tveeprom
lirc_serial             9056  1
lirc_dev                9612  1 lirc_serial
sermouse                3584  0
psmouse                18312  0
atkbd                  10384  0
libps2                  2944  2 psmouse,atkbd
serport                 2688  0
i8042                   8028  0
serio                   7304  7 sermouse,psmouse,atkbd,serport,i8042
mousedev                7448  1
evdev                   6528  0
usbhid                 19712  0
usbserial              19944  0
uhci_hcd               22032  0
ohci_hcd               12168  0
ehci_hcd               20616  0
usbcore                69624  8 cdc_acm,usblp,usbhid,usbserial,uhci_hcd,ohci_hcd,ehci_hcd
snd_seq_oss            20864  0
snd_seq_midi_event      3456  1 snd_seq_oss
snd_seq                29616  4 snd_seq_oss,snd_seq_midi_event
snd_pcm_oss            36896  0
snd_mixer_oss          12800  1 snd_pcm_oss
snd_mpu401_uart         4096  0
snd_rawmidi            13856  1 snd_mpu401_uart
snd_seq_device          4360  2 snd_seq_oss,snd_rawmidi
snd_intel8x0           20032  0
snd_ac97_codec         46720  1 snd_intel8x0
snd_pcm                49416  3 snd_pcm_oss,snd_intel8x0,snd_ac97_codec
snd_timer              13700  2 snd_seq,snd_pcm
snd_page_alloc          5636  2 snd_intel8x0,snd_pcm
snd                    26852  11 snd_seq_oss,snd_seq,snd_pcm_oss,snd_mixer_oss,snd_mpu401_uart,snd_rawmidi,snd_seq_device,snd_intel8x0,snd_ac97_codec,snd_pcm,snd_timer
soundcore               4320  1 snd
pcspkr                  2660  0
I guess what I want to know is: should I go to all the trouble of putting a new kernel in (about 12 hours' work) in the hope that it will fix it, or could it be something else that has now become outdated?
 
Old 05-28-2005, 09:04 PM   #2
foo_bar_foo
Senior Member
 
Registered: Jun 2004
Posts: 2,553

Rep: Reputation: 53
Quote:
EIP: 0060:[<c02048fa>] Tainted: GF VLI
This is the address of the instruction that was executing when the crash happened.

Now the problem is I don't see this address in the stack or the call trace.

Is there further output after the Code: section that you did not show us?

As it is, at least with my limited understanding, I can't diagnose what caused the crash from this output.
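
The addresses in the oops are raw because no symbol names were printed, so they would need to be looked up in the System.map from the same kernel build. As far as I understand, the call trace entries are return addresses picked off the stack, so the faulting EIP itself would not normally be listed there. Below is a minimal Python sketch of the lookup; the function names and the sample map entries are hypothetical, and only the "address type name" line layout of System.map is assumed:

```python
import bisect


def parse_system_map(text):
    """Parse System.map text ("<hex addr> <type> <name>" per line)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the nearest symbol at or below addr."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # addr lies below every known symbol
    base, name = symbols[i]
    return name, addr - base


# Hypothetical entries for illustration only; real ones come from the
# System.map of the kernel that actually crashed.
SAMPLE_MAP = """\
c0204800 t hypothetical_func_a
c02048e0 t hypothetical_func_b
c0204940 T hypothetical_func_c
"""
```

Usage would be something like resolve(parse_system_map(open("System.map").read()), 0xc02048fa), repeated for each address in the call trace.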
 
Old 05-28-2005, 09:43 PM   #3
lukebeales
Member
 
Registered: Oct 2003
Location: Australia
Distribution: Slackware/LFS/Ubuntu
Posts: 89

Original Poster
Rep: Reputation: 15
The output is copied straight from the logs. There isn't any other mention of anything going wrong: samba still works, networking still works, and I can still play videos and things. It's just that pppd won't run anymore, and it won't allow me to unload the modules.

I can tell you how it happens though. I'll make a timeline:

1. It will be sitting on the net with pppd and chat running, with the modem on ttyACM0

2. Either the ISP will send a hangup to us (every 4 hours), OR the kernel (through electrical interference or otherwise) will detect or cause a usb hub timeout, which then re-initializes the modem as ttyACM1 (or ttyACM0 if the previous step finds it as ttyACM1...it swaps between the two)

3. Since pppd is set to persist, it should stay open and redial automatically if the modem hasn't changed ports.

*** This is where the problem happens. pppd used to continue trying the old port forever, so if it hadn't changed then it would reconnect. If the modem had changed ports, I was able to kill off pppd, change the port, and re-run it and everything would be back working.

Yet now if it gets disconnected from the other end or the usb has disconnected and reconnected the modem, it comes up with the kernel error, and pppd is killed off (presumably something to do with the error, rather than its own choice). I can't unload the modules, and if I re-run pppd it does run, but does not do anything and I am unable to kill it off at all. The only way out is to reset the machine.

Could it be a bug in the 2.6.11.9 kernel or the cdc-acm module? Or maybe mod-utils, or pppd? I have run RAM tests and processor tests on the host computer, tried re-compiling in case it was a buggy image, and even tried different CDs and different CD drives, and everything checks out. The only software I have changed is the kernel.
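
Since the timeline above has the modem swapping between ttyACM0 and ttyACM1, the "change the port" step could be automated by checking which ttyACM node currently exists before starting pppd. This is a minimal Python sketch under that assumption; pick_acm_port is a made-up helper name, and it only chooses a device name, it does not start or stop pppd:

```python
import re

ACM_RE = re.compile(r"^ttyACM(\d+)$")


def pick_acm_port(names):
    """Return the lowest-numbered ttyACM name in names, or None."""
    ports = []
    for name in names:
        match = ACM_RE.match(name)
        if match:
            # Sort on the numeric suffix so ttyACM2 beats ttyACM10.
            ports.append((int(match.group(1)), name))
    return min(ports)[1] if ports else None
```

Calling pick_acm_port(os.listdir("/dev")) (after import os) would give the name to pass to pppd, assuming only one ACM modem is attached.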

--------------------------------

Edit: Just had it happen 15 minutes ago. pppd quit and the USB disconnected the modem, but I was able to re-run pppd with no troubles. It might be slightly intermittent, although it has happened every time over the past couple of days.

I am going ahead with the 2.6.11.11 kernel to see if that helps...will know late tonight and post if it's a success.

Last edited by lukebeales; 05-29-2005 at 12:12 AM.
 
Old 06-02-2005, 10:54 AM   #4
lukebeales
Member
 
Registered: Oct 2003
Location: Australia
Distribution: Slackware/LFS/Ubuntu
Posts: 89

Original Poster
Rep: Reputation: 15
Well, the new kernel is in play (2.6.11.11), and I just got the same problem. I'll post the logs below; the two overlap, by the way (messages and syslog):

Code:
Jun  2 15:25:27 (none) kernel: hub 2-0:1.0: port 2 disabled by hub (EMI?), re-enabling...
Jun  2 15:25:28 (none) kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Jun  2 15:25:28 (none) kernel:  printing eip:
Jun  2 15:25:28 (none) kernel: c020a8a2
Jun  2 15:25:28 (none) kernel: *pde = 00000000
Jun  2 15:25:28 (none) kernel: Oops: 0000 [#1]
Jun  2 15:25:28 (none) kernel: Modules linked in: bsd_comp ppp_deflate ppp_async ppp_generic slhc cdc_acm cifs smbfs parport_pc parport usblp 8139too tvaudio tuner bttv video_buf firmware_class i2c_algo_bit btcx_risc tveeprom i2c_core lirc_serial lirc_dev sermouse psmouse atkbd libps2 serport i8042 serio mousedev evdev usbhid usbserial uhci_hcd ohci_hcd ehci_hcd usbcore snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_mpu401_uart snd_rawmidi snd_seq_device snd_intel8x0 snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd soundcore pcspkr
Jun  2 15:25:28 (none) kernel: CPU:    0
Jun  2 15:25:29 (none) kernel: EIP:    0060:[<c020a8a2>]    Tainted: GF     VLI
Jun  2 15:25:29 (none) kernel: EFLAGS: 00010286   (2.6.11.11) 
Jun  2 15:25:29 (none) kernel: eax: 00000000   ebx: 00000001   ecx: ffffffff   edx: cb1420b8
Jun  2 15:25:29 (none) kernel: esi: ca8bca18   edi: 00000000   ebp: cb1420b8   esp: c57fcde0
Jun  2 15:25:29 (none) kernel: ds: 007b   es: 007b   ss: 0068
Jun  2 15:25:29 (none) kernel: Process pppd (pid: 2904, threadinfo=c57fc000 task=c32f5a00)
Jun  2 15:25:29 (none) kernel: Stack: ca8bca00 cb142094 c020a90e cb1420b8 000003ad ca8bca00 ca8bca18 cb142094 
Jun  2 15:25:29 (none) kernel:        000003ad c02329c6 cb1420b8 000000d0 00000000 c952818c 00000000 00000000 
Jun  2 15:25:29 (none) kernel:        ca8bca00 c976a837 c976a853 c976a800 c020b275 c03c0500 c9528194 ca8bca18 
Jun  2 15:25:29 (none) kernel: Call Trace:
Jun  2 15:25:29 (none) kernel:  [<c020a90e>]
Jun  2 15:25:29 (none) kernel:  [<c02329c6>]
Jun  2 15:25:29 (none) kernel:  [<c020b275>]
Jun  2 15:25:29 (none) kernel:  [<c020aba7>]
Jun  2 15:25:29 (none) kernel:  [<c0232cfb>]
Jun  2 15:25:29 (none) kernel:  [<c0232d1d>]
Jun  2 15:25:29 (none) kernel:  [<d0a1f3a4>]
Jun  2 15:25:29 (none) kernel:  [<c0221b92>]
Jun  2 15:25:29 (none) kernel:  [<c010d8ff>]
Jun  2 15:25:29 (none) kernel:  [<c0110029>]
Jun  2 15:25:29 (none) kernel:  [<c011164b>]
Jun  2 15:25:29 (none) kernel:  [<c03464c8>]
Jun  2 15:25:29 (none) kernel:  [<c010c6ae>]
Jun  2 15:25:29 (none) kernel:  [<c0345e9d>]
Jun  2 15:25:29 (none) kernel:  [<c0142804>]
Jun  2 15:25:29 (none) kernel:  [<c0222392>]
Jun  2 15:25:29 (none) kernel:  [<c01360f9>]
Jun  2 15:25:29 (none) kernel:  [<c0134f63>]
Jun  2 15:25:29 (none) kernel:  [<c0134faf>]
Jun  2 15:25:29 (none) kernel:  [<c0101e57>]
Jun  2 15:25:29 (none) kernel: Code: 56 e8 93 ff ff ff 89 c3 58 85 db 74 07 56 e8 4f 57 f5 ff 5e 89 d8 5b 5e c3 57 53 8b 54 24 0c bb 01 00 00 00 8b 3a 31 c0 83 c9 ff <f2> ae f7 d1 49 8b 52 24 8d 5c 0b 01 85 d2 75 e9 89 d8 5b 5f c3
Code:
Jun  2 15:18:49 (none) -- MARK --
Jun  2 15:25:27 (none) kernel: usb 2-2: USB disconnect, address 2
Jun  2 15:25:27 (none) pppd[2904]: Hangup (SIGHUP)
Jun  2 15:25:27 (none) pppd[2904]: Modem hangup
Jun  2 15:25:27 (none) pppd[2904]: Connect time 84.4 minutes.
Jun  2 15:25:27 (none) pppd[2904]: Sent 13655914 bytes, received 46880023 bytes.
Jun  2 15:25:27 (none) pppd[2904]: Connection terminated.
Jun  2 15:25:27 (none) kernel: usb 2-2: new full speed USB device using uhci_hcd and address 3
Jun  2 15:25:27 (none) kernel: cdc_acm 2-2:1.0: ttyACM1: USB ACM device
Jun  2 15:26:12 (none) pppd[3342]: pppd 2.4.3 started by root, uid 0
Jun  2 15:26:12 (none) pppd[3342]: Removed stale lock on input_ttyACM0 (pid 2904)

The very first line (Jun 2 15:25:27 (none) kernel: hub 2-0:1.0: port 2 disabled by hub (EMI?), re-enabling...) seems to be caused either by the thing timing out on its own, or sometimes by people switching lights on/off and fan switches. This last problem was a fan switch. I believe it's possible to disconnect the usb cable and reconnect it manually without any problems, but I will test that another time.

The problem is that when pppd tries to connect after this glitch, it runs and then freezes, with no way to kill it. The cdc_acm module will not unload either - the only way out is a reboot.
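
The second log above shows cdc_acm announcing the modem under a new name (ttyACM1) right after the USB disconnect. Here is a minimal Python sketch for pulling those announcements out of a log so the port change can be spotted; the regular expression is modelled only on the one line shown in this thread, and acm_bindings is a hypothetical helper name:

```python
import re

# Modelled on: "cdc_acm 2-2:1.0: ttyACM1: USB ACM device"
ACM_BIND_RE = re.compile(r"cdc_acm \S+: (ttyACM\d+): USB ACM device")


def acm_bindings(lines):
    """Return the ttyACM names announced by cdc_acm, in log order."""
    found = []
    for line in lines:
        match = ACM_BIND_RE.search(line)
        if match:
            found.append(match.group(1))
    return found
```

The last element of the returned list is the name the modem most recently came up as.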

One thing I have changed is that I used to have the USB ohci/uhci/etc. drivers and cdc_acm built into the kernel (all with 2.6.7). Could the fact that they are now modules be causing this? Thanks for any help. I'll be glad when this is over!
 
Old 06-04-2005, 11:54 AM   #5
Ablaoublas
LQ Newbie
 
Registered: Jun 2005
Posts: 4

Rep: Reputation: 0
pppd problems

I think I have the same problem as you, though I don't have a console to see what happens.
I have a custom distro based on Debian,
kernel 2.6.11
and pppd version 2.4.3.
The system halts completely!!
 
Old 06-04-2005, 12:21 PM   #6
lukebeales
Member
 
Registered: Oct 2003
Location: Australia
Distribution: Slackware/LFS/Ubuntu
Posts: 89

Original Poster
Rep: Reputation: 15
My system doesn't halt completely. In fact, everything works perfectly, except that pppd freezes when trying to talk to the modem after this has happened.

What I have done is compile the uhci/ohci/etc. drivers into the kernel, along with cdc-acm. I have only tried it for about 24 hours now, and while the modem still resets (due to lights/fans and other electrical interference), it comes back flawlessly, with nothing written to the logs. If pppd didn't quit before its due time, I wouldn't have any clue that this problem was happening. The modem still changes USB IDs though. The length of the USB cable didn't seem to make any difference either.

So if this continues working, then pppd freezing only happens if cdc-acm and/or the uhci/ohci drivers are built as modules and the USB hub decides to reset due to EMI. Hopefully this helps someone in the future at least!
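
To check whether a given kernel build has these drivers built in or as modules, the kernel's .config can be inspected. This is a minimal Python sketch; the option names are the usual Kconfig symbols for cdc-acm and the UHCI/OHCI/EHCI host controllers, and usb_build_modes is a made-up helper:

```python
import re

USB_OPTIONS = (
    "CONFIG_USB_ACM",
    "CONFIG_USB_UHCI_HCD",
    "CONFIG_USB_OHCI_HCD",
    "CONFIG_USB_EHCI_HCD",
)


def usb_build_modes(config_text, options=USB_OPTIONS):
    """Map each option to 'y' (built in), 'm' (module) or 'n' (not set)."""
    modes = {opt: "n" for opt in options}
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=([ym])$", line.strip())
        if match and match.group(1) in modes:
            modes[match.group(1)] = match.group(2)
    return modes
```

Per the experience described above, the interesting case is any of these showing up as 'm' rather than 'y'.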
 
Old 06-05-2005, 02:46 AM   #7
Ablaoublas
LQ Newbie
 
Registered: Jun 2005
Posts: 4

Rep: Reputation: 0
No luck

I recompiled the kernel with USB built in, but
unfortunately it didn't work.
At some point the call is interrupted and the system halts,
and I mean halts completely.

I have kernel version 2.6.11_10.

By the way, I noticed that the system also crashes if I unplug
the modem USB cable from the system,
but only if mgetty is running on /dev/usb/acm/ttyACM0.

The combination of an mgetty watching ttyACM0
and a USB unplug causes the kernel to crash.

If mgetty is not running, plugging and unplugging the modem
is OK. I don't get any problem.

Anyone have any ideas?

Last edited by Ablaoublas; 06-05-2005 at 04:08 AM.
 
Old 06-05-2005, 06:06 AM   #8
lukebeales
Member
 
Registered: Oct 2003
Location: Australia
Distribution: Slackware/LFS/Ubuntu
Posts: 89

Original Poster
Rep: Reputation: 15
Not sure why your whole machine would be stopping like that. Are you sure you're not able to get any logs at all? What I would do in that situation is run the usual RAM and processor tests (http://www.ultimatebootcd.com). Without logs, though, it's a little difficult to know where to look after that, sorry.
 
Old 06-05-2005, 07:07 AM   #9
Ablaoublas
LQ Newbie
 
Registered: Jun 2005
Posts: 4

Rep: Reputation: 0
Of course I get logs.
The kernel dumps a stack
trace on the console.
The last part looks like this:


Code: 8b 93 ... bla bla bla

<1> Unable to handle kernel NULL
pointer dereference at virtual
address 00000000020

printing eip:
C0116588
*pde=0000000000
Oops: 0000 [#16]
SMP
Modules linked in : evdev ....... bla bla bla

CPU: 0
EIP: 0060:[<c0116588>] Not tainted VLI
EFLAGS: 00010282 (2.6.11.11.PENTIUM-M)
EIP is at m_release + 0x38/0xa0


Do you know if it also dumps it to a file?

Last edited by Ablaoublas; 06-05-2005 at 07:31 AM.
 
Old 06-05-2005, 07:36 AM   #10
lukebeales
Member
 
Registered: Oct 2003
Location: Australia
Distribution: Slackware/LFS/Ubuntu
Posts: 89

Original Poster
Rep: Reputation: 15
If it's spitting it out to the console then you're probably not running the syslog daemon which would store it in /var/log/messages or /var/log/syslog.

I noticed it mentioned SMP there, though. If compiling the kernel without any SMP or preempt support at all isn't too much trouble, then try that and see if it helps. That's about all I can get from that info.
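
Once syslog is writing to /var/log/messages or /var/log/syslog, the oops reports can be pulled out of the file rather than copied from the console. This is a minimal Python sketch, assuming each report starts at the "Unable to handle kernel" line and ends at its "Code:" line as in the dumps earlier in this thread; extract_oops_blocks is a hypothetical helper name:

```python
def extract_oops_blocks(lines):
    """Group syslog lines into oops reports.

    A block starts at a line containing "Unable to handle kernel" and
    ends at the next line containing "Code:" (inclusive). A block that
    never reaches a "Code:" line is dropped.
    """
    blocks = []
    current = None
    for line in lines:
        if "Unable to handle kernel" in line:
            current = [line]  # start (or restart) a block
        elif current is not None:
            current.append(line)
            if "Code:" in line:
                blocks.append(current)
                current = None
    return blocks
```

Each returned block is a list of lines that can be posted as-is.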
 
Old 06-05-2005, 09:25 AM   #11
Ablaoublas
LQ Newbie
 
Registered: Jun 2005
Posts: 4

Rep: Reputation: 0
Nothing, thanks anyway!
I'll start trying older versions.

It seems to be more stable with version 2.4.(37-something).
I can unplug now without a kernel crash.
But this version has a problem with ups-hid: I can't connect
to my MGE UPS now.
What a mess!!!!


Last edited by Ablaoublas; 06-05-2005 at 04:31 PM.
 
  


All times are GMT -5. The time now is 10:55 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration