pppd causing kernel crash
Hi all,
First off, an intro: I have a computer running a custom CD distro that connects to the internet through a USB ISDN modem (it uses the cdc-acm module) with pppd. It worked fine up until the well-known cdc-acm troubles in the kernel since 2.6.8. I have now migrated it to 2.6.11.9 and the computer dials up fine, but if it disconnects it comes up with this when it tries to reconnect:
Code:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
Here's a list of the modules loaded BEFORE the crash happens; I will update this with another one after it happens again.
Code:
Module Size Used by |
Quote:
Now the problem is I don't see this address in the stack or the call trace. Is there further output after the Code: section that you did not show us? As it is, at least with my limited understanding, I can't diagnose what caused the crash from this output. |
The output is copied straight from the logs, there isn't any other mention of anything going wrong - as in samba still works, networking still works...I can still play videos and things. Just pppd won't run anymore, and it won't allow me to unload the modules.
I can tell you how it happens though; I'll make a timeline:
1. It will be sitting on the net with pppd and chat running, with the modem on ttyACM0.
2. Either the ISP will send a hangup to us (every 4 hours), OR the kernel (through electrical interference or otherwise) will detect or cause a USB hub timeout, which then re-initializes the modem as ttyACM1 (or as ttyACM0 if it was previously on ttyACM1; it swaps between the two).
3. Since pppd is set to persist, it should stay open and redial automatically if the modem hasn't changed ports.
*** This is where the problem happens. pppd used to keep trying the old port forever, so if the port hadn't changed it would reconnect. If the modem had changed ports, I was able to kill off pppd, change the port, re-run it, and everything would be back working. Yet now, if it gets disconnected from the other end or the USB hub has disconnected and reconnected the modem, it comes up with the kernel error and pppd is killed off (presumably because of the error, rather than by its own choice). I can't unload the modules, and if I re-run pppd it does run, but it does not do anything and I am unable to kill it off at all. The only way out is to reset the machine.
Could it be a bug in the 2.6.11.9 kernel or the cdc-acm module? Or maybe mod-utils, or pppd? I have run RAM tests and processor tests on the host computer, tried re-compiling in case it was a buggy image, and even tried different CDs and different CD drives, and everything checks out. The only software I have changed is the kernel.
--------------------------------
Edit: Just had it happen 15 minutes ago: pppd quit and the USB hub disconnected the modem, but I was able to re-run pppd with no troubles. It might be slightly intermittent, although it has happened every time over the past couple of days. I am going ahead with the 2.6.11.11 kernel to see if that helps. I will know late tonight and post if it's a success. |
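For anyone doing the "kill pppd, change the port, re-run it" step by hand, here is a minimal sketch of the port-picking part in Python. It assumes the modem only ever shows up as /dev/ttyACM0 or /dev/ttyACM1 (the two names mentioned above). The function name and the candidate list are made up for illustration, and it does not start pppd itself.

```python
import os

# Assumed candidate device nodes; the thread only mentions ttyACM0 and ttyACM1.
CANDIDATE_PORTS = ["/dev/ttyACM0", "/dev/ttyACM1"]


def pick_modem_port(candidates=CANDIDATE_PORTS, exists=os.path.exists):
    """Return the first candidate device node that currently exists, else None.

    `exists` is injectable so the logic can be checked without real hardware.
    """
    for port in candidates:
        if exists(port):
            return port
    return None


if __name__ == "__main__":
    port = pick_modem_port()
    if port is None:
        print("no ttyACM device present")
    else:
        # The tty name found here is what would be handed to pppd;
        # the remaining pppd options are site-specific and not shown.
        print("modem is on", port)
```

This only covers the case where the modem has moved ports; it does nothing about the frozen pppd or the kernel error itself.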
Well the new kernel is in play (2.6.11.11), and just got the same problem. I'll post the logs below, they both overlap by the way (messages and syslog):
Code:
Jun 2 15:25:27 (none) kernel: hub 2-0:1.0: port 2 disabled by hub (EMI?), re-enabling...
Code:
Jun 2 15:18:49 (none) -- MARK --
The very first line (Jun 2 15:25:27 (none) kernel: hub 2-0:1.0: port 2 disabled by hub (EMI?), re-enabling...) seems to be caused either by the thing timing out on its own, or sometimes by people switching lights and fan switches on and off. This last occurrence was a fan switch. I believe it's possible to disconnect the USB cable and reconnect it manually without any problems, but I will test that another time.
The problem is that when pppd tries to connect after this glitch, it runs and then freezes, with no way to kill it. The cdc_acm module will not unload either; the only way out is a reboot.
One thing I have changed is that I used to have the USB ohci/uhci/etc. drivers and cdc_acm built in to the kernel (all with 2.6.7). Could the fact that they are now modules be what is causing this? Thanks for any help - I'll be glad when this is over! |
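To see how often the hub glitch happens, the "disabled by hub (EMI?)" lines can be pulled out of the logs. A rough sketch in Python follows. The pattern is written only against the single log line quoted above, so other kernels may word the message differently, and the function name is hypothetical.

```python
import re

# Pattern based solely on the one log line quoted in this thread.
HUB_RESET = re.compile(
    r"hub (?P<hub>\S+): port (?P<port>\d+) disabled by hub \(EMI\?\), re-enabling"
)


def find_hub_resets(lines):
    """Return a list of (hub_id, port_number) for every hub-reset line found."""
    hits = []
    for line in lines:
        match = HUB_RESET.search(line)
        if match:
            hits.append((match.group("hub"), int(match.group("port"))))
    return hits


if __name__ == "__main__":
    sample = [
        "Jun 2 15:18:49 (none) -- MARK --",
        "Jun 2 15:25:27 (none) kernel: hub 2-0:1.0: port 2 disabled by hub (EMI?), re-enabling...",
    ]
    print(find_hub_resets(sample))
```

Feeding it the lines of /var/log/messages or /var/log/syslog would give one entry per glitch, which can then be compared with the times pppd froze.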
pppd problems
I think I have the same problem as you, though I don't have a console to see what happens.
I have a custom dist based on Debian, with kernel 2.6.11 and pppd version 2.4.3. The system halts completely!! |
My system doesn't halt completely; in fact everything works perfectly, except that pppd freezes when trying to talk to the modem after this has happened.
What I have done is compile the uhci/ohci/etc. drivers in to the kernel, along with cdc-acm. I have only tried it for about 24 hours now, and while the modem still resets (due to lights, fans and other electrical interference) it comes back flawlessly, with nothing written to the logs. If pppd didn't quit before its due time, I wouldn't have any clue that this problem was happening. The modem still changes USB IDs though. The length of the USB cable didn't seem to make any difference either. So if this continues working, then pppd freezing only happens if cdc-acm and/or the uhci/ohci drivers are built as modules and the USB hub decides to reset due to EMI. Hopefully this helps someone in the future at least! |
No luck
I recompiled the kernel with USB built in, but unfortunately it didn't work. At some point the call is interrupted and the system halts, and I mean halts completely. I have kernel version 2.6.11_10. By the way, I noticed that the system also crashes if I unplug the modem's USB cable from the system, but only if this is running:
mgetty /dev/usb/acm/ttyACM0
The combination of mgetty watching ttyACM0 and a USB unplug causes the kernel to crash. If mgetty is not running, plugging and unplugging the modem is fine; I don't get any problem. Anyone have any ideas? |
Not sure why your whole machine would be stopping like that - are you sure you're not able to get any logs at all? What I would do in that situation is run the usual RAM and processor tests (http://www.ultimatebootcd.com). Without logs, though, it's a little difficult to know where to look after that, sorry.
|
Of course I get logs
The kernel dumps a stack trace on the console. The last part looks like this:
Code: 8b 93 ... bla bla bla
<1> Unable to handle kernel NULL pointer dereference at virtual address 00000000020
printing eip:
C0116588
*pde=0000000000
Oops: 0000 [#16]
SMP
Modules linked in : evdev ....... bla bla bla
CPU: 0
EIP: 0060:[<c0116588>] Not tainted VLI
EFLAGS: 00010282 (2.6.11.11.PENTIUM-M)
EIP is at m_release + 0x38/0xa0
Do you know if it also dumps it to a file? |
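For reading the "EIP is at" line: it has the form symbol + offset/size, so here the fault is 0x38 bytes into m_release, a function 0xa0 bytes long. A small Python sketch that pulls those three parts out is below. The regex tolerates the spaces around "+" as typed above, and the function name is just for illustration.

```python
import re

# Matches e.g. "EIP is at m_release + 0x38/0xa0" (spaces around "+" optional).
EIP_LINE = re.compile(
    r"EIP is at (?P<symbol>\w+)\s*\+\s*0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def parse_eip_line(line):
    """Return (symbol, offset, size) from an 'EIP is at' line, or None if no match."""
    match = EIP_LINE.search(line)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
    )


if __name__ == "__main__":
    print(parse_eip_line("EIP is at m_release + 0x38/0xa0"))
```

The symbol name is the useful part when searching for matching bug reports, since the raw address changes with every kernel build.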
If it's spitting it out to the console then you're probably not running the syslog daemon which would store it in /var/log/messages or /var/log/syslog.
I noticed it mentioned SMP there though, if compiling the kernel without any SMP or preempt support at all isn't too much trouble then try that and see if it helps. That's about all I can get from that info :( |
Nothing, thanks anyway!
I'll start trying older versions. It seems to be more stable with version 2.4.(37-something); I can now unplug without a kernel crash. But this version has a problem with ups-hid: I can't connect to my MGE UPS now. What a mess!!!! |