LinuxQuestions.org

Debian 10.4 crash, syslog.1 inside - need help reading the data (https://www.linuxquestions.org/questions/linux-newbie-8/debian-10-4-crash-syslog-1-inside-need-help-reading-the-data-4175676301/)

shuv1t 06-01-2020 04:09 PM

My system froze after being idle for a while. I don't think the apps I had open matter, but if they do I'll add that later. When I tried interacting, I could move the mouse around but clicks didn't do anything, and neither did my keyboard shortcuts. This is the snippet from syslog.1 that shows the crash (20:35:54) and some AFK time until I hard reset (23:05:21):

Code:

May 31 20:35:54 debian kernel: [214213.676750] xhci_hcd 0000:08:00.3: xHCI host not responding to stop endpoint command.
May 31 20:35:54 debian kernel: [214213.692757] xhci_hcd 0000:08:00.3: Host halt failed, -110
May 31 20:35:54 debian kernel: [214213.692758] xhci_hcd 0000:08:00.3: xHCI host controller not responding, assume dead
May 31 20:35:54 debian kernel: [214213.692785] xhci_hcd 0000:08:00.3: HC died; cleaning up
May 31 20:35:54 debian kernel: [214213.692856] usb 5-2: USB disconnect, device number 2
May 31 20:36:00 debian kernel: [214219.480769] rcu: INFO: rcu_sched self-detected stall on CPU
May 31 20:36:00 debian kernel: [214219.480777] rcu:    11-....: (5249 ticks this GP) idle=1ce/1/0x4000000000000002 softirq=5108446/5108446 fqs=2623
May 31 20:36:00 debian kernel: [214219.480779] rcu:      (t=5250 jiffies g=12668021 q=233)
May 31 20:36:00 debian kernel: [214219.480783] NMI backtrace for cpu 11
May 31 20:36:00 debian kernel: [214219.480786] CPU: 11 PID: 21913 Comm: ThreadPoolForeg Tainted: P          OE    4.19.0-9-amd64 #1 Debian 4.19.118-2
May 31 20:36:00 debian kernel: [214219.480787] Hardware name: System manufacturer System Product Name/TUF B350M-PLUS GAMING, BIOS 4602 03/08/2019
May 31 20:36:00 debian kernel: [214219.480787] Call Trace:
May 31 20:36:00 debian kernel: [214219.480790]  <IRQ>
May 31 20:36:00 debian kernel: [214219.480796]  dump_stack+0x66/0x90
May 31 20:36:00 debian kernel: [214219.480799]  nmi_cpu_backtrace.cold.4+0x13/0x50
May 31 20:36:00 debian kernel: [214219.480803]  ? lapic_can_unplug_cpu.cold.31+0x37/0x37
May 31 20:36:00 debian kernel: [214219.480805]  nmi_trigger_cpumask_backtrace+0xf9/0xfb
May 31 20:36:00 debian kernel: [214219.480808]  rcu_dump_cpu_stacks+0x9b/0xcb
May 31 20:36:00 debian kernel: [214219.480810]  rcu_check_callbacks.cold.81+0x1db/0x335
May 31 20:36:00 debian kernel: [214219.480813]  ? tick_sched_do_timer+0x60/0x60
May 31 20:36:00 debian kernel: [214219.480815]  update_process_times+0x28/0x60
May 31 20:36:00 debian kernel: [214219.480817]  tick_sched_handle+0x22/0x60
May 31 20:36:00 debian kernel: [214219.480818]  tick_sched_timer+0x37/0x70
May 31 20:36:00 debian kernel: [214219.480820]  __hrtimer_run_queues+0x100/0x280
May 31 20:36:00 debian kernel: [214219.480823]  hrtimer_interrupt+0x100/0x220
May 31 20:36:00 debian kernel: [214219.480826]  smp_apic_timer_interrupt+0x6a/0x140
May 31 20:36:00 debian kernel: [214219.480827]  apic_timer_interrupt+0xf/0x20
May 31 20:36:00 debian kernel: [214219.480829]  </IRQ>
May 31 20:36:00 debian kernel: [214219.480831] RIP: 0010:smp_call_function_many+0x1f8/0x250
May 31 20:36:00 debian kernel: [214219.480832] Code: c7 e8 9c 4e 5f 00 3b 05 fa 5c 01 01 0f 83 8c fe ff ff 48 63 d0 48 8b 0b 48 03 0c d5 20 c7 8e ba 8b 51 18 $
May 31 20:36:00 debian kernel: [214219.480834] RSP: 0018:ffffaa14893e7d18 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
May 31 20:36:00 debian kernel: [214219.480835] RAX: 0000000000000006 RBX: ffff8a034e8e3080 RCX: ffff8a034e7a7320
May 31 20:36:00 debian kernel: [214219.480836] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8a034e8e3088
May 31 20:36:00 debian kernel: [214219.480837] RBP
May 31 23:05:21 debian systemd[1]: Starting Flush Journal to Persistent Storage...
May 31 23:05:21 debian systemd-tmpfiles[401]: [/usr/lib/tmpfiles.d/speech-dispatcher.conf:1] Line references path below legacy directory /var/run/ [...]

After googling the xhci_hcd failure and seeing "usb 5-2: USB disconnect", I assume this error is related to one of my USB devices.

My question is: can I gather anything more than just that from these logs?
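
One more thing that can be read from the snippet is the header line of the backtrace, which names the CPU, the process and the kernel's taint flags. A rough Python sketch of pulling those fields out follows. The regex and the `TAINT_MEANINGS` table are written only for this illustration, and the table covers just the three letters that appear in this log (meanings as given in the kernel's tainted-kernels documentation).

```python
import re

# Header line of the backtrace, copied from the syslog.1 snippet above.
HEADER = (
    "May 31 20:36:00 debian kernel: [214219.480786] CPU: 11 PID: 21913 "
    "Comm: ThreadPoolForeg Tainted: P          OE    4.19.0-9-amd64 #1 Debian 4.19.118-2"
)

# Illustration-only table: just the flags seen in this particular log.
TAINT_MEANINGS = {
    "P": "a proprietary module was loaded",
    "O": "an out-of-tree module was loaded",
    "E": "an unsigned module was loaded",
}

# Hypothetical pattern for this one line; other kernels may lay it out differently.
PATTERN = re.compile(
    r"CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) Comm: (?P<comm>\S+) "
    r"Tainted: (?P<taint>[A-Z ]+?)\s+(?P<kernel>\d\S*)"
)

match = PATTERN.search(HEADER)
if match is None:
    raise ValueError("header line did not match the expected layout")

info = match.groupdict()
flags = [letter for letter in info["taint"] if letter != " "]

# Comm is capped at 15 characters by the kernel, so this name is cut short.
print(f"CPU {info['cpu']}, PID {info['pid']}, task {info['comm']}, kernel {info['kernel']}")
for letter in flags:
    print(f"  taint {letter}: {TAINT_MEANINGS.get(letter, 'not in this table')}")
```

The taint flags do not prove that any of those modules caused the freeze. They only say what kind of modules were loaded when the trace was printed.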

JeremyBoden 06-01-2020 08:58 PM

I'd suggest that the
Code:

rcu: INFO: rcu_sched self-detected stall on CPU
which took place about 6 seconds after the USB event is likely to be the cause (and not necessarily connected with the USB message).

If you can't replicate it, then it's very hard to fix.
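
The gap of about 6 seconds can be checked from the bracketed kernel timestamps, which count seconds since boot. Below is a minimal Python sketch using two lines copied from the snippet in the first post. The helper name `kernel_seconds` is made up for illustration, and `ASSUMED_HZ = 250` is an assumption about how this kernel was built, not something the log states.

```python
import re

# Two lines copied from the syslog.1 snippet in the first post.
USB_LINE = "May 31 20:35:54 debian kernel: [214213.692856] usb 5-2: USB disconnect, device number 2"
RCU_LINE = "May 31 20:36:00 debian kernel: [214219.480769] rcu: INFO: rcu_sched self-detected stall on CPU"

# The bracketed number is the kernel's own clock: seconds since boot.
STAMP = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def kernel_seconds(line):
    """Return the seconds-since-boot stamp of a kernel log line (hypothetical helper)."""
    match = STAMP.search(line)
    if match is None:
        raise ValueError("no kernel timestamp in line")
    return float(match.group(1))


gap = kernel_seconds(RCU_LINE) - kernel_seconds(USB_LINE)
print(f"stall reported {gap:.2f} s after the USB disconnect")

# The rcu line also says "t=5250 jiffies". Converting jiffies to seconds
# needs the kernel tick rate; 250 is ASSUMED here, it is not in the log.
ASSUMED_HZ = 250
stall_seconds = 5250 / ASSUMED_HZ
print(f"CPU had been stalled for about {stall_seconds:.0f} s when it was reported")
```

If the tick rate assumption holds, the CPU had already been stuck for longer than the gap between the two messages, which would mean the stall started before the xHCI errors were logged. With a different tick rate that conclusion changes, so treat it as a hint only.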

cordx 06-01-2020 09:34 PM

agreed that if you can't reproduce the error, it will be hard to troubleshoot.

you might get a more rounded picture of what was going on at the time with other log viewers like journalctl or dmesg. for dmesg, try dmesg -H --level=err (warn is also an option, and you can use both with --level=err,warn). journalctl has a similar -p option where you can look at different priority levels: "emerg" (0), "alert" (1), "crit" (2), "err" (3), "warning" (4), "notice" (5), "info" (6), "debug" (7) (from the man page), as well as a way to filter for a certain time with --since and --until as shown on the page in the link.
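
The dmesg and journalctl options mentioned above can be put together as follows. This is a minimal Python sketch that only builds and prints the command lines, it does not run them. The function names are made up for illustration, and the time window is an assumption taken from the timestamps in the first post (the year comes from the post date).

```python
import shlex


def dmesg_cmd(levels=("err", "warn")):
    """Build a dmesg call limited to the given levels (hypothetical helper)."""
    return ["dmesg", "-H", "--level=" + ",".join(levels)]


def journalctl_cmd(priority="err", since=None, until=None):
    """Build a journalctl call filtered by priority and time (hypothetical helper)."""
    cmd = ["journalctl", "-p", priority]
    if since:
        cmd += ["--since", since]
    if until:
        cmd += ["--until", until]
    return cmd


# Assumed window: a few minutes before the freeze until just after the reset.
window_start = "2020-05-31 20:30:00"
window_end = "2020-05-31 23:10:00"

for cmd in (dmesg_cmd(), journalctl_cmd("err", window_start, window_end)):
    # shlex.join quotes the arguments so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Note that dmesg only shows the kernel ring buffer of the current boot, so after a hard reset it cannot show the messages from before the freeze.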

shuv1t 06-01-2020 10:39 PM

Thank you for the replies. journalctl wouldn't show me anything from before the reset, and dmesg only returned a "Watchdog hardware is disabled" error, also from after the reset. I'm fine with the crash as long as it doesn't repeat, and if it does, that makes it more likely I'll be able to pinpoint the cause.

syg00 06-01-2020 10:47 PM

Something plugged into USB3 didn't like not being answered. If you're lucky it might just be an unlikely coincidence.


All times are GMT -5.