During a load and unload of my kernel module, I see the following message:
[ 591.578557] Uhhuh. NMI received for unknown reason 21 on CPU 0.
[ 591.578557] Do you have a strange power saving mode enabled?
[ 591.578557] Dazed and confused, but trying to continue
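As far as I can tell, the "reason" value is printed in hex and is the byte the kernel reads from I/O port 0x61. Below is a minimal sketch of how I decoded it, assuming the mask values from arch/x86/include/asm/mach_traps.h. The function name decode_nmi_reason is my own and only for illustration.

```python
# Mask values as defined in arch/x86/include/asm/mach_traps.h.
NMI_REASON_SERR = 0x80   # PCI system error (SERR#) reported
NMI_REASON_IOCHK = 0x40  # I/O channel check reported


def decode_nmi_reason(hex_text):
    """Return the NMI causes the kernel would recognise in the reason byte.

    hex_text is the value as printed in the log (hexadecimal, e.g. "21").
    An empty list means neither SERR nor IOCHK is set, which is the case
    the kernel reports as "unknown reason".
    """
    reason = int(hex_text, 16)
    causes = []
    if reason & NMI_REASON_SERR:
        causes.append("PCI SERR")
    if reason & NMI_REASON_IOCHK:
        causes.append("IOCHK")
    return causes
```

For the value in my log, decode_nmi_reason("21") returns an empty list, so neither of the two sources the kernel checks on that port is flagged.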
I have been trying to debug this on my own, to no avail. I am running CentOS:
[root@bitterroot ~]# uname -a
Linux bitterroot 3.10.0-123.20.1.el7.acpi_debug2.x86_64 #1 SMP Tue Mar 31 09:43:45 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@bitterroot ~]#
I am running a test in which I repeatedly load and unload my kernel
module for a PCIe Gen3 x8 network interface card.
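The test is roughly equivalent to the sketch below. It is not my actual test script. The module name my_nic_driver is a placeholder, and it assumes modprobe, rmmod and dmesg are available and that it runs as root.

```python
import subprocess

MODULE = "my_nic_driver"  # placeholder name, not the real driver
NMI_MARKER = "NMI received for unknown reason"


def count_nmi_lines(dmesg_text):
    """Count kernel log lines that contain the unknown-NMI message."""
    return sum(1 for line in dmesg_text.splitlines() if NMI_MARKER in line)


def run_cycles(iterations, run=subprocess.run):
    """Load and unload MODULE repeatedly.

    Returns the 1-based iteration after which a new unknown-NMI line first
    shows up in dmesg, or None if none appears. The `run` parameter exists
    only so the loop can be exercised without touching a real system.
    """
    def dmesg():
        return run(["dmesg"], capture_output=True, text=True, check=True).stdout

    seen = count_nmi_lines(dmesg())
    for i in range(1, iterations + 1):
        run(["modprobe", MODULE], check=True)
        run(["rmmod", MODULE], check=True)
        if count_nmi_lines(dmesg()) > seen:
            return i
    return None
```

Calling run_cycles(100) as root would run up to 100 load/unload cycles. The count comparison is only approximate, since older lines can drop out of the kernel ring buffer when it wraps.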
I have no ACPI kernel background and want to better understand what hardware or firmware issues could trigger this failure.
I have tried many ACPI debug_layer and debug_level flags, but
unfortunately I do not see any difference in the ACPI debug output between runs where the test passes and runs where the failure happens.
I do know that the "Dazed and confused" message appears as part of loading my module.
I also notice that some time later, as part of unload, lspci shows a
completion timeout on the root port where my NIC is attached.
I have attached the dmesg output, lspci output, and dsdt.dsl from my system.
I am totally stumped on this one and not sure what to try as next
debug steps. I am hoping you have some ideas on things I can try to
debug further.
https://drive.google.com/folderview?...&usp=drive_web
Thank you in advance.