Linux - Hardware
This forum is for Hardware issues. Having trouble installing a piece of hardware? Want to know if that peripheral is compatible with Linux?
01-10-2021, 12:21 PM | #1
Member
Registered: Oct 2003
Distribution: Debian GNU/Linux 11 (amd64) w/kernel 6.0.15
Posts: 299
Unable to use NVMe device on X570-P motherboard; refcount_t and percpu errors
I recently installed a 256GB NVMe drive in the secondary M.2 slot on my Asus X570-P motherboard. After partitioning and formatting it, any I/O to the drive (including mounting its filesystem) produces the following errors in dmesg:
Code:
[ 121.698761] refcount_t: underflow; use-after-free.
[ 121.698772] WARNING: CPU: 8 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0xab/0xf0
[ 121.698773] Modules linked in: rfcomm(E) cmac(E) bnep(E) binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) btusb(E) btrtl(E) btbcm(E) btintel(E) bluetooth(E) crct10dif_pclmul(E) crc32_pclmul(E) rfkill(E) ghash_clmulni_intel(E) jitterentropy_rng(E) aesni_intel(E) crypto_simd(E) cryptd(E) glue_helper(E) efi_pstore(E) drbg(E) ccp(E) ansi_cprng(E) ecdh_generic(E) ecc(E) acpi_cpufreq(E) nft_counter(E) efivarfs(E) crc32c_intel(E)
[ 121.698797] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G E 5.10.6-BET #1
[ 121.698798] Hardware name: System manufacturer System Product Name/PRIME X570-P, BIOS 1405 11/19/2019
[ 121.698801] RIP: 0010:refcount_warn_saturate+0xab/0xf0
[ 121.698802] Code: 05 af d2 72 01 01 e8 7a 06 87 00 0f 0b c3 80 3d 9d d2 72 01 00 75 90 48 c7 c7 78 60 44 a6 c6 05 8d d2 72 01 01 e8 5b 06 87 00 <0f> 0b c3 80 3d 7c d2 72 01 00 0f 85 6d ff ff ff 48 c7 c7 d0 60 44
[ 121.698804] RSP: 0018:ffffa9d980394f30 EFLAGS: 00010086
[ 121.698805] RAX: 0000000000000000 RBX: ffff93c68f858900 RCX: 0000000000000027
[ 121.698806] RDX: 0000000000000027 RSI: ffff93cd7ec12e80 RDI: ffff93cd7ec12e88
[ 121.698807] RBP: ffff93c690bde200 R08: 0000000000000000 R09: c0000000ffffdfff
[ 121.698808] R10: ffffa9d980394d50 R11: ffffa9d980394d48 R12: 0000000000000001
[ 121.698809] R13: ffff93c6941f0600 R14: ffff93c68f78fa00 R15: 0000000000000000
[ 121.698810] FS: 0000000000000000(0000) GS:ffff93cd7ec00000(0000) knlGS:0000000000000000
[ 121.698811] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 121.698812] CR2: 00007f332be16000 CR3: 0000000106d4a000 CR4: 0000000000350ee0
[ 121.698813] Call Trace:
[ 121.698815] <IRQ>
[ 121.698818] nvme_irq+0x104/0x190
[ 121.698822] __handle_irq_event_percpu+0x2e/0xd0
[ 121.698824] handle_irq_event_percpu+0x33/0x80
[ 121.698825] handle_irq_event+0x39/0x70
[ 121.698827] handle_edge_irq+0x7c/0x1a0
[ 121.698830] asm_call_irq_on_stack+0x12/0x20
[ 121.698831] </IRQ>
[ 121.698834] common_interrupt+0xd7/0x160
[ 121.698836] asm_common_interrupt+0x1e/0x40
[ 121.698839] RIP: 0010:cpuidle_enter_state+0xd2/0x2e0
[ 121.698840] Code: e8 93 22 6a ff 31 ff 49 89 c5 e8 29 2c 6a ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 c4 01 00 00 31 ff e8 a2 d8 6f ff fb 45 85 f6 <0f> 88 c9 00 00 00 49 63 ce be 68 00 00 00 4c 2b 2c 24 48 89 ca 48
[ 121.698841] RSP: 0018:ffffa9d980177e80 EFLAGS: 00000202
[ 121.698842] RAX: ffff93cd7ec1ce00 RBX: 0000000000000002 RCX: 000000000000001f
[ 121.698843] RDX: 0000001c55cfa428 RSI: 00000000239f541c RDI: 0000000000000000
[ 121.698844] RBP: ffff93c68ea7f400 R08: 0000000000000002 R09: 000000000001c600
[ 121.698845] R10: 00000077d8356efc R11: ffff93cd7ec1be24 R12: ffffffffa66d38e0
[ 121.698846] R13: 0000001c55cfa428 R14: 0000000000000002 R15: 0000000000000000
[ 121.698849] cpuidle_enter+0x30/0x50
[ 121.698852] do_idle+0x24f/0x290
[ 121.698854] cpu_startup_entry+0x1b/0x20
[ 121.698857] start_secondary+0x10b/0x150
[ 121.698859] secondary_startup_64_no_verify+0xb0/0xbb
[ 121.698861] ---[ end trace 3cff32dbce8f0fd6 ]---
[ 151.779331] nvme nvme1: I/O 159 QID 9 timeout, aborting
[ 151.779344] nvme nvme1: I/O 160 QID 9 timeout, aborting
[ 151.779349] nvme nvme1: I/O 161 QID 9 timeout, aborting
[ 151.779354] nvme nvme1: I/O 162 QID 9 timeout, aborting
[ 151.779368] nvme nvme1: Abort status: 0x0
[ 151.779370] nvme nvme1: Abort status: 0x0
[ 151.779371] nvme nvme1: Abort status: 0x0
[ 151.779373] nvme nvme1: Abort status: 0x0
[ 151.779374] nvme nvme1: I/O 166 QID 9 timeout, aborting
[ 151.779378] nvme nvme1: I/O 167 QID 9 timeout, aborting
[ 151.779382] nvme nvme1: I/O 168 QID 9 timeout, aborting
[ 151.779387] nvme nvme1: Abort status: 0x0
[ 151.779389] nvme nvme1: Abort status: 0x0
[ 151.779390] nvme nvme1: I/O 169 QID 9 timeout, aborting
[ 151.779394] nvme nvme1: Abort status: 0x0
[ 151.779396] nvme nvme1: I/O 170 QID 9 timeout, aborting
[ 151.779402] nvme nvme1: Abort status: 0x0
[ 151.779403] nvme nvme1: I/O 171 QID 9 timeout, aborting
[ 151.779408] nvme nvme1: Abort status: 0x0
[ 151.779410] nvme nvme1: I/O 172 QID 9 timeout, aborting
[ 151.779415] nvme nvme1: Abort status: 0x0
[ 151.779416] nvme nvme1: I/O 173 QID 9 timeout, aborting
[ 151.779420] nvme nvme1: Abort status: 0x0
[ 151.779427] nvme nvme1: Abort status: 0x0
[ 181.987372] nvme nvme1: I/O 159 QID 9 timeout, reset controller
[ 182.015464] nvme nvme1: 15/0/0 default/read/poll queues
[ 212.195476] nvme nvme1: I/O 160 QID 9 timeout, disable controller
[ 212.313646] blk_update_request: I/O error, dev nvme1n1, sector 16350 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 212.313653] blk_update_request: I/O error, dev nvme1n1, sector 16093 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 212.313656] blk_update_request: I/O error, dev nvme1n1, sector 15836 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 212.313658] blk_update_request: I/O error, dev nvme1n1, sector 15579 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 212.313660] blk_update_request: I/O error, dev nvme1n1, sector 15322 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 212.313662] blk_update_request: I/O error, dev nvme1n1, sector 15065 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 212.313663] blk_update_request: I/O error, dev nvme1n1, sector 14808 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 212.313665] blk_update_request: I/O error, dev nvme1n1, sector 14551 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 212.313667] blk_update_request: I/O error, dev nvme1n1, sector 14294 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 212.313669] blk_update_request: I/O error, dev nvme1n1, sector 14037 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 212.313702] nvme nvme1: failed to mark controller live state
[ 212.313705] nvme nvme1: Removing after probe failure status: -19
[ 212.323510] Aborting journal on device dm-0-8.
[ 212.323518] Buffer I/O error on dev dm-0, logical block 25198592, lost sync page write
[ 212.323521] JBD2: Error -5 detected when updating journal superblock for dm-0-8.
Code:
[ 212.344431] percpu ref (hd_struct_free) <= 0 (-28) after switching to atomic
[ 212.344438] WARNING: CPU: 6 PID: 0 at lib/percpu-refcount.c:196 percpu_ref_switch_to_atomic_rcu+0x139/0x140
[ 212.344439] Modules linked in: rfcomm(E) cmac(E) bnep(E) binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) btusb(E) btrtl(E) btbcm(E) btintel(E) bluetooth(E) crct10dif_pclmul(E) crc32_pclmul(E) rfkill(E) ghash_clmulni_intel(E) jitterentropy_rng(E) aesni_intel(E) crypto_simd(E) cryptd(E) glue_helper(E) efi_pstore(E) drbg(E) ccp(E) ansi_cprng(E) ecdh_generic(E) ecc(E) acpi_cpufreq(E) nft_counter(E) efivarfs(E) crc32c_intel(E)
[ 212.344452] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W E 5.10.6-BET #1
[ 212.344453] Hardware name: System manufacturer System Product Name/PRIME X570-P, BIOS 1405 11/19/2019
[ 212.344454] RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x139/0x140
[ 212.344456] Code: 80 3d f9 f0 72 01 00 0f 85 52 ff ff ff 49 8b 54 24 e0 49 8b 74 24 e8 48 c7 c7 88 5f 44 a6 c6 05 db f0 72 01 01 e8 ad 24 87 00 <0f> 0b e9 2e ff ff ff 41 55 49 89 f5 41 54 55 48 89 fd 53 48 83 ec
[ 212.344456] RSP: 0018:ffffa9d98033cf20 EFLAGS: 00010282
[ 212.344457] RAX: 0000000000000000 RBX: 7fffffffffffffe3 RCX: 0000000000000027
[ 212.344457] RDX: 0000000000000027 RSI: ffff93cd7eb92e80 RDI: ffff93cd7eb92e88
[ 212.344458] RBP: 0000360c00c0c328 R08: 0000000000000000 R09: c0000000ffffdfff
[ 212.344458] R10: ffffa9d98033cd40 R11: ffffa9d98033cd38 R12: ffff93c68fb584a0
[ 212.344459] R13: ffffffffa6765f10 R14: 0000000000000202 R15: ffffffffa6606100
[ 212.344460] FS: 0000000000000000(0000) GS:ffff93cd7eb80000(0000) knlGS:0000000000000000
[ 212.344460] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 212.344461] CR2: 000055585e41cc28 CR3: 0000000107fe8000 CR4: 0000000000350ee0
[ 212.344461] Call Trace:
[ 212.344462] <IRQ>
[ 212.344465] rcu_core+0x196/0x420
[ 212.344468] __do_softirq+0xc9/0x214
[ 212.344469] asm_call_irq_on_stack+0x12/0x20
[ 212.344470] </IRQ>
[ 212.344471] do_softirq_own_stack+0x31/0x40
[ 212.344473] irq_exit_rcu+0x9a/0xa0
[ 212.344474] sysvec_apic_timer_interrupt+0x2c/0x80
[ 212.344475] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 212.344477] RIP: 0010:cpuidle_enter_state+0xd2/0x2e0
[ 212.344478] Code: e8 93 22 6a ff 31 ff 49 89 c5 e8 29 2c 6a ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 c4 01 00 00 31 ff e8 a2 d8 6f ff fb 45 85 f6 <0f> 88 c9 00 00 00 49 63 ce be 68 00 00 00 4c 2b 2c 24 48 89 ca 48
[ 212.344478] RSP: 0018:ffffa9d980167e80 EFLAGS: 00000202
[ 212.344479] RAX: ffff93cd7eb9ce00 RBX: 0000000000000001 RCX: 000000000000001f
[ 212.344479] RDX: 0000003170b6b110 RSI: 00000000239f541c RDI: 0000000000000000
[ 212.344480] RBP: ffff93c68ea7e000 R08: 0000000000000002 R09: 000000000001c600
[ 212.344480] R10: 000000c3ae2c0e44 R11: ffff93cd7eb9be24 R12: ffffffffa66d38e0
[ 212.344481] R13: 0000003170b6b110 R14: 0000000000000001 R15: 0000000000000000
[ 212.344483] cpuidle_enter+0x30/0x50
[ 212.344484] do_idle+0x24f/0x290
[ 212.344486] cpu_startup_entry+0x1b/0x20
[ 212.344487] start_secondary+0x10b/0x150
[ 212.344488] secondary_startup_64_no_verify+0xb0/0xbb
[ 212.344489] ---[ end trace 3cff32dbce8f0fd7 ]---
After these errors are thrown, the device becomes inaccessible and unmounting its filesystem generates additional errors:
Code:
[ 756.097787] Buffer I/O error on dev dm-0, logical block 0, lost sync page write
[ 756.097792] EXT4-fs (dm-0): I/O error while writing superblock
These errors occur with both the 5.9.15 kernel and the 5.10.6 kernel.
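For anyone hitting something similar: when reporting an NVMe failure like this, the controller's PCI vendor:device ID is usually the most useful detail to capture, since driver quirks are keyed on it. Below is a minimal sketch of pulling that ID out of `lspci -nn` output; the sample line and device names are illustrative assumptions, not output from this machine.

```shell
# On a real system, feed this from:  lspci -nn | grep -i 'non-volatile'
# The bracketed xxxx:xxxx pair at the end of the line is the vendor:device ID.
sample='01:00.0 Non-Volatile memory controller [0108]: SPCC NVMe SSD [1d97:2263]'

# Grab the last [xxxx:xxxx] pair and strip the brackets. The class code
# [0108] has no colon, so the pattern skips it.
id=$(printf '%s\n' "$sample" | grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tail -n1 | tr -d '[]')
echo "$id"
```

That vendor:device pair is exactly what a quirk entry in the kernel's NVMe driver matches against.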
01-10-2021, 01:46 PM | #2
Senior Member
Registered: Aug 2016
Posts: 3,345
Looking at the manual for that board, it seems the NVMe slots support either PCIe 3.0 x4 or PCIe 4.0 x4 depending on the installed processor.
I have a B550M board, and its manual explicitly states that the M.2_2 socket shares a bus with SATA ports 5 and 6, so only one of the two can be used at a time (either M.2_2 or SATA 5 and 6, but not both). The drive that has the problem is in your M.2_2 slot, and I wonder if your board behaves similarly even though its manual does not say so. It may be worth changing whichever SATA ports you are using to see whether the two are interfering.
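A quick way to see which disks actually ride the SATA bus (and so which ports you might need to move) is lsblk's transport column. The sketch below runs the filtering against hard-coded sample output so it works anywhere; the device names are made up, and on a real system you would feed it from the live command instead.

```shell
# On a real system:  lsblk -d -o NAME,TRAN
# Sample output, hard-coded so the filtering logic is demonstrable anywhere:
sample='sda sata
sdb sata
nvme0n1 nvme
nvme1n1 nvme'

# Keep only the disks attached over SATA.
sata_disks=$(printf '%s\n' "$sample" | awk '$2 == "sata" { print $1 }')
echo "$sata_disks"
```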
01-10-2021, 03:11 PM | #3
Member
Registered: Oct 2003
Distribution: Debian GNU/Linux 11 (amd64) w/kernel 6.0.15
Posts: 299
Original Poster
Quote:
Originally Posted by computersavvy
Looking at the manual for that board, it seems the NVMe slots support either PCIe 3.0 x4 or PCIe 4.0 x4 depending on the installed processor.
I have a B550M board, and its manual explicitly states that the M.2_2 socket shares a bus with SATA ports 5 and 6, so only one of the two can be used at a time (either M.2_2 or SATA 5 and 6, but not both). The drive that has the problem is in your M.2_2 slot, and I wonder if your board behaves similarly even though its manual does not say so. It may be worth changing whichever SATA ports you are using to see whether the two are interfering.
You've made an excellent point. I checked my motherboard manual and my hardware configuration, and it appears that the M.2_2 socket on the X570-P has its own dedicated connection to the X570 chipset and does not share a bus with the SATA ports. I don't have anything plugged into the SATA5G or SATA6G ports anyway, so I don't believe this issue is caused by an underlying hardware conflict.
01-17-2021, 10:31 AM | #4
Member
Registered: Oct 2003
Distribution: Debian GNU/Linux 11 (amd64) w/kernel 6.0.15
Posts: 299
Original Poster
I updated my motherboard BIOS to version 3001 and adjusted some settings to make sure the M.2_2 slot was properly configured for the NVMe drive, but I'm still getting the same kernel I/O errors. Other than the drive being defective in some way, I'm not sure where else to look to find out why this is happening.
01-17-2021, 12:08 PM | #5
Senior Member
Registered: Aug 2016
Posts: 3,345
Quote:
Originally Posted by TheOneKEA
I updated my motherboard BIOS to version 3001 and adjusted some settings to make sure the M.2_2 slot was properly configured for the NVMe drive, but I'm still getting the same kernel I/O errors. Other than the drive being defective in some way, I'm not sure where else to look to find out why this is happening.
Does the BIOS see the drive properly?
Someone recently posted about a second drive that was acting flaky and found a (hidden) setting in the advanced BIOS menus that fixed the issue. I would look there, and read the BIOS section of your manual carefully, in case you have a similar issue.
01-17-2021, 12:42 PM | #6
Member
Registered: Oct 2003
Distribution: Debian GNU/Linux 11 (amd64) w/kernel 6.0.15
Posts: 299
Original Poster
Quote:
Originally Posted by computersavvy
Does the BIOS see the drive properly?
Someone recently posted about a second drive that was acting flaky and found a (hidden) setting in the advanced BIOS menus that fixed the issue. I would look there, and read the BIOS section of your manual carefully, in case you have a similar issue.
Yes, the BIOS does see the drive properly. It has always been visible in the BIOS, even before I did the BIOS updates.
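Worth noting for anyone following along: BIOS visibility and kernel visibility are separate questions. On the Linux side, the equivalent check is whether a controller entry exists under /sys/class/nvme. The sketch below wraps that test in a tiny helper and exercises it against a temporary mock tree so the logic runs anywhere; the controller name nvme1 is an assumption carried over from the dmesg output above.

```shell
# On a real system:  ls /sys/class/nvme   (or, if nvme-cli is installed,
# "nvme list" for a friendlier view). check_ctrl just tests for a
# controller entry under a given root directory.
check_ctrl() {
    if [ -d "$1/$2" ]; then echo present; else echo missing; fi
}

# Exercise the logic against a mock tree instead of the live /sys.
mock=$(mktemp -d)
mkdir -p "$mock/nvme1"
first=$(check_ctrl "$mock" nvme1)
second=$(check_ctrl "$mock" nvme9)
echo "$first $second"
rm -rf "$mock"
```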
01-23-2021, 06:50 PM | #7
Member
Registered: Oct 2003
Distribution: Debian GNU/Linux 11 (amd64) w/kernel 6.0.15
Posts: 299
Original Poster
After working with the NVMe maintainers, I was able to fix my drive by applying the following patch to my kernel source and recompiling:
Code:
diff -urN pci.c.orig pci.c
--- pci.c.orig 2021-01-20 21:24:32.124077095 -0500
+++ pci.c 2021-01-23 13:06:08.620757149 -0500
@@ -3219,6 +3219,8 @@
.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
{ PCI_DEVICE(0x15b7, 0x2001), /* Sandisk Skyhawk */
.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
+ { PCI_DEVICE(0x1d97, 0x2263), /* SPCC */
+ .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
.driver_data = NVME_QUIRK_SINGLE_VECTOR },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },
I haven't yet heard back from the NVMe maintainers on whether a patch like this will be queued for inclusion in the mainline kernel.
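Once the patched kernel is booted, one way to confirm the quirk took effect should be that the block layer no longer advertises Write Zeroes for the device: the sysfs attribute write_zeroes_max_bytes reads 0 when the command is disabled. The sketch below demonstrates the check against a mock sysfs tree so it runs anywhere; the device name nvme1n1 and paths are assumptions.

```shell
# Mock the sysfs attribute; on a real system read it directly from
#   /sys/block/nvme1n1/queue/write_zeroes_max_bytes
mock=$(mktemp -d)
mkdir -p "$mock/nvme1n1/queue"
echo 0 > "$mock/nvme1n1/queue/write_zeroes_max_bytes"

wz=$(cat "$mock/nvme1n1/queue/write_zeroes_max_bytes")
if [ "$wz" -eq 0 ]; then
    result="write-zeroes disabled (quirk active)"
else
    result="write-zeroes advertised (max $wz bytes)"
fi
echo "$result"
rm -rf "$mock"
```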
01-25-2021, 02:51 PM | #8
Moderator
Registered: Mar 2008
Posts: 22,361
Thanks for the update and solution.