LinuxQuestions.org
08-11-2020, 10:07 PM   #1
huyuhui
Member
 
Registered: Sep 2004
Posts: 31

Rep: Reputation: 0
NMI watchdog: BUG: soft lockup


OS: SLES 15.1, which is a KVM guest machine
Kernel: 4.12.14-197.51-default
Issue: NMI watchdog: BUG: soft lockup

I tried the following, but the issue is still there.
Code:
echo 20 > /proc/sys/kernel/watchdog_thresh
Please advise. Thanks

James
 
08-11-2020, 11:33 PM   #2
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,340

Rep: Reputation: Disabled
You've provided absolutely no useful information regarding the circumstances under which this problem occurs, so I don't see how anybody could provide a suggestion, much less an answer.

Now, I happened to see your near-identical post in the SuSE forum where you actually did provide some information: That you're experiencing the watchdog bug message while transferring large-ish files via SFTP. Also, you provided a link to the page from where you'd gotten the advice to increase the watchdog threshold.

Here's the deal: You're seeing this issue because your VM is starved for resources. While transferring the file, the kernel is stuck somewhere for longer than the watchdog timeout allows, so the watchdog considers it a soft lockup and kicks in.
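
As a rough illustration of the timeout mentioned above: according to the kernel's sysctl documentation, the soft lockup threshold is twice the value of kernel.watchdog_thresh, which defaults to 10 seconds. The sketch below only does that arithmetic; the function name softlockup_threshold is made up for this example.

```shell
#!/bin/sh
# Soft lockup threshold in seconds for a given watchdog_thresh value.
# The kernel documents the threshold as (2 * watchdog_thresh).
softlockup_threshold() {
    # $1: watchdog_thresh in seconds (0 means the detectors are disabled)
    echo $(( $1 * 2 ))
}

thresh_file=/proc/sys/kernel/watchdog_thresh
if [ -r "$thresh_file" ]; then
    thresh=$(cat "$thresh_file")
    echo "watchdog_thresh=${thresh}s, soft lockup reported after about $(softlockup_threshold "$thresh")s"
fi
```

With the value of 20 written in post #1, a task would have to hog a CPU for roughly 40 seconds before the message appears.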

How many virtual CPUs does this VM have? If the answer is "one", you should definitely try adding one more. But perhaps more importantly, what does top (or htop) report with regard to memory usage while the file transfer is occurring? If the VM starts swapping, that could explain why it becomes unresponsive to the point where the watchdog is triggered. If neither CPU nor memory usage seems abnormal, you should check the performance of the disk subsystem holding the filesystem onto which you're writing the file.
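
A minimal sketch of those checks, meant to be run inside the guest while the transfer is in progress. The helper name swap_used_kb is hypothetical; it only subtracts SwapFree from SwapTotal as reported in /proc/meminfo.

```shell
#!/bin/sh
# Swap in use (kB), computed from meminfo-formatted text on stdin.
swap_used_kb() {
    awk '/^SwapTotal:/ {t = $2} /^SwapFree:/ {f = $2} END {print t - f}'
}

# Number of CPUs available to this guest (nproc is part of GNU coreutils).
if command -v nproc >/dev/null 2>&1; then
    echo "vCPUs online: $(nproc)"
fi

if [ -r /proc/meminfo ]; then
    echo "swap used: $(swap_used_kb < /proc/meminfo) kB"
fi

# For a running view during the transfer, sample once per second:
#   vmstat 1
# and watch si/so (swap in/out), wa (I/O wait) and st (time stolen by the host).
```

Non-zero si/so values point at memory pressure, a high wa value at slow storage, and a high st value at the host not scheduling the guest's vCPU.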

Of course, you could try just increasing the watchdog timeout value to 30 or 40 seconds or even longer until the watchdog is no longer triggered, but that would not address the underlying issue.
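
For reference, a value written to /proc/sys/kernel/watchdog_thresh with echo lasts only until the next reboot. The sketch below shows one way a longer timeout could be set and kept; the helper watchdog_conf_line and the file name 99-watchdog.conf are assumptions for illustration, not something from this thread.

```shell
#!/bin/sh
# Print a sysctl.d-style line for the given watchdog threshold (seconds).
watchdog_conf_line() {
    printf 'kernel.watchdog_thresh = %s\n' "$1"
}

watchdog_conf_line 30

# To apply as root (the file name is a hypothetical choice):
#   watchdog_conf_line 30 > /etc/sysctl.d/99-watchdog.conf
#   sysctl --system
# Or, for the running kernel only:
#   sysctl -w kernel.watchdog_thresh=30
```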
 
08-12-2020, 05:30 AM   #3
huyuhui
Member
 
Registered: Sep 2004
Posts: 31

Original Poster
Rep: Reputation: 0
Thank you for your reply and comments.
There are some other VMs on the host as well.

Below is the output of the top command on the host.
Code:
top - 18:20:54 up 569 days,  7:14,  2 users,  load average: 1.13, 1.05, 1.07
Tasks: 979 total,   1 running, 978 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.1%us,  0.3%sy,  0.0%ni, 98.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2068091M total,  1008766M used,  1059324M free,      195M buffers
Swap:     2053M total,        0M used,     2053M free,   737135M cached
I temporarily set the parameters below to zero to see whether that helps.
Code:
/proc/sys/kernel/tainted
/proc/sys/kernel/watchdog
/proc/sys/kernel/nmi_watchdog
/proc/sys/kernel/soft_watchdog
/proc/sys/kernel/softlockup_panic
/proc/sys/kernel/unknown_nmi_panic
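
The current values of the watchdog-related parameters listed above can be checked with a small loop. This is only a sketch: show_params is a hypothetical helper, and some of these files may not exist on every kernel build, which the loop reports rather than treating as an error.

```shell
#!/bin/sh
# Print "name = value" for each named file under the given directory.
show_params() {
    dir=$1
    shift
    for name in "$@"; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = (not present)\n' "$name"
        fi
    done
}

show_params /proc/sys/kernel watchdog nmi_watchdog soft_watchdog \
    watchdog_thresh softlockup_panic unknown_nmi_panic
```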
Could you please elaborate on the performance of the disk subsystem on which the VM guest resides? I don't understand how it could affect this issue.
 
  

