06-05-2011, 07:36 PM   #1
Red Squirrel (Senior Member)
Linux system randomly locks up and slows to a crawl


I noticed that sometimes my Linux server will randomly start to lag really badly, to the point where even an HTTP request takes forever.

It is an Intel Core 2 Quad with 8 GB of RAM running FC9.

This is my "Everything" server, so it does file, email, DNS, web (for local stuff only), VMs, and so on.

There are about 5-6 VMs running on it at any given time. I manage it through a VNC session and have some SSH consoles within that session. This way if I reboot my PC I don't lose all my SSH consoles. If I need to SSH to any server I do it from there. I treat it kinda like a terminal server to some extent.

When this slowdown happens, top is not really useful because I also run F@H (Folding@home), so that will always be at the top, but it's low priority. The VMs are also always near the top. This does not change whether it's slow or not, so when it's slow, I have nothing to go by for troubleshooting. The load does seem to skyrocket, though. Right now it's doing the slowdown thing and the load is at 8.09. Normally it's around 3, which IMO is good since it's under 4. I have 4 cores, so anything more than 4 means it's queuing. At least that's how I understand it.
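A quick way to sanity-check that is to compare the load averages against the number of cores (stock commands, nothing extra needed):

Code:
# load averages (1, 5, 15 minutes) and the number of logical CPUs
uptime
grep -c ^processor /proc/cpuinfo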

So how do I go about troubleshooting this?

This is the output of top:

Code:
[root@borg ~]# top

top - 20:35:25 up 2 days, 23:19,  1 user,  load average: 8.51, 7.83, 7.68
Tasks: 243 total,   2 running, 241 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.8%us, 71.0%sy,  9.6%ni,  3.7%id, 10.5%wa,  0.2%hi,  1.2%si,  0.0%st
Mem:   7926980k total,  7883344k used,    43636k free,     6852k buffers
Swap: 164095932k total,   413968k used, 163681964k free,  2365872k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                         
13231 root      39  19  315m  91m 1560 S 171.3  1.2   4612:03 FahCore_a3.exe                                                                                 
 3033 vmuser    20   0 3301m 2.7g 2.7g S 91.1 35.9   3245:50 VirtualBox                                                                                      
 3051 vmuser    20   0 1289m 756m 730m S 50.6  9.8 266:36.53 VirtualBox                                                                                      
 3014 vmuser    20   0 1165m 640m 627m S 11.5  8.3 639:57.66 VirtualBox                                                                                      
25856 p2puser   20   0  167m 1812 1272 D  5.6  0.0   6:34.33 smbd                                                                                            
 3070 vmuser    20   0 2445m 1.9g 1.9g S  5.3 24.8 351:32.76 VirtualBox                                                                                      
 1131 root      15  -5     0    0    0 S  2.0  0.0  22:25.85 md0_raid5                                                                                       
 2673 vmuser    20   0  103m  37m 4304 S  1.7  0.5   6:15.23 Xvnc                                                                                            
 3088 vmuser    20   0  702m 176m 148m S  1.7  2.3  80:47.02 VirtualBox                                                                                      
 3106 vmuser    20   0  859m 337m 313m S  1.7  4.4 162:12.41 VirtualBox                                                                                      
 2862 vmuser    20   0  317m  11m 5004 R  1.1  0.2   2:13.33 gnome-terminal                                                                                  
   36 root      15  -5     0    0    0 S  0.6  0.0   2:14.40 kswapd0                                                                                         
 1222 root      15  -5     0    0    0 S  0.6  0.0   8:11.46 kjournald                                                                                       
 1449 root      15  -5     0    0    0 S  0.3  0.0   0:40.06 kondemand/2                                                                                     
 2716 vmuser    20   0  324m 3032 1780 S  0.3  0.0   0:00.44 gnome-settings-                                                                                 
15039 root      20   0  198m 4240 1532 S  0.3  0.1   0:10.30 spamd                                                                                           
15041 email_re  20   0  205m 7100 1744 D  0.3  0.1   0:22.76 spamd                                                                                           
26157 root      20   0 14840 1244  852 R  0.3  0.0   0:00.11 top                                                                                             
    1 root      20   0  4056  324  324 S  0.0  0.0   0:00.52 init                                                                                            
    2 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 kthreadd                                                                                        
    3 root      RT  -5     0    0    0 S  0.0  0.0   0:00.14 migration/0                                                                                     
    4 root      15  -5     0    0    0 S  0.0  0.0   0:00.20 ksoftirqd/0                                                                                     
    5 root      RT  -5     0    0    0 S  0.0  0.0   0:00.47 watchdog/0                                                                                      
    6 root      RT  -5     0    0    0 S  0.0  0.0   0:00.14 migration/1                                                                                     
    7 root      15  -5     0    0    0 S  0.0  0.0   0:00.14 ksoftirqd/1                                                                                     
    8 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/1                                                                                      
    9 root      RT  -5     0    0    0 S  0.0  0.0   0:00.11 migration/2                                                                                     
   10 root      15  -5     0    0    0 S  0.0  0.0   0:00.20 ksoftirqd/2                                                                                     
   11 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/2                                                                                      
   12 root      RT  -5     0    0    0 S  0.0  0.0   0:00.11 migration/3                                                                                     
   13 root      15  -5     0    0    0 S  0.0  0.0   0:00.11 ksoftirqd/3                                                                                     
   14 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/3                                                                                      
   15 root      15  -5     0    0    0 S  0.0  0.0   0:02.69 events/0                                                                                        
   16 root      15  -5     0    0    0 S  0.0  0.0   0:04.73 events/1                                                                                        
   17 root      15  -5     0    0    0 S  0.0  0.0   0:01.22 events/2                                                                                        
   18 root      15  -5     0    0    0 S  0.0  0.0   0:01.26 events/3                                                                                        
   19 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 khelper                                                                                         
   20 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 kintegrityd/0                                                                                   
   21 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 kintegrityd/1                                                                                   
   22 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 kintegrityd/2                                                                                   
   23 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 kintegrityd/3                                                                                   
   24 root      15  -5     0    0    0 S  0.0  0.0   0:06.44 kblockd/0                                                                                       
   25 root      15  -5     0    0    0 S  0.0  0.0   0:09.00 kblockd/1                                                                                       
   26 root      15  -5     0    0    0 S  0.0  0.0   0:08.09 kblockd/2                                                                                       
   27 root      15  -5     0    0    0 S  0.0  0.0   0:07.26 kblockd/3                                                                                       
   28 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 kacpid                                                                                          
   29 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 kacpi_notify                                                                                    
   30 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue                                                                                          
   31 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 ksuspend_usbd                                                                                   
   32 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 khubd                                                                                           
   33 root      15  -5     0    0    0 S  0.0  0.0   0:00.00 kseriod
 
06-06-2011, 08:26 AM   #2
neonsignal (Senior Member)
The fact that half a gig of swap has been used indicates that at some point the system has run out of memory; that might be noticeable if it was a VM that had been swapped out. You would get a high load average during the time it was thrashing the drive. The iostat utility can be useful for monitoring the drive access.
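As a rough sketch (assuming the sysstat package is installed for iostat), running something like this during a slowdown would show whether the box is swapping or the drives are saturated:

Code:
# si/so columns show pages swapped in/out per second
vmstat 5
# extended per-device statistics (await, %util), refreshed every 5 seconds
iostat -x 5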

I also observe that one of the VMs has used a surprisingly high amount of CPU time (i.e., it is running about 75% of the time). On a quad-core processor that isn't going to cause much variation in the responsiveness, but it seems excessive.

You don't say how long the slowdown happens for, or what operating systems are running in the VMs.

Last edited by neonsignal; 06-06-2011 at 08:28 AM.
 
06-06-2011, 06:10 PM   #3
Red Squirrel (Original Poster)
The slowdown usually happens for maybe half an hour. Usually out of despair I start rebooting stuff and eventually it gets responsive again.

The VMs are running various OSes: Win2k3, WinXP, CentOS, etc.

Should swap normally be at zero? Maybe RAM is my issue, which is kinda what I suspect since I'm running borderline.
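A quick way to see the current memory and swap usage at any given moment:

Code:
# memory and swap usage in MB
free -m
# which swap devices are active and how much of each is in use
swapon -s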
 
06-06-2011, 06:56 PM   #4
neonsignal (Senior Member)
Normally swap would be at zero. It isn't critical on a desktop (since the application load can vary greatly), but things should be more predictable on a server.

How much base memory have the VMs each been allocated?

What is running in the VM (process 3033 in the above example) that would be using so much CPU (none of the applications you mentioned were CPU intensive)? You don't by any chance have a Win95/98/ME running in one of the VMs?
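If you're not sure what each VM has been given, VirtualBox can report it from the command line; a minimal sketch (the VM name is only a placeholder):

Code:
# list the registered VMs, then show the base memory of one of them
VBoxManage list vms
VBoxManage showvminfo "MyVM" | grep -i "memory size"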

Last edited by neonsignal; 06-06-2011 at 06:59 PM.
 
06-06-2011, 07:08 PM   #5
Red Squirrel (Original Poster)
Well, two of the VMs are environments for a game server (one is dev, the other is test), and the game is somewhat intensive. So do you think it may just be VM activity that causes it?
 
06-06-2011, 07:42 PM   #6
neonsignal (Senior Member)
Quote:
Originally Posted by Red Squirrel
So do you think it may just be VM activity that causes it?
No, but if several of the VMs are being used intensively and you are using all of the memory, then you can expect thrashing of the swap space. One step would be to limit the base memory allocated to each of the VMs (if you haven't already) so that the total is less than the 8 GB; that way, if a particular VM is using excessive memory (or has a memory leak in an application), it won't impact the other VMs.
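Since these appear to be VirtualBox VMs (going by the process names in top), the base memory can be lowered per VM; a sketch, assuming the VM is shut down first and that the name and size here are only examples:

Code:
# set the base memory of one (powered-off) VM to 1024 MB
VBoxManage modifyvm "MyVM" --memory 1024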

Last edited by neonsignal; 06-06-2011 at 07:45 PM.
 
06-06-2011, 07:52 PM   #7
Red Squirrel (Original Poster)
Hmm, I'll give that a try. I also have to account for the host itself.

I've been wanting to upgrade to a server with more RAM, but I just don't have the cash. This one is maxed out.
 
06-06-2011, 08:04 PM   #8
Red Squirrel (Original Poster)
I'm starting to wonder if it's disk I/O related. I noticed 3 mdadm errors where it failed a drive and then just rebuilt it. This happened maybe 10 days ago.
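To see whether the array is still complaining, something like this should show the current state and any failed members:

Code:
# overall software RAID status
cat /proc/mdstat
# per-array detail, including failed/spare devices
mdadm --detail /dev/md0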

Also, here's iostat output; it's doing it again, and I have a backup job running.

Code:
[root@borg ~]# iostat -x
Linux 2.6.27.25-78.2.56.fc9.x86_64 (borg.loc) 	06/06/2011

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.41   39.37   58.87    0.09    0.00    0.26

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               2.88     1.55    2.81    1.64   138.95    25.53    37.03     0.19   42.11   2.12   0.94
sda1              0.00     0.00    0.00    0.00     0.01     0.00    19.44     0.00    5.79   5.68   0.00
sda2              2.53     0.66    1.41    0.89    31.51    12.42    19.11     0.17   73.97   2.94   0.68
sda3              0.35     0.89    1.40    0.75   107.43    13.10    56.25     0.02    7.93   2.42   0.52
sdb              19.39   125.86   16.77   19.26  1024.74  1180.20    61.20     2.06   56.22   1.87   6.72
sdc              19.15   126.65   16.54   19.45  1020.75  1188.98    61.38     2.03   55.17   1.85   6.65
sdd              19.48   128.71   16.67   19.49  1023.97  1206.29    61.67     2.19   59.23   1.87   6.75
sde              20.84   132.42   17.11   17.45  1038.92  1221.75    65.42     2.45   69.38   2.13   7.37
sdf              19.58   133.26   16.79   17.48  1025.73  1230.03    65.82     2.54   72.19   2.22   7.62
md0               0.00     0.00   61.27  554.03  4152.44  4432.27    13.95     0.00    0.00   0.00   0.00
sdh               0.46     7.48    1.09    0.14   268.20    60.95   267.73     0.68  555.47   5.51   0.68
sdh1              0.46     7.48    1.09    0.14   268.20    60.95   267.74     0.68  555.48   5.51   0.68

[root@borg ~]#
How do those numbers look? md0 is my main array where everything is stored.
 
06-06-2011, 08:32 PM   #9
neonsignal (Senior Member)
I'm guessing from the figures that you are running RAID5 across the 5 drives. If you were actually getting disk errors, you would expect the read/write rates to drop off. Anyway, you can use smartctl (from smartmontools) to check the SMART status of the drives. Indeed, the current iostat shows the disks running heavily. The %system usage is also high.
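A minimal check, assuming smartmontools is installed (repeat for each member of the array, sdb through sdf):

Code:
# quick overall health verdict for one drive
smartctl -H /dev/sdb
# full SMART attributes and error log
smartctl -a /dev/sdb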

Last edited by neonsignal; 06-06-2011 at 08:57 PM.
 
  

