LinuxQuestions.org
Visit the LQ Articles and Editorials section
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Server
User Name
Password
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

Notices

Reply
 
Search this Thread
Old 12-10-2010, 04:13 AM   #1
AlienWolf
LQ Newbie
 
Registered: Dec 2010
Distribution: Ubuntu, RHEL, CentOS
Posts: 7

Rep: Reputation: 0
Question 100% Swap usage with loads of free memory


Hi everybody,

I wonder if anybody could help shed light on an obscure issue I faced on one of my servers this morning that very nearly caused it to crash.

Nagios had alerted me that the server had a very high load average exceeding the critical level (17+), when logging onto the server I found that all 4GB of the swap was in use despite the fact that there was 15GB+ of free memory (and that's not even including memory from cache and buffers!)

Because it seems all heavily used pages were being stored in swap, the I/O wait on the server became very high, and 4 kswapd daemons were taking up nearly 100% available CPU.

This did coincide with an error reported by Bacula during a backup job while changing to a bad tape... From /var/log/bacula.log:

Code:
10-Dec 02:11 bacula-sd JobId 1898: End of medium on Volume "4097" Bytes=434,170,000,000 Blocks=217,084 at 10-Dec-2010 02:11.
10-Dec 02:11 bacula-sd JobId 1898: 3307 Issuing autochanger "unload slot 4097, drive 0" command.
10-Dec 02:12 bacula-sd JobId 1898: 3301 Issuing autochanger "loaded? drive 0" command.
10-Dec 02:12 bacula-sd JobId 1898: 3302 Autochanger "loaded? drive 0", result: nothing loaded.
10-Dec 02:12 bacula-sd JobId 1898: 3304 Issuing autochanger "load slot 4096, drive 0" command.
10-Dec 02:13 bacula-sd JobId 1898: 3305 Autochanger "load slot 4096, drive 0", status is OK.
10-Dec 02:13 bacula-sd JobId 1898: Volume "4096" previously written, moving to end of data.
10-Dec 03:51 bacula-sd JobId 1898: Error: Unable to position to end of data on device "Tape-1" (/dev/IBMtape0n): ERR=dev.c:1384 read e
rror on "Tape-1" (/dev/IBMtape0n). ERR=Input/output error.

10-Dec 03:51 bacula-sd JobId 1898: Marking Volume "4096" in Error in Catalog.
10-Dec 03:51 bacula-sd JobId 1898: 3307 Issuing autochanger "unload slot 4096, drive 0" command.
10-Dec 03:58 bacula-sd JobId 1898: 3301 Issuing autochanger "loaded? drive 0" command.
10-Dec 03:58 bacula-sd JobId 1898: 3302 Autochanger "loaded? drive 0", result: nothing loaded.
10-Dec 03:58 bacula-sd JobId 1898: 3304 Issuing autochanger "load slot 4098, drive 0" command.
10-Dec 03:58 bacula-sd JobId 1898: 3305 Autochanger "load slot 4098, drive 0", status is OK.
10-Dec 03:59 bacula-sd JobId 1898: Wrote label to prelabeled Volume "4098" on device "Tape-1" (/dev/IBMtape0n)
10-Dec 03:59 bacula-sd JobId 1898: New volume "4098" mounted on device "Tape-1" (/dev/IBMtape0n) at 10-Dec-2010 03:59.
At the same time, these messages starting occuring in /var/log/messages:

Code:
Dec 10 03:51:47 07 kernel: Mem-info:
Dec 10 03:51:47 07 kernel: Node 0 DMA per-cpu:
Dec 10 03:51:47 07 kernel: cpu 0 hot: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 0 cold: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 1 hot: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 1 cold: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 2 hot: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 2 cold: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 3 hot: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 3 cold: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 4 hot: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 4 cold: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 5 hot: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 5 cold: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 6 hot: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 6 cold: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 7 hot: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: cpu 7 cold: high 0, batch 1 used:0
Dec 10 03:51:47 07 kernel: Node 0 DMA32 per-cpu:
Dec 10 03:51:47 07 kernel: cpu 0 hot: high 186, batch 31 used:162
Dec 10 03:51:47 07 kernel: cpu 0 cold: high 62, batch 15 used:48
Dec 10 03:51:47 07 kernel: cpu 1 hot: high 186, batch 31 used:0
Dec 10 03:51:47 07 kernel: cpu 1 cold: high 62, batch 15 used:0
Dec 10 03:51:47 07 kernel: cpu 2 hot: high 186, batch 31 used:0
Dec 10 03:51:47 07 kernel: cpu 2 cold: high 62, batch 15 used:0
Dec 10 03:51:47 07 kernel: cpu 3 hot: high 186, batch 31 used:18
Dec 10 03:51:47 07 kernel: cpu 3 cold: high 62, batch 15 used:0
Dec 10 03:51:47 07 kernel: cpu 4 hot: high 186, batch 31 used:159
Dec 10 03:51:47 07 kernel: cpu 4 cold: high 62, batch 15 used:56
...
Dec 10 03:51:47 07 kernel: Node 3 HighMem per-cpu: empty
Dec 10 03:51:47 07 kernel: Free pages:      732052kB (0kB HighMem)
Dec 10 03:51:47 07 kernel: Active:4232128 inactive:3071288 dirty:158210 writeback:0 unstable:0 free:183320 slab:256840 mapped-file:289545 mapped-anon:3805487 pagetables:13063
Dec 10 03:51:47 07 kernel: Node 0 DMA free:10796kB min:4kB low:4kB high:4kB active:0kB inactive:0kB present:10356kB pages_scanned:0 all_unreclaimable? yes
Dec 10 03:51:47 07 kernel: lowmem_reserve[]: 0 3512 9067 9067
Dec 10 03:51:47 07 kernel: Node 0 DMA32 free:213332kB min:2500kB low:3124kB high:3748kB active:1794108kB inactive:1463220kB present:3596296kB pages_scanned:64 all_unreclaimable? no
Dec 10 03:51:47 07 kernel: lowmem_reserve[]: 0 0 5555 5555
Dec 10 03:51:47 07 kernel: Node 0 Normal free:41028kB min:3952kB low:4940kB high:5928kB active:3409444kB inactive:1471120kB present:5688320kB pages_scanned:0 all_unreclaimable? no
Dec 10 03:51:47 07 kernel: lowmem_reserve[]: 0 0 0 0
Dec 10 03:51:47 07 kernel: Node 0 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Dec 10 03:51:47 07 kernel: lowmem_reserve[]: 0 0 0 0
Dec 10 03:51:47 07 kernel: Node 1 DMA free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
...
Well to cut a long story short, I fixed the problem by disabling the swap partition with 'swapoff'. After about 30 mins all the swap was freed and the server went back to normal. I don't dare reactivate the swap partition and unfortunately as this is a live server which currently has no fail over, I can't reboot either

Server Spec:
4 * Dual-Core AMD Opteron(tm) Processor 8214
32GB DDR2 ECC RAM
RHEL 5.5, 2.6.18-194.11.3.el5 SMP x86_64
Running many KVM VMs (All CentOS x64) and kksmd is used.
bacula-dir Version: 5.0.0
IBM Tape Drive using lin_tape module version 1.34.0 according to modinfo

And before anybody asks
# sysctl vm.swappiness
vm.swappiness = 10

Any insight would be greatly appreciated.
 
Old 12-10-2010, 07:07 AM   #2
AlucardZero
Senior Member
 
Registered: May 2006
Location: USA
Distribution: Debian
Posts: 4,647

Rep: Reputation: 524Reputation: 524Reputation: 524Reputation: 524Reputation: 524Reputation: 524
The messages lines look like part of a kernel OOPS or BUG. Can you post their start? A snapshot of "free" and "top' would also have helped. For example "top" would have told us if your load was in IOWAIT (which it very well might have been due to waiting for the bad tape).
 
Old 12-13-2010, 03:32 AM   #3
AlienWolf
LQ Newbie
 
Registered: Dec 2010
Distribution: Ubuntu, RHEL, CentOS
Posts: 7

Original Poster
Rep: Reputation: 0
Attached is the full log from /var/log/messages at the time the problem occurred until it was fixed.

I think the first few stack traces in this is just because the bacula process is hung as it is waiting for the tape to wind to the correct location, I don't think because it has actually crashed. There are then a few page allocation failures from the lin_tape module, maybe this could be the culprit?

Unfortunately I didn't think to save the outputs for top or free. I do remember that the iowait% was very high, maybe around 80%. I don't think it would have been caused by bacula waiting for the tape, an iostat would have confirmed it... but I think the iowait was due to memory being paged from swap to ram. Also I do remember that there were 4 processes (kswapd0-3) which were taking approx 99% CPU each. The free output also showed that swap memory was 100% used and there was at least 15GB of ram free excluding buffers and cache.
Attached Files
File Type: txt messages.txt (44.5 KB, 5 views)
 
Old 12-14-2010, 04:30 AM   #4
AlienWolf
LQ Newbie
 
Registered: Dec 2010
Distribution: Ubuntu, RHEL, CentOS
Posts: 7

Original Poster
Rep: Reputation: 0
Got the problem occurring again... help!

kswapd0 has a high CPU usage and is causing a 7% iowait. Server is getting a bit sluggish but nothing critical.
If I enable swap with swapon, the swap gets written to very quickly and that makes the problem worse. I disabled the swap again with swapoff.
The cache seems to be increasing very quickly which is unusual for this server. Here is the output from top:

Code:
top - 10:27:00 up 56 days,  1:18,  2 users,  load average: 4.90, 5.16, 4.95
Tasks: 239 total,   3 running, 236 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.2%us, 27.2%sy,  0.0%ni, 66.7%id,  5.6%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  33032524k total, 22336544k used, 10695980k free,    76996k buffers
Swap:        0k total,        0k used,        0k free,  5817564k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                  
19416 root      15   0 4476m 4.1g 664m S 70.7 12.9   2478:03 qemu-kvm                                                                 
  573 root      11  -5     0    0    0 D 51.5  0.0  94:08.77 kswapd0                                                                  
 2968 root      10  -5     0    0    0 R 37.9  0.0  14073:21 kksmd                                                                    
 6661 root      15   0 2804m 2.5g 274m S 37.9  8.0   7042:30 qemu-kvm                                                                 
19559 root      15   0 4892m 1.7g 703m S 12.6  5.5 264:29.56 qemu-kvm                                                                 
 6430 root      15   0 2791m 2.5g 419m S  7.6  8.0   3803:21 qemu-kvm                                                                 
 5106 root      10  -5     0    0    0 S  1.0  0.0  23:42.15 gfs2_logd                                                                
    1 root      15   0 10352  196   88 S  0.0  0.0   0:34.74 init                                                                     
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:02.07 migration/0                                                              
    3 root      34  19     0    0    0 S  0.0  0.0   0:01.19 ksoftirqd/0                                                              
    4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0                                                               
    5 root      RT  -5     0    0    0 S  0.0  0.0   0:20.57 migration/1                                                              
    6 root      34  19     0    0    0 S  0.0  0.0   0:27.44 ksoftirqd/1                                                              
    7 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/1                                                               
    8 root      RT  -5     0    0    0 S  0.0  0.0   0:34.27 migration/2                                                              
    9 root      34  19     0    0    0 S  0.0  0.0   0:01.77 ksoftirqd/2                                                              
   10 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/2                                                               
   11 root      RT  -5     0    0    0 S  0.0  0.0   0:54.93 migration/3                                                              
   12 root      34  19     0    0    0 S  0.0  0.0   0:03.16 ksoftirqd/3                                                              
   13 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/3                                                               
   14 root      RT  -5     0    0    0 S  0.0  0.0   0:06.03 migration/4                                                              
   15 root      34  19     0    0    0 S  0.0  0.0   0:15.67 ksoftirqd/4                                                              
   16 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/4                                                               
   17 root      RT  -5     0    0    0 S  0.0  0.0   0:01.04 migration/5                                                              
   18 root      34  19     0    0    0 S  0.0  0.0   0:01.70 ksoftirqd/5                                                              
   19 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/5                                                               
   20 root      RT  -5     0    0    0 S  0.0  0.0   0:00.30 migration/6
Just imagine the cache increasing at about 1MB a second.

Before this happened...
I removed the lin_tape module and updated to version 1.48 of the lin_tape module and daemon.
I them restarted bacula and started a quick job to test the tape.
The tape was loaded in the changer and the tape started to wind to the correct location, at this point the kswapd0 process started taking about 50% CPU and the cache started increasing. I tried stopping bacula-sd with no success, the last resort is to rmmod lin_tape...

Help

Last edited by AlienWolf; 12-14-2010 at 04:32 AM.
 
Old 12-14-2010, 12:21 PM   #5
AlienWolf
LQ Newbie
 
Registered: Dec 2010
Distribution: Ubuntu, RHEL, CentOS
Posts: 7

Original Poster
Rep: Reputation: 0
Well there has been an interesting turn of events...

After stopping all non-critical VMs and services on the server, the kswapd daemons all went down to 0% CPU and 0% iowait.
I then restarted bacula and performed a backup. Interestingly I noticed as soon as bacula-sd started winding the tape to the correct location, all cache was purged from memory.
I'm thinking this cache purge is what caused kswapd to start aggressively trying to cache memory again causing a high load on the server.
Well bacula finished the backup successfully and eventually when the cache filled to around 600MB, kswapd went down to 0% CPU and 0% iowait again.
I restarted all of the services and VMs I had shutdown earlier and now the server is back to normal

I won't call this solved yet because when bacula decides to change tapes next this could all kick off again... we will see.

BTW, my understanding of kswapd and memory cache may be way off. If it is I hope someone will enlighten me
 
Old 12-14-2010, 08:33 PM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,353

Rep: Reputation: 1042Reputation: 1042Reputation: 1042Reputation: 1042Reputation: 1042Reputation: 1042Reputation: 1042Reputation: 1042
kswapd doesn't deal with (page) cache directly - and certainly doesn't "cache memory again" (if you mean what I think you might mean).

high kswapd usage is (usually) a symptom of another deeper problem - it's trying to either;
- swap pages out due to a low memory event
- or swap pages in due to storage reference causing a page fault on a swapped out page.
Which can, of course, lead to thrashing.

You look like you're having problems in the DMA zone I can't imagine you'd ever want to see "all_unreclaimable? yes" - the lin_tape might indeed be a good first suspect. Of course it could be an innocent bystander trying to help, just like kswapd. Stopping the VMs having that effect looks a bit suspect.

I wonder if this is exacerbated by the AMD processors being separate nodes - means the memory is split up as well, so no task (bacula) has access to all the (physical) DMA{32} region.
 
  


Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Why doesnt top or free show swap memory usage ? exceed1 Linux - General 15 08-03-2009 08:38 PM
Memory and swap usage ? czezz Solaris / OpenSolaris 6 09-03-2008 04:51 AM
Memory / Swap Usage RoaCh Of DisCor Linux - Newbie 16 05-19-2005 02:45 PM
memory/swap usage koy Linux - Newbie 2 05-18-2004 11:12 AM
Memory Usage 100% XPediTioN Mandriva 1 01-27-2004 02:35 PM


All times are GMT -5. The time now is 05:19 PM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration