100% Swap usage with loads of free memory
Hi everybody,
I wonder if anybody could help shed light on an obscure issue I faced on one of my servers this morning that very nearly caused it to crash. Nagios alerted me that the server had a very high load average, exceeding the critical level (17+). When I logged onto the server I found that all 4GB of swap was in use, despite there being 15GB+ of free memory (and that's not even counting cache and buffers!). Because it seems all the heavily used pages were being stored in swap, the I/O wait on the server became very high, and four kswapd daemons were each taking nearly 100% of the available CPU. This coincided with an error reported by Bacula during a backup job while changing to a bad tape... From /var/log/bacula.log: Code:
10-Dec 02:11 bacula-sd JobId 1898: End of medium on Volume "4097" Bytes=434,170,000,000 Blocks=217,084 at 10-Dec-2010 02:11. Code:
Dec 10 03:51:47 07 kernel: Mem-info:

Server spec:
- 4 * Dual-Core AMD Opteron(tm) Processor 8214
- 32GB DDR2 ECC RAM
- RHEL 5.5, kernel 2.6.18-194.11.3.el5 SMP x86_64
- Running many KVM VMs (all CentOS x64); ksmd is in use
- bacula-dir version 5.0.0
- IBM tape drive using the lin_tape module, version 1.34.0 according to modinfo

And before anybody asks ;) Code:
# sysctl vm.swappiness
vm.swappiness = 10
Any insight would be greatly appreciated. |
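For reference, the numbers above can be pulled straight from /proc without waiting for Nagios; a minimal sketch, assuming the standard 2.6-era /proc layout:

```shell
# Swap totals and current swappiness read directly from /proc
# (these files are standard on 2.6 and later kernels).
awk '/^Swap(Total|Free):/ {print $1, $2, $3}' /proc/meminfo
echo "vm.swappiness = $(cat /proc/sys/vm/swappiness)"
```

Handy to drop into a cron job or a Nagios plugin so the raw figures are captured alongside the alert.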
The messages lines look like part of a kernel OOPS or BUG. Can you post the start of them? A snapshot of "free" and "top" would also have helped. For example, "top" would have told us whether your load was in IOWAIT (which it very well might have been, due to waiting for the bad tape).
|
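To make sure there's something concrete to post next time, something like this could be left running from cron during backups. This is just a sketch; the log path and the choice of fields are arbitrary:

```shell
#!/bin/sh
# Hypothetical snapshot helper: append a timestamped memory/load
# snapshot to a log, so there's evidence to post after the next incident.
# The default log path is an assumption for this example.
LOG="${1:-/tmp/mem-snapshot.log}"
{
  echo "=== $(date) ==="
  # The key /proc/meminfo fields: RAM, reclaimable cache, and swap
  grep -E '^(MemTotal|MemFree|Buffers|Cached|SwapTotal|SwapFree):' /proc/meminfo
  echo "loadavg: $(cat /proc/loadavg)"
} >> "$LOG"
echo "snapshot appended to $LOG"
```

Running it every minute costs essentially nothing and gives a timeline of when the cache dropped and swap filled.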
Attached is the full log from /var/log/messages, from when the problem occurred until it was fixed.
I think the first few stack traces are just because the bacula process hung while waiting for the tape to wind to the correct location, not because it had actually crashed. There are then a few page allocation failures from the lin_tape module; maybe that is the culprit? Unfortunately I didn't think to save the output of top or free. I do remember that the iowait% was very high, maybe around 80%. I don't think it was caused by bacula waiting for the tape (an iostat would have confirmed it); I think the iowait was due to memory being paged from swap back into RAM. I also remember there were four processes (kswapd0-3) each taking approx 99% CPU. The free output showed that swap was 100% used while at least 15GB of RAM was free, excluding buffers and cache. |
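If this happens again, it would help to know which processes are actually holding the swap. A sketch of how to list that from /proc; note it relies on the VmSwap field, which only exists on kernels 2.6.34 and later, so it wouldn't have worked on the 2.6.18 box here, but may help anyone hitting the same symptom on a newer kernel:

```shell
# List the top swap consumers per process, largest first.
# VmSwap appears in /proc/<pid>/status on kernels 2.6.34+ only.
printf "PID\tSWAP_kB\tNAME\n"
for f in /proc/[0-9]*/status; do
  [ -r "$f" ] || continue
  swap=$(awk '/^VmSwap:/ {print $2}' "$f")
  # Skip processes with no swapped pages (or kernels without VmSwap)
  [ -n "$swap" ] && [ "$swap" -gt 0 ] || continue
  pid=${f#/proc/}; pid=${pid%/status}
  name=$(awk '/^Name:/ {print $2}' "$f")
  printf "%s\t%s\t%s\n" "$pid" "$swap" "$name"
done | sort -k2 -rn | head
```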
Got the problem occurring again... help!
kswapd0 has a high CPU usage and is causing 7% iowait. The server is getting a bit sluggish, but nothing critical. If I re-enable swap with swapon, the swap gets written to very quickly and that makes the problem worse, so I disabled it again with swapoff. The cache also seems to be increasing very quickly, which is unusual for this server. Here is the output from top: Code:
top - 10:27:00 up 56 days, 1:18, 2 users, load average: 4.90, 5.16, 4.95

Before this happened, I removed the lin_tape module and updated to version 1.48 of the lin_tape module and daemon. I then restarted bacula and started a quick job to test the tape. The tape was loaded in the changer and started winding to the correct location; at this point the kswapd0 process started taking about 50% CPU and the cache started increasing. I tried stopping bacula-sd with no success; the last resort is to rmmod lin_tape... Help :) |
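Before toggling swapon/swapoff again, a rough headroom check might be worthwhile - swapoff forces everything in swap back into RAM, so it only makes things worse when there isn't room for it. A sketch using /proc/meminfo (field names as on 2.6-era and later kernels):

```shell
# Compare what's in swap against free RAM plus reclaimable page cache.
# All figures are in kB, parsed from /proc/meminfo.
eval "$(awk '/^MemFree:/   {print "free="$2}
             /^Cached:/    {print "cached="$2}
             /^SwapTotal:/ {print "st="$2}
             /^SwapFree:/  {print "sf="$2}' /proc/meminfo)"
swap_used=$((st - sf))
if [ $((free + cached)) -gt "$swap_used" ]; then
  echo "headroom OK: $((free + cached)) kB reclaimable vs $swap_used kB in swap"
else
  echo "not enough headroom: swapoff would force heavy paging"
fi
```

This is only a rule of thumb (Cached includes pages that can't all be dropped instantly), but it catches the obviously bad case.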
Well there has been an interesting turn of events...
After stopping all non-critical VMs and services on the server, the kswapd daemons all went down to 0% CPU and 0% iowait. I then restarted bacula and performed a backup. Interestingly, I noticed that as soon as bacula-sd started winding the tape to the correct location, all of the cache was purged from memory. I'm thinking this cache purge is what caused kswapd to start aggressively trying to repopulate the cache, causing the high load on the server. Bacula finished the backup successfully, and eventually, when the cache filled back up to around 600MB, kswapd went down to 0% CPU and 0% iowait again. I restarted all of the services and VMs I had shut down earlier and now the server is back to normal :D I won't call this solved yet, because when bacula next decides to change tapes this could all kick off again... we will see. BTW, my understanding of kswapd and the memory cache may be way off. If it is, I hope someone will enlighten me :) |
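One way to confirm the paging story next time is to sample the kernel's swap counters while the backup runs. A minimal sketch (the 2-second interval is an arbitrary choice):

```shell
# Sample the cumulative swap-in/out page counters twice; if pswpin
# climbs while Cached shrinks, that matches the "cache purged, then
# heavy paging" pattern described above.
sample() { awk '/^pswp(in|out) / {print $1"="$2}' /proc/vmstat; }
echo "before: $(sample | tr '\n' ' ')"
sleep 2
echo "after:  $(sample | tr '\n' ' ')"
```

The same counters drive the si/so columns in vmstat, if that tool is preferred.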
kswapd doesn't deal with the (page) cache directly - and it certainly doesn't "cache memory again" (if you mean what I think you might mean).
High kswapd usage is (usually) a symptom of another, deeper problem - it's trying either:
- to swap pages out due to a low-memory event, or
- to swap pages in because a storage reference caused a page fault on a swapped-out page.
Either of which can, of course, lead to thrashing. You look like you're having problems in the DMA zone - I can't imagine you'd ever want to see "all_unreclaimable? yes". The lin_tape module might indeed be a good first suspect; of course, it could also be an innocent bystander trying to help, just like kswapd. Stopping the VMs having that effect looks a bit suspect, too. I wonder if this is exacerbated by the AMD processors being separate NUMA nodes - that means the memory is split up as well, so no single task (bacula) has access to the whole (physical) DMA{32} region. |
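To check the per-zone theory, the zone state can be inspected directly; a sketch assuming the usual /proc files:

```shell
# Per-node, per-zone free pages, split by allocation order. On a
# multi-socket Opteron box, the DMA and DMA32 zones live on node 0
# only, so one zone can be starved while "free" still reports
# gigabytes available overall.
cat /proc/buddyinfo
# kswapd wakes when a zone's free pages drop below its watermarks;
# pair each "pages free" figure with its zone header:
awk '/^Node/ {node=$0} /pages free/ {print node " |" $0}' /proc/zoneinfo
```

If the DMA/DMA32 lines sit near their min watermark during a backup while Normal stays healthy, that would point at zone-constrained allocations (e.g. from lin_tape) rather than a genuine memory shortage.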