Ok. It's been 14 hours with no reply. I don't know the answer, but I can supply what I know, plus some speculation...
Quote:
On one of my servers the "free" command tells me that a lot of swap space is in use. What I'd like to do is to determine which processes have been swapped out.
|
You may be able to do this with something other than 'top', but it might be hard to reconcile the result with the swap usage that 'free' reports.
Quote:
I tried issuing "top" and sorting by the "swap" column, but this doesn't seem to provide correct values - when performing the same exercise on another server with close to no pages swapped out, the sum of the swap values for all processes greatly exceeds the swap usage reported by "free".
|
Here I can help. One thing that is ingenious about Linux is that code is read-only, while variables within a program and malloc'd areas are read-write. You can see this if you 'cat /proc/<process-number>/maps'. Each executable is usually listed at least twice: the first entry usually has r-xp attributes and represents the "code segment"; another entry is usually rw-p and is a "data segment" containing variables. The malloc'd areas are rw-p and show 00:00 in the device column with an inode of 0, meaning they are not backed by any file.
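To make the layout of the maps file concrete, here is a minimal Python sketch that splits each line into its fields and labels the region. The names Mapping, parse_maps_line and classify are my own for illustration, and the three labels are a simplification of what I described above, not anything the kernel reports.

```python
import os
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str    # e.g. "r-xp" or "rw-p"
    device: str   # major:minor, "00:00" when no file backs the region
    inode: int    # 0 when no file backs the region
    path: str     # empty, a file path, or a tag such as "[heap]"


def parse_maps_line(line: str) -> Mapping:
    # Fields: address range, permissions, offset, device, inode, optional path.
    fields = line.split(None, 5)
    addr, perms, _offset, device, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(part, 16) for part in addr.split("-"))
    return Mapping(start, end, perms, device, int(inode), path)


def classify(m: Mapping) -> str:
    # Simplified labels (an assumption of this sketch, not kernel terminology).
    if m.inode == 0:
        return "anonymous"   # heap, stack, malloc'd areas
    if "x" in m.perms and "w" not in m.perms:
        return "code"        # read-only, can be re-read from the file
    if "w" in m.perms:
        return "data"        # writable, private copy of file contents
    return "read-only file"


if __name__ == "__main__" and os.path.exists("/proc/self/maps"):
    with open("/proc/self/maps") as maps:
        for raw in maps:
            m = parse_maps_line(raw)
            size_kb = (m.end - m.start) // 1024
            print(f"{classify(m):15} {m.perms} {size_kb:8} kB {m.path}")
```

Run as is, it prints the map of the Python process itself; pointing it at another process means opening /proc/<process-number>/maps instead, which normally requires being that process's owner or root.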
Take the case where the system is running out of real storage and wants to steal a memory page frame. If that area is code (read-only), the system does not need to write it out to swap space, because there is already a copy of it on disk where it was originally loaded from. If the page frame being stolen has been modified, it needs to be written to swap. When a page fault occurs later, code gets "paged" back in from the executable's location on disk, while modified data gets paged back in from swap.
On top of that, once a page has been swapped in from swap space into real memory, its copy is left in swap space. The rationale is that if the page needs to be stolen again in the future and has not been modified since being swapped in, the system can just steal it without doing any I/O. The copy is not cleaned out of swap space unless swap is being exhausted or the process ends.
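The decision described in the last two paragraphs can be written down as a toy model. This is only a sketch of my description, not kernel code: the Page class, its three flags and the steal_action function are all made up for illustration, and the real page reclaim logic has many more cases.

```python
from dataclasses import dataclass


@dataclass
class Page:
    file_backed: bool            # True for code loaded from an executable or library
    modified: bool               # True if changed since it was last read from disk or swap
    has_swap_copy: bool = False  # True if an earlier copy was left behind in swap


def steal_action(page: Page) -> str:
    """Return what the system has to do before reusing the page frame."""
    if page.file_backed and not page.modified:
        # Code: the original is still on disk, so no I/O is needed.
        return "drop"
    if page.has_swap_copy and not page.modified:
        # Swapped in earlier and untouched since: the swap copy is still good.
        return "drop"
    # Modified data has no valid copy anywhere else.
    return "write to swap"


if __name__ == "__main__":
    print(steal_action(Page(file_backed=True, modified=False)))
    print(steal_action(Page(file_backed=False, modified=True)))
    print(steal_action(Page(file_backed=False, modified=False, has_swap_copy=True)))
```

The third case is the one that makes the accounting awkward: the page is resident in memory, yet it still occupies a slot in swap.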
So, top's swap value will not equal used swap space. At times top's swap will be much larger than used swap space (when mostly read-only pages have been stolen), and at other times used swap space may seem bloated compared to what top says is swapped out (when memory pressure has been relieved and everything has been swapped back into real memory). Or you may see a mix of these conditions.
To complicate matters more (from an accounting perspective), shared objects (.so files) and other memory can be shared between processes...
Quote:
So how do I go about determining the swap space used for individual processes?
|
This is speculative (and I hope someone can correct me if I am wrong), but I think you can use the smaps data in the proc filesystem to determine what is out on swap. ('cat /proc/self/smaps' to see your own map.) I think that the 'Size' and 'Rss' values can tell you how big a virtual memory area is and how much of it is in real storage. For swap usage, you'd only be interested in entries that are writable. And if a page was out on swap and then swapped back in, you may not be able to account for the copy left on swap.
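As a rough sketch of that idea, the Python below adds up the 'Size' and 'Rss' values for writable regions only. It also adds up a 'Swap' line per region, which kernels that export it include in smaps; if yours does not, that total simply stays at 0 and you are left with Size minus Rss as an upper bound. The function name summarize_smaps is my own, and the caveat about pages left behind in swap after being swapped back in still applies.

```python
import os
import re

# Region header, e.g. "0060b000-0062c000 rw-p 00000000 00:00 0 [heap]"
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+(\S{4})\s")
# Per-region counter, e.g. "Rss:                  20 kB"
FIELD = re.compile(r"^(\w+):\s+(\d+) kB")


def summarize_smaps(text: str) -> dict:
    """Sum Size, Rss and Swap (in kB) over writable regions only."""
    totals = {"Size": 0, "Rss": 0, "Swap": 0}
    writable = False
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            # Remember whether the region that follows is writable.
            writable = "w" in header.group(1)
            continue
        field = FIELD.match(line)
        if field and writable and field.group(1) in totals:
            totals[field.group(1)] += int(field.group(2))
    return totals


if __name__ == "__main__" and os.path.exists("/proc/self/smaps"):
    with open("/proc/self/smaps") as smaps:
        print(summarize_smaps(smaps.read()))
```

To look at another process, read /proc/<process-number>/smaps instead; that generally needs root or ownership of the process.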
An interesting thing to do on an unimportant server that has multiple swap areas and many heavy-duty, long-running processes (like Oracle): after the system has used a lot of swap and the memory pressure has been relieved, 'swapoff' one swap area and watch how its contents get trimmed down as they are moved to a different area. The pages that faulted and were brought back into memory are not copied to the new swap location.
Also, you'd have to figure out how the copy-on-write mechanism ties into this.
But like I said I'm speculating here and might be completely off-base.
If you actually do figure this out I am certainly interested in your findings. But I think if it could be done someone would already have created a utility to do it.
Good luck.