LinuxQuestions.org

Subject16 01-14-2011 01:40 PM

Memory depletion on server running Tomcat; can't find the cause
 
After a ton of research I believe I now understand the output of ps, top, and free better than ever, and I also have a better grasp of memory management (virtual address space, etc.) than I ever did before. That said, my server is super low on available memory and I can't make 1+1=2 on why. I suspect it's Tomcat/JVM (which I admittedly know precious little about). I am rebuilding this server (for a number of reasons) and plan to install 8GB, but solving this mystery is key to supporting my design plans.

Relevant info:
OS= Ubuntu 8.10, 2.6.27-14-generic, 32-bit
Physical RAM installed= 4GB
Primary running apps: Apache 2.2.9, Tomcat5.5, Java version 1.6

free -m
Code:

             total       used       free     shared    buffers     cached
Mem:          3288       3178        109          0         32         89
-/+ buffers/cache:       3057        230
Swap:         6212       3535       2676

top
Code:

top - 13:18:43 up 24 days, 22:07,  1 user,  load average: 0.00, 0.00, 0.00
Tasks: 269 total,  1 running, 268 sleeping,  0 stopped,  0 zombie
Cpu(s):  0.4%us,  0.2%sy,  0.0%ni, 99.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  3367052k total,  3262996k used,  104056k free,    33044k buffers
Swap:  6361700k total,  3620736k used,  2740964k free,    91572k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 2587 root      15  -5     0    0    0 S    1  0.0   7:56.69 scsi_eh_1
 7726 root      20   0 1793m  42m 3740 S    1  1.3   6:44.90 java
19658 root      20   0  523m  66m 4344 S    1  2.0   6:32.25 java
    1 root      20   0  1880  560  504 S    0  0.0   0:55.17 init
    2 root      15  -5     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root      RT  -5     0    0    0 S    0  0.0   0:00.24 migration/0
    4 root      15  -5     0    0    0 S    0  0.0   0:41.67 ksoftirqd/0
    5 root      RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/0
    6 root      RT  -5     0    0    0 S    0  0.0   0:00.24 migration/1
    7 root      15  -5     0    0    0 S    0  0.0   0:39.70 ksoftirqd/1
    8 root      RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/1
    9 root      RT  -5     0    0    0 S    0  0.0   0:00.44 migration/2
   10 root      15  -5     0    0    0 S    0  0.0   0:39.74 ksoftirqd/2
   11 root      RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/2
   12 root      RT  -5     0    0    0 S    0  0.0   0:00.46 migration/3
   13 root      15  -5     0    0    0 S    0  0.0   0:39.56 ksoftirqd/3
   14 root      RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/3

vmstat 1 5

Code:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b    swpd   free   buff  cache   si   so    bi    bo   in   cs us sy  id wa
 0  0 3608656 102696  37488  85944    1    1     3     4    5    4  0  0  99  0
 0  0 3608656 102688  37488  85968    0    0     0     0  861 1403  0  0 100  0
 0  0 3608656 102588  37488  85968    0    0     0     0  883 1332  0  0  99  0
 0  0 3608656 103348  37488  85968    0    0     0     0  766 1286  0  0 100  0
 0  0 3608656 103432  37492  85968    0    0     0    72  940 1428  0  0 100  0

I also ran a command I found elsewhere on this forum (ps -eo size,pid,user,cmd --sort -size | head -10). It came back with a lot of data (too much to post), but my understanding is that the SIZE value in ps is a rough figure that only describes how much swap would be needed if the entire process had to be paged out at once.

Tomcat runs as a number of private instances, so there's a separate startup script for each project. The default values were kept for the JVM heap sizes: -Xms512M -Xmx1536M

If I understand correctly, this means that when each Tomcat instance starts, it grabs 512MB of its allocated address space and can grow to a maximum of 1.5GB as the need arises. So at startup there are 512MB x however many instances I have installed sitting in virtual memory, but not mapped to physical memory. The mapping only begins when someone actually accesses one of the instances by using the associated webapp. Am I right so far?
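
As a rough sketch of that arithmetic (not a measurement), the Python below multiplies the -Xms/-Xmx values by an instance count and compares the totals with RAM plus swap from the free -m output above. The count of 29 is an assumption taken from the "java (29)" line in the per-program listing posted later in this thread, and the function name is made up for illustration.

```python
# Combined heap arithmetic for several JVMs started with the same flags.
# Heap only: permanent generation, thread stacks and native memory come on top.

def heap_totals_mb(instances, xms_mb=512, xmx_mb=1536):
    """Return (initial, maximum) combined heap size in MB."""
    return instances * xms_mb, instances * xmx_mb


INSTANCES = 29                  # assumption: java process count reported later in the thread
RAM_PLUS_SWAP_MB = 3288 + 6212  # "total" column of Mem and Swap in free -m

initial_mb, maximum_mb = heap_totals_mb(INSTANCES)
print(f"initial heaps combined: {initial_mb} MB")
print(f"maximum heaps combined: {maximum_mb} MB")
print(f"RAM + swap:             {RAM_PLUS_SWAP_MB} MB")
print(f"maximum heaps / (RAM + swap): {maximum_mb / RAM_PLUS_SWAP_MB:.1f}x")
```

With these inputs the combined initial heaps come to 14848 MB and the combined maximums to 44544 MB, against 9500 MB of RAM plus swap. These are ceilings on what the instances could ask for, not what they have touched so far.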

So, I have very little memory left, I am swapping pretty hardcore, and even though I suspect it's the Tomcat/JVM stuff, it sure doesn't look like it from the memory tools. For that matter though it looks like "nothing" is using memory, or certainly not enough to cause such a low memory problem. The server was rebooted 24 days or so ago because it actually ran out of all virtual memory.

How do I solve this mystery? Am I using the wrong tools? Am I misunderstanding my tools? What can I do to track down the processes depleting my memory?

anomie 01-14-2011 02:27 PM

Sort your top(1) display by %MEM for a better answer to your question. After launching top:
  • x
  • b
  • Shift-. (or whatever '>' is on your keyboard)

(Note that 'h' displays help in top.)

Subject16 01-14-2011 03:09 PM

Ah, okay. See, I missed something. The results are:

Code:

Tasks: 278 total,  1 running, 277 sleeping,  0 stopped,  0 zombie
Cpu(s):  0.4%us,  0.2%sy,  0.0%ni, 99.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  3367052k total,  3271876k used,    95176k free,    44032k buffers
Swap:  6361700k total,  3599952k used,  2761748k free,    87584k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 6397 me        20   0  2532 1120  784 R    1  0.0   0:00.04 top
 9041 root      20   0 1828m 220m 3896 S    1  6.7  25:07.37 java
 7726 root      20   0 1793m  42m 3740 S    0  1.3   6:46.58 java
16803 root      20   0  505m  36m 3812 S    0  1.1   6:08.56 java
19658 root      20   0  522m  66m 4344 S    0  2.0   6:35.13 java
23846 root      20   0 1799m 203m 3860 S    0  6.2   1:24.69 java
    1 root      20   0  1880  560  504 S    0  0.0   0:55.42 init

Everything below that init line is 0 across the board. So according to this (assuming I'm reading it correctly), 567MB or so of physical memory is in use by Java for my Tomcat instances. The largest of these looks to be a Java process that was spawned around the time of the last forced reboot. I guess I still don't understand how this translates into no memory on my system. I thought top showed all running processes and their associated resource utilization. It looks like, out of all 278 tasks running, only 7 are using memory at all, yet I'm using almost all of it.
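
To double-check the 567MB figure, here is a minimal Python sketch that adds up the RES column for the java lines pasted above. The helper names are made up for illustration, and it assumes top's usual convention that a bare RES number is in KiB and an "m" suffix means MiB. Keep in mind that RES only counts pages currently resident in RAM, so anything already pushed out to swap is not part of this total.

```python
# Sum the RES column of selected lines from a pasted top listing.

TOP_LINES = """\
 9041 root      20   0 1828m 220m 3896 S    1  6.7  25:07.37 java
 7726 root      20   0 1793m  42m 3740 S    0  1.3   6:46.58 java
16803 root      20   0  505m  36m 3812 S    0  1.1   6:08.56 java
19658 root      20   0  522m  66m 4344 S    0  2.0   6:35.13 java
23846 root      20   0 1799m 203m 3860 S    0  6.2   1:24.69 java
"""


def res_to_mib(field):
    """Convert a RES field such as '220m', '1g' or '1120' (KiB) to MiB."""
    if field.endswith("m"):
        return float(field[:-1])
    if field.endswith("g"):
        return float(field[:-1]) * 1024
    return int(field) / 1024


def sum_res_mib(text, command="java"):
    """Add up RES (6th column) for every line whose last column is `command`."""
    total = 0.0
    for line in text.splitlines():
        cols = line.split()
        if len(cols) >= 12 and cols[-1] == command:
            total += res_to_mib(cols[5])
    return total


print(f"resident java memory: {sum_res_mib(TOP_LINES):.0f} MiB")
```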

syg00 01-14-2011 06:46 PM

Maybe have a read of this thread.
And post your /proc/meminfo.

Nominal Animal 01-15-2011 12:32 AM

In some cases the page cache or the number of cached filesystem inodes or dentries may grow too large. When a previously idling process becomes active, and requires actual memory pages, the kernel must first evict dirty data to disk. To see if this is what happens for you, clear the caches and see what difference it makes to memory use, and to the time it takes a dormant Java instance to respond when it is activated.

To flush all caches, run
Code:

sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'

Any caches left over after that are in active use.

If you run into this often, consider reducing dirty_background_ratio and dirty_ratio, via e.g.
Code:

sudo sh -c 'echo 1 > /proc/sys/vm/dirty_background_ratio'
in a startup script. For further details, see the kernel vm documentation.
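
Before changing anything it may help to record the current values. A minimal Python sketch, assuming the usual /proc/sys/vm layout; the function name and the base parameter are only for illustration (the parameter makes it easy to point the function at a test directory).

```python
# Read the current writeback tunables so they can be restored later.
from pathlib import Path


def read_vm_tunables(names=("dirty_background_ratio", "dirty_ratio"),
                     base="/proc/sys/vm"):
    """Return {name: int value}, or None for a file that cannot be read."""
    values = {}
    for name in names:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None  # missing file, no permission, or not a number
    return values


if __name__ == "__main__":
    for name, value in read_vm_tunables().items():
        print(f"{name} = {value}")
```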
Nominal Animal

Subject16 01-18-2011 11:13 AM

I took a look at the linked thread (and another thread on memory management linked from that one). The lesson from those posts seems to be that top's reporting of individual process memory utilization is not completely accurate, and that the java processes' actual memory usage is likely to be less than shown above because top includes shared libraries. This makes even less sense to me.

Here is /proc/meminfo:

Quote:

MemTotal: 3367052 kB
MemFree: 100156 kB
Buffers: 47944 kB
Cached: 85004 kB
SwapCached: 725416 kB
Active: 2671200 kB
Inactive: 382264 kB
HighTotal: 2485568 kB
HighFree: 4276 kB
LowTotal: 881484 kB
LowFree: 95880 kB
SwapTotal: 6361700 kB
SwapFree: 2524552 kB
Dirty: 460 kB
Writeback: 0 kB
AnonPages: 2488452 kB
Mapped: 21336 kB
Slab: 136736 kB
SReclaimable: 93924 kB
SUnreclaim: 42812 kB
PageTables: 15556 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 8045224 kB
Committed_AS: 16747900 kB
VmallocTotal: 110584 kB
VmallocUsed: 6700 kB
VmallocChunk: 103684 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 4096 kB
DirectMap4k: 8192 kB
DirectMap4M: 909312 kB
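
A few of these numbers are easier to compare once they are subtracted or divided. The Python sketch below does that for a handful of fields copied from the output above. The function names are made up, and the interpretation in the comments follows the kernel's description of /proc/meminfo as I understand it.

```python
# Derive a few ratios from selected /proc/meminfo fields (values in kB).

MEMINFO_SAMPLE = """\
MemTotal: 3367052 kB
AnonPages: 2488452 kB
SwapTotal: 6361700 kB
SwapFree: 2524552 kB
CommitLimit: 8045224 kB
Committed_AS: 16747900 kB
"""


def parse_meminfo(text):
    """Turn 'Name: value kB' lines into a {name: int} dictionary."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and rest.split():
            fields[name.strip()] = int(rest.split()[0])
    return fields


def summarize(fields):
    return {
        # swap currently in use
        "swap_used_kb": fields["SwapTotal"] - fields["SwapFree"],
        # share of RAM holding anonymous (not file-backed) pages
        "anon_share_of_ram": fields["AnonPages"] / fields["MemTotal"],
        # memory promised to processes relative to the commit limit
        "commit_ratio": fields["Committed_AS"] / fields["CommitLimit"],
    }


for key, value in summarize(parse_meminfo(MEMINFO_SAMPLE)).items():
    print(f"{key}: {value}")
```

With the values above, swap in use comes to 3837148 kB, anonymous pages account for roughly 74% of RAM, and Committed_AS is about twice CommitLimit. As far as I know CommitLimit is only enforced when vm.overcommit_memory is set to 2, so exceeding it is not an error in itself.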

Here is the output from the Python script I copied from syg00:

Quote:

Private + Shared = RAM used Program

4.0 KiB + 9.5 KiB = 13.5 KiB logger
4.0 KiB + 10.5 KiB = 14.5 KiB klogd
4.0 KiB + 12.0 KiB = 16.0 KiB dd
12.0 KiB + 11.0 KiB = 23.0 KiB mysqld_safe
4.0 KiB + 20.5 KiB = 24.5 KiB udevd
4.0 KiB + 21.5 KiB = 25.5 KiB hald-addon-acpi
4.0 KiB + 23.0 KiB = 27.0 KiB xinetd
24.0 KiB + 13.0 KiB = 37.0 KiB atd
12.0 KiB + 38.0 KiB = 50.0 KiB acpid
4.0 KiB + 53.5 KiB = 57.5 KiB hald-addon-input
4.0 KiB + 62.0 KiB = 66.0 KiB bluetoothd
4.0 KiB + 68.5 KiB = 72.5 KiB wpa_supplicant
4.0 KiB + 69.0 KiB = 73.0 KiB vsftpd
4.0 KiB + 82.0 KiB = 86.0 KiB system-tools-backends
24.0 KiB + 63.0 KiB = 87.0 KiB getty (6)
96.0 KiB + 12.5 KiB = 108.5 KiB init
68.0 KiB + 52.5 KiB = 120.5 KiB hald-addon-storage
136.0 KiB + 19.0 KiB = 155.0 KiB syslogd
108.0 KiB + 52.0 KiB = 160.0 KiB cron
60.0 KiB + 112.5 KiB = 172.5 KiB cupsd
152.0 KiB + 45.5 KiB = 197.5 KiB hald-runner
248.0 KiB + 34.0 KiB = 282.0 KiB nscd
272.0 KiB + 42.0 KiB = 314.0 KiB ntpd
8.0 KiB + 337.0 KiB = 345.0 KiB gdm (2)
192.0 KiB + 177.5 KiB = 369.5 KiB nm-system-settings
224.0 KiB + 146.0 KiB = 370.0 KiB NetworkManager
308.0 KiB + 94.0 KiB = 402.0 KiB avahi-daemon (2)
116.0 KiB + 319.0 KiB = 435.0 KiB sh (29)
464.0 KiB + 34.5 KiB = 498.5 KiB nagios3
460.0 KiB + 69.0 KiB = 529.0 KiB sendmail-mta
536.0 KiB + 84.5 KiB = 620.5 KiB VVAgent
592.0 KiB + 48.0 KiB = 640.0 KiB mysqld
712.0 KiB + 21.5 KiB = 733.5 KiB dbus-daemon
808.0 KiB + 115.5 KiB = 923.5 KiB hald
716.0 KiB + 339.0 KiB = 1.0 MiB Xorg
116.0 KiB + 1.4 MiB = 1.6 MiB su (29)
1.1 MiB + 565.5 KiB = 1.6 MiB rotatelogs (29)
1.5 MiB + 219.5 KiB = 1.7 MiB console-kit-daemon
832.0 KiB + 1.0 MiB = 1.8 MiB sshd (3)
2.3 MiB + 38.0 KiB = 2.3 MiB bash
2.4 MiB + 462.5 KiB = 2.8 MiB gdmgreeter
5.1 MiB + 177.5 KiB = 5.3 MiB buagent
34.2 MiB + 10.9 MiB = 45.1 MiB apache2 (41)
2.3 GiB + 7.3 MiB = 2.3 GiB java (29)
---------------------------------
2.4 GiB
=================================

Private + Shared = RAM used Program

So, since this script takes shared libraries into account, what I take from this is that Java is using the most actual physical memory (2.3GB). The rest of the memory usage reported in top actually belongs to shared libraries. Does this seem right? So my memory issue lies pretty much with Java, since I can't do anything about the memory being used by shared libraries?
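
For a per-process view of the same Private/Shared split, plus how much of a process sits in swap, the kernel exposes /proc/<pid>/smaps. The Python below is only a sketch of the idea and not the code of the script used above. It assumes smaps reports Private_Clean, Private_Dirty, Shared_Clean, Shared_Dirty and Swap lines in kB (the Swap line may be missing on older kernels), and the sample text and its numbers are invented for the example.

```python
# Total the private, shared and swapped kB over all mappings of one process.
import re

_FIELD = re.compile(r"^(\w+):\s+(\d+) kB$")

# Invented sample in the layout of /proc/<pid>/smaps (two mappings).
SAMPLE_SMAPS = """\
08048000-08052000 r-xp 00000000 08:01 1234 /usr/bin/example
Size:                 40 kB
Rss:                  24 kB
Shared_Clean:         16 kB
Shared_Dirty:          0 kB
Private_Clean:         8 kB
Private_Dirty:         0 kB
Swap:                  0 kB
b7000000-b8000000 rw-p 00000000 00:00 0
Size:              16384 kB
Rss:                4096 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:      4096 kB
Swap:               8192 kB
"""


def summarize_smaps(text):
    """Return {'private': kB, 'shared': kB, 'swap': kB} for smaps text."""
    totals = {"private": 0, "shared": 0, "swap": 0}
    for line in text.splitlines():
        match = _FIELD.match(line.strip())
        if not match:
            continue  # mapping header or a field without a kB value
        name, kb = match.group(1), int(match.group(2))
        if name in ("Private_Clean", "Private_Dirty"):
            totals["private"] += kb
        elif name in ("Shared_Clean", "Shared_Dirty"):
            totals["shared"] += kb
        elif name == "Swap":
            totals["swap"] += kb
    return totals


print(summarize_smaps(SAMPLE_SMAPS))
```

To try it on a live process, read the file as root, for example summarize_smaps(open("/proc/9041/smaps").read()) for one of the java PIDs shown in top.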

Nominal Animal, I appreciate the suggestion to flush caches. I did some reading in a couple of other posts and forums (including http://www.scottklarr.com/topic/134/...e-from-memory/) and am a little nervous about performing this on a live system. Plus, isn't the cache in this case the same as the one reported in free -m? If so, I only have 89MB kept in cache, so it doesn't seem like I would gain much from this. Am I misunderstanding what this does?

Thanks!


All times are GMT -5.