LinuxQuestions.org
03-14-2013, 03:46 AM   #1
dann_radkov
Member
 
Registered: Sep 2011
Posts: 52

Rep: Reputation: Disabled
Diagnosing RAM/virtual memory & "cached" usage under RHEL 5.3


Hi Folks,
I have an app running on my host which causes the following statistics:

Code:
[root@host~]# vmstat -n 1 -S M
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0      1   7262   2372  43261    0    0    39    32    0    0  6  1 91  2  0
 0  0      1   7263   2372  43262    0    0     0     0 4505 10440  7  1 92  0  0
 5  1      1   7263   2372  43262    0    0     0   456 4819 10408  8  1 88  2  0
 0  0      1   7260   2372  43262    0    0     0   628 3951 11138 14  2 81  2  0
 1  1      1   7260   2372  43262    0    0     0   740 3989 10665 11  1 84  5  0
 5  1      1   7260   2372  43262    0    0     0   372 3729 10569  8  3 87  2  0
 2  1      1   7260   2372  43262    0    0     0   512 4900 20677 10  2 85  3  0
 2  0      1   7261   2372  43262    0    0     0   320 4427 13693  9  1 88  2  0

Code:
[root@host ~]# free -g
             total       used       free     shared    buffers     cached
Mem:            62         55          7          0          2         42
-/+ buffers/cache:         11         51
Swap:            3          0          3

[root@host ~]# swapon -s
Filename                                Type            Size    Used    Priority
/dev/mapper/VolGroup00-LogVol01         partition       4194296 1176    -1

Basically, the app is responsible for creating all the "cached" sections, which, if I remember correctly, is the data yet to be read from the disk. This causes an awful degradation of the app (Java), and if the app is restarted the cached components clear out and everything works smoothly. My questions are:
Can I trace which particular process/activity keeps dumping information into the cached section?
Given that the host has so much RAM, why does it still show 1 under "swpd" (virtual memory used)?
The top output (when sorted by memory use) looks like this:

Code:
[root@host~]# top
top - 07:39:39 up 357 days, 15:45,  1 user,  load average: 4.14, 3.22, 2.87
Tasks: 427 total,   1 running, 426 sleeping,   0 stopped,   0 zombie
Cpu(s):  7.3%us,  1.6%sy,  0.0%ni, 89.9%id,  0.6%wa,  0.0%hi,  0.6%si,  0.0%st
Mem:  66006556k total, 59347276k used,  6659280k free,  2429344k buffers
Swap:  4194296k total,     1176k used,  4193120k free, 44317580k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
23218 twist     21   0 9311m 2.6g  39m S  0.0  4.2   5:49.53 java
 6195 root      15   0 1703m 1.3g 3520 S 11.6  2.0   4192:47 python2
22075 occ       25   0 1932m 974m  15m S  0.0  1.5  59:29.85 java
 5792 root      15   0  780m 624m 4168 S  0.0  1.0  54:27.16 snmpd
 6369 root      25   0 1592m 576m 8960 S  7.7  0.9  51:40.16 java
27012 root      19   0 2046m 477m 9988 S  0.0  0.7  25:34.59 java
 7210 root      19   0 2068m 434m 9908 S  0.0  0.7  23:07.25 java
24850 root      17   0 3070m 301m  13m S  0.0  0.5 115:23.21 java
22425 root      25   0 1130m 294m  13m S  1.9  0.5   1902:32 java
28692 root      25   0 1884m 258m 9944 S  0.0  0.4 178:11.01 java
28245 root      25   0  731m 128m 9400 S  0.0  0.2  16:44.44 java
19191 root      15   0  377m 119m  12m S 61.7  0.2   8:00.00 python2
32450 appgw    15   0  558m  89m 2504 S 42.4  0.1   9323:08 appgw
  694 appgw    18   0  510m  70m 2660 S 21.2  0.1 131:04.45 appgw
11676 appgw    18   0  459m  52m 2448 S  0.0  0.1   3:42.37 appgw
 4153 root      15   0  288m  52m 3280 S  0.0  0.1  46:47.48 python2
32484 nobody    15   0 91896  26m 1984 S  0.0  0.0  33:22.88 httpd
17716 nobody    15   0 91792  26m 1980 S  0.0  0.0  33:24.11 httpd
31974 nobody    15   0 90664  25m 1984 S  0.0  0.0  32:03.21 httpd
31978 nobody    15   0 90544  25m 1976 S  0.0  0.0  32:03.27 httpd
31981 nobody    15   0 90536  25m 1980 S  0.0  0.0  31:55.35 httpd
31984 nobody    15   0 90596  25m 1984 S  0.0  0.0  32:21.13 httpd
 2353 nobody    15   0 90592  25m 1972 S  0.0  0.0  32:41.79 httpd
31951 nobody    15   0 90500  25m 1972 S  0.0  0.0  31:34.82 httpd
 7274 root      15   0  167m  23m  14m S  0.0  0.0 141:28.92 coda
 7362 root      15   0  223m  18m 2668 S  0.0  0.0   1:50.31 python
 7276 root      15   0  113m  16m 6084 S  0.0  0.0  24:03.83 opcle
18663 nobody    15   0 80452  15m 1976 S  0.0  0.0  17:01.83 httpd
 5301 nobody    15   0 80540  15m 1972 S  0.0  0.0  18:22.64 httpd
18658 nobody    15   0 80456  15m 1972 S  0.0  0.0  16:57.19 httpd
18605 nobody    15   0 80460  15m 1840 S  0.0  0.0  17:07.75 httpd
 9408 root      34  19  250m  14m 1932 S  0.0  0.0   0:00.13 yum-updatesd
17190 nobody    15   0 79672  14m 1972 S  0.0  0.0  15:59.54 httpd
17203 nobody    15   0 79636  14m 1976 S  0.0  0.0  15:49.06 httpd
17225 nobody    15   0 79624  14m 1976 S  0.0  0.0  15:50.63 httpd
17201 nobody    15   0 79708  14m 1972 S  0.0  0.0  16:03.97 httpd
17221 nobody    15   0 79628  14m 1976 S  0.0  0.0  15:49.53 httpd
21277 nobody    15   0 79548  14m 1960 S  0.0  0.0  15:41.97 httpd
16253 nobody    15   0 78076  13m 1964 S  0.0  0.0  14:19.99 httpd
16634 nobody    15   0 78136  12m 1836 S  0.0  0.0  13:43.96 httpd
 6968 root      15   0  268m  12m 5964 S  0.0  0.0   4:33.30 ovcd
 7285 root      15   0  121m  12m 7916 S  0.0  0.0   3:51.53 opcmsga
25395 root      19   0  348m  11m 2476 S  0.0  0.0 120:07.01 python
 3748 root      16   0  135m  11m 3756 S  0.0  0.0   0:30.74 uwsgi
 5090 root      15   0  135m  11m 3756 S  0.0  0.0   0:30.59 uwsgi
28213 root      15   0  135m  11m 3756 S  0.0  0.0   0:30.66 uwsgi
 9275 root      15   0  135m  11m 3756 S  0.0  0.0   0:30.52 uwsgi
11545 root      15   0  135m  11m 3756 S  0.0  0.0   0:29.33 uwsgi
26913 root      15   0  135m  11m 3756 S  0.0  0.0   0:31.10 uwsgi
28288 root      16   0  135m  11m 3756 S  0.0  0.0   0:30.82 uwsgi
27150 root      16   0  135m  11m 3756 S  0.0  0.0   0:29.40 uwsgi
 6997 root      15   0  118m  10m 6876 S  0.0  0.0   0:25.34 ovconfd
 8712 root       0 -20 35972  10m 2052 S  0.0  0.0  11500:20 scopeux
 6977 root      15   0  164m   9m 5288 S  0.0  0.0   3:58.19 ovbbccb
 7342 root      15   0 46468 9144 6140 S  0.0  0.0   0:36.86 opctrapi
 7346 root      15   0  115m 9008 7200 S  0.0  0.0  12:34.53 opcacta
12597 nobody    15   0 73728 8968 1976 S  0.0  0.0   7:04.91 httpd
13869 nobody    15   0 73652 8936 1976 S  0.0  0.0   7:01.61 httpd
17198 nobody    15   0 73596 8936 1976 S  0.0  0.0   7:02.47 httpd
14697 nobody    15   0 73664 8928 1972 S  0.0  0.0   7:03.29 httpd
28772 nobody    15   0 73632 8928 1976 S  0.0  0.0   7:02.46 httpd
10236 nobody    15   0 73592 8916 1972 S  0.0  0.0   7:02.07 httpd
Details of the host itself:
Code:
[root@host~]# cat /proc/meminfo | grep -i memtotal
MemTotal:     66006556 kB
[root@host ~]# uname -r
2.6.18-128.el5
[root@host ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.3 (Tikanga)

Last edited by onebuck; 03-14-2013 at 11:04 AM. Reason: clean thread by using vbcode tag
 
03-14-2013, 08:58 PM   #2
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,139

Rep: Reputation: 1127
Quote:
Originally Posted by dann_radkov
Basically the app is responsible for creating all the "cached" sections,
That is plausible.

Quote:
which, if I remember correctly, is the data yet to be read from the disk.
That is either very badly stated or just wrong.

Quote:
This causes an awful degradation of the app (java)
I don't believe that is correct. The high use of cache does not cause a degradation of the app.

Quote:
if the app is restarted the cached components clear out and everything works just smoothly.
Restarting an application (in contrast to rebooting the OS) usually does not clear the cache of things put into the cache by that application.

If what you seem to be saying is correct, that would be an important clue to some unusual behavior of the app. It is possible the high cache use and the degradation are both symptoms of the same cause. But I don't see enough info to determine the real cause.

One way to create such a set of symptoms would be to create a series of large temp files and open each, then delete it without closing it. This is just an example, not really a guess at what your app might have done.

But you can look at the set of files your app has open. That could be a clue for a wider range of possible problems than the simple example I just described.
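A minimal sketch of that check, assuming a Linux host with /proc mounted. The function name list_deleted_open_files is made up for illustration, and you may need to run it as root (or as the app's user) to read another process's descriptors. It prints the descriptor number, size in bytes, and original path of every file the process still holds open after deletion:

```shell
# List files that a process still holds open after they were deleted.
# Assumption: Linux /proc is mounted. list_deleted_open_files is a
# hypothetical helper name, not a standard tool.
list_deleted_open_files() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        # readlink shows the original path; the kernel appends " (deleted)"
        # when the file has been unlinked but is still open.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)')
                # stat -L follows the fd link to the still-open file itself.
                size=$(stat -L -c %s "$fd" 2>/dev/null || echo '?')
                printf '%s\t%s\t%s\n' "${fd##*/}" "$size" "$target"
                ;;
        esac
    done
}
```

Call it as list_deleted_open_files PID with the PID of the suspect process. If large files show up here, their cached pages would be released when the process exits, which would match the symptom described.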

Quote:
Provided that the host has so much RAM why does it still have 1 under "swpd" (virtual memory used)?
That is normal. Some anonymous data is stale (it has not been touched for a long time). The kernel correctly moves such data to swap once there is enough file data that could make better use of that RAM as cache.
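One way to confirm that the small swpd value is harmless is to check that nothing is actively moving in or out of swap, i.e. that the si and so columns of vmstat stay at zero. A rough sketch, with swap_activity being a made-up helper name that sums those two columns from vmstat output on stdin:

```shell
# Sum the si (swap-in) and so (swap-out) columns of vmstat output on stdin.
# swap_activity is a hypothetical helper name. Column positions are taken
# from the header line rather than hard-coded.
swap_activity() {
    awk '
        $1 == "r" && $2 == "b" {            # the column-name header line
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si_col = i
                if ($i == "so") so_col = i
            }
            next
        }
        si_col && $1 ~ /^[0-9]+$/ {         # data rows only
            si += $si_col; so += $so_col; rows++
        }
        END { printf "samples=%d si=%d so=%d\n", rows, si, so }
    '
}
```

For example: vmstat -n 1 10 | swap_activity. Keep in mind that the first vmstat row reports averages since boot, so it is the later rows that show current activity. In the vmstat output posted above, si and so are 0 in every row.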

Last edited by johnsfine; 03-15-2013 at 08:17 AM.
 
03-15-2013, 08:59 AM   #3
dann_radkov
Member
 
Registered: Sep 2011
Posts: 52

Original Poster
Rep: Reputation: Disabled
Thanks for the help!
Maybe I have some gaps here and there:

Quote:
which, if I remember correctly, is the data yet to be read from the disk.
That is either very badly stated or just wrong.
What would be the more appropriate description then?

Quote:
if the app is restarted the cached components clear out and everything works smoothly.
I should have stated: "If the app is stopped", the memory usage drops and I believe the cached data goes away.
 
03-15-2013, 09:18 AM   #4
johnsfine
Guru
 
Registered: Dec 2007
Distribution: Centos
Posts: 5,139

Rep: Reputation: 1127
Quote:
Originally Posted by dann_radkov
What would be the more appropriate description then?
It is data that has been read from or written to disk. An extra copy is kept in RAM (as long as there is no better use for that RAM) in case the same data is read again, in which case the RAM copy is used instead of rereading the disk.
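Because that cache can be given back whenever something needs the RAM, it helps to subtract it from the "used" figure to see what processes really occupy. A small sketch of that arithmetic, reading /proc/meminfo-style lines on stdin (mem_breakdown is a made-up helper name; it is the same subtraction that the "-/+ buffers/cache" line of free shows):

```shell
# Split "used" memory into application memory and reclaimable cache.
# Reads /proc/meminfo-style lines on stdin. mem_breakdown is a
# hypothetical helper name.
mem_breakdown() {
    awk '
        $1 == "MemTotal:" { total   = $2 }
        $1 == "MemFree:"  { free    = $2 }
        $1 == "Buffers:"  { buffers = $2 }
        $1 == "Cached:"   { cached  = $2 }
        END {
            used = total - free
            printf "used_kb=%d cache_kb=%d apps_kb=%d\n", used, buffers + cached, used - buffers - cached
        }
    '
}
```

Usage: mem_breakdown < /proc/meminfo. With the figures from the top output in the first post (66006556 kB total, 6659280 kB free, 2429344 kB buffers, 44317580 kB cached) this gives used_kb=59347276 cache_kb=46746924 apps_kb=12600352, so only roughly 12 GB is actually held by processes.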

Quote:
I should have stated: "If the app is stopped", the memory usage drops and I believe the cached data goes away.
I assumed that was what you meant: that stopping the app (not restarting it) is what releases the large amount of cache memory.

That is what I said was unusual. Usually data remains in cache after the app that last used the data is gone.
Removing data from cache, when that RAM isn't needed for immediate reuse, should imply the disk copy of the same data has also been made unavailable, such as by dismounting the partition or deleting the file, etc.

That is why I suggested some use (probably filtered) of the lsof command to see which (how many, how big) files your app has open. Files that get deleted when an app exits are typically open while that app is running. The app might do explicit cleanup, including of non-open files, when it is told to exit. So I can't be certain the files would be open. I can't even be sure that files deleted on exit are the explanation of the symptom. But those are my best guesses.
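As one possible way to filter lsof along those lines: its -F option prints one field per line (p = PID, f = descriptor, s = size, n = name), which is easier to process than the default columns. The sketch below ranks open files by size from "lsof -F sn" output on stdin. The name largest_open_files is made up for illustration, and entries without a size (sockets, pipes) are skipped:

```shell
# Rank a process's open files by size, from "lsof -F sn" output on stdin.
# In -F output each line is one field: p=PID, f=descriptor, s=size, n=name.
# largest_open_files is a hypothetical helper name.
largest_open_files() {
    awk '
        function emit() {
            # Print the finished record only if it carried a size.
            if (have_fd && size != "") printf "%s\t%s\n", size, name
            size = ""; name = ""
        }
        /^f/ { emit(); have_fd = 1; next }   # a new descriptor starts a record
        /^p/ { emit(); have_fd = 0; next }   # a new process ends the last one
        /^s/ { size = substr($0, 2); next }
        /^n/ { name = substr($0, 2); next }
        END  { emit() }
    ' | sort -rn
}
```

Usage would be something like lsof -p PID -F sn | largest_open_files | head, substituting the PID of the app. Files that have been unlinked show " (deleted)" after the name, and lsof +L1 is another way to list only those.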
 
  


All times are GMT -5. The time now is 09:08 PM.
