LinuxQuestions.org

hc_andy 12-18-2008 08:06 PM

System running out of RAM
 
Hi There,

I'm new to everything *nix and look after a box running RHEL 3 and Plesk 8.2, which we use for web hosting.

We're finding that the box runs out of memory every few days, to the point where the OOM killer comes along and starts killing processes. I don't really know what's causing all the memory to be chewed up. I can only guess that it's because we have a little over 1000 web sites on this box and every time an httpd request comes in, it chews up a hell of a lot of memory (see the "top" output below, sorted by memory usage).

------------------------------------------------------------------------

12:57:21 up 13:25, 2 users, load average: 0.32, 0.38, 0.67
306 processes: 305 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 1.4% 0.0% 0.2% 0.0% 0.0% 0.1% 97.9%
cpu00 0.7% 0.0% 0.0% 0.0% 0.0% 0.3% 98.8%
cpu01 0.7% 0.0% 0.7% 0.0% 0.3% 0.0% 98.0%
cpu02 3.9% 0.0% 0.3% 0.0% 0.0% 0.3% 95.2%
cpu03 0.3% 0.0% 0.0% 0.0% 0.0% 0.0% 99.6%
Mem: 4112280k av, 3863828k used, 248452k free, 0k shrd, 34888k buff
2382008k actv, 441120k in_d, 73816k in_c
Swap: 522072k av, 127076k used, 394996k free 2053264k cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
18395 apache 15 0 267M 267M 8232 S 0.0 6.6 1:02 2 httpd
24405 apache 25 0 263M 222M 8508 S 0.0 5.5 1:46 0 httpd
25316 apache 25 0 265M 190M 8484 S 0.0 4.7 1:10 0 httpd
28180 apache 16 0 97248 94M 8980 S 0.0 2.3 0:07 2 httpd
28179 apache 15 0 96388 93M 8892 S 0.0 2.3 0:40 3 httpd
28555 apache 16 0 96388 93M 8648 S 0.0 2.3 0:03 3 httpd
28575 apache 15 0 96092 93M 8720 S 0.0 2.3 0:05 1 httpd
28381 apache 15 0 96044 93M 8896 S 0.0 2.3 0:07 1 httpd
29574 apache 15 0 94816 91M 8532 S 0.0 2.2 0:01 2 httpd
27752 apache 15 0 93908 91M 8864 S 0.0 2.2 0:07 3 httpd
28176 apache 15 0 93856 91M 8940 S 0.0 2.2 0:37 2 httpd

------------------------------------------------------------------------

These httpd processes are all sleeping, so what could they be doing? When I ran lsof -p <pid>, it only listed the files the PID has open; it didn't tell me what the process was actually doing.

We've upgraded from 2G of RAM to 4G of RAM but this hasn't helped and gradually the RAM gets chewed up again. We are also looking at migrating some web sites off this box to see if that helps.

Any suggestions on how we can stop the memory from being chewed up, identify any rogue processes eating it, or at least work out why httpd is consuming so much memory?

Thanks.

Andy

syg00 12-18-2008 08:25 PM

32-bit kernel? Let's see the messages when you start getting OOM'd, especially any that mention a (memory) region.

hc_andy 12-18-2008 08:57 PM

# uname -a
Linux plesk2.netspace.net.au 2.4.21-4.ELsmp #1 SMP Fri Oct 3 17:52:56 EDT 2003 i686 i686 i386 GNU/Linux

Here's a screenshot of the DRAC console showing the OOM killer in action when we lost access to the box, even though it still responded to pings.

http://myvws.netspace.net.au/oom.jpg

syg00 12-18-2008 09:52 PM

That's not a lot of use - have a look at your logs.
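
For anyone following along, a minimal sketch of what "have a look at your logs" could mean in practice. It assumes the kernel messages end up in /var/log/messages (the usual syslog file on Red Hat systems) and that OOM-killer lines contain the phrase "out of memory" or mention the OOM killer; the exact wording differs between kernel versions, so the pattern is a guess rather than a documented format. The function names are made up for this example.

```python
import re

# Assumption: OOM-killer messages contain "out of memory" or "oom-kill" /
# "oom kill" in some capitalisation. The exact text varies by kernel version.
OOM_PATTERN = re.compile(r"out of memory|oom[- ]?kill", re.IGNORECASE)


def find_oom_lines(lines):
    """Return (line_number, text) pairs for lines that look OOM-related."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if OOM_PATTERN.search(line)
    ]


def scan_log(path="/var/log/messages"):
    """Scan a syslog file; the default path is an assumption, adjust as needed."""
    with open(path, errors="replace") as handle:
        return find_oom_lines(handle)
```

Calling scan_log() as root and printing the pairs it returns gives the line numbers to read around; the lines just before each hit are usually the interesting ones.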

hc_andy 12-18-2008 10:36 PM

What I'm trying to track down is why all of the memory keeps being consumed, no matter how much RAM we throw at the box, up to the point where the OOM killer kicks in and starts killing processes.

Top shows that the memory hogs are the httpd and java processes, but I would think that these processes would release the memory once they're done doing their thing.

-----

Right now we are sitting with 52M of memory free.

[root@plesk2 myvws.netspace.net.au]# free -m
             total       used       free     shared    buffers     cached
Mem:          4015       3963         52          0         44       1759
-/+ buffers/cache:       2159       1856
Swap:          509        125        384

-----

A top output sorted by memory usage.

15:34:38 up 16:02, 2 users, load average: 0.77, 1.31, 1.27
303 processes: 302 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 4.5% 0.0% 1.0% 0.0% 0.1% 17.3% 76.7%
cpu00 4.0% 0.0% 0.0% 0.0% 0.7% 3.7% 91.4%
cpu01 1.1% 0.0% 0.3% 0.0% 0.0% 30.9% 67.5%
cpu02 12.6% 0.0% 3.3% 0.0% 0.0% 3.7% 80.2%
cpu03 0.3% 0.0% 0.3% 0.3% 0.0% 31.2% 67.6%
Mem: 4112280k av, 4071208k used, 41072k free, 0k shrd, 39476k buff
2831872k actv, 237172k in_d, 88788k in_c
Swap: 522072k av, 128792k used, 393280k free 1817924k cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
18395 apache 15 0 261M 261M 2072 S 0.0 6.5 1:02 2 httpd
6315 apache 25 0 260M 259M 2196 S 0.0 6.4 0:35 0 httpd
24405 apache 25 0 257M 216M 2072 S 0.0 5.3 1:46 0 httpd
25316 apache 25 0 258M 184M 2072 S 0.0 4.5 1:10 0 httpd
6032 psaadm 15 0 163M 163M 21900 S 0.0 4.0 5:37 2 httpsd
5886 psaadm 16 0 161M 161M 21024 S 0.0 4.0 4:33 2 httpsd
8641 apache 15 0 96408 93M 6084 S 0.3 2.3 0:17 0 httpd
8647 apache 15 0 95592 92M 5960 S 0.0 2.3 0:13 1 httpd
9004 apache 17 0 95456 92M 5852 S 4.1 2.3 0:19 2 httpd
6132 apache 15 0 95184 92M 5740 S 0.0 2.3 0:21 0 httpd
8769 apache 15 0 94952 92M 5700 S 0.0 2.2 0:16 1 httpd
8973 apache 15 0 94800 92M 5704 S 0.0 2.2 0:18 3 httpd
10095 apache 15 0 94796 92M 5984 S 0.0 2.2 0:12 3 httpd
8629 apache 15 0 94728 92M 5672 S 0.0 2.2 0:22 1 httpd
8770 apache 15 0 94700 92M 5828 S 0.0 2.2 0:17 3 httpd
9011 apache 15 0 94600 91M 5856 S 0.0 2.2 0:19 2 httpd
9572 apache 15 0 94576 91M 5980 S 0.0 2.2 0:10 2 httpd
8628 apache 15 0 94508 91M 5532 S 0.0 2.2 0:46 0 httpd
8640 apache 15 0 94412 91M 5672 S 0.0 2.2 0:24 2 httpd
8648 apache 15 0 94140 91M 5192 S 0.0 2.2 0:13 3 httpd
10725 apache 15 0 93812 91M 5436 S 0.0 2.2 0:05 1 httpd
10776 apache 15 0 93252 90M 5672 S 0.0 2.2 0:17 3 httpd
10288 apache 15 0 92380 89M 5324 S 0.0 2.2 0:09 3 httpd
6133 apache 15 0 92584 89M 4864 S 0.0 2.2 0:13 1 httpd
11317 apache 15 0 92260 89M 5132 S 0.2 2.2 0:04 0 httpd
11467 apache 15 0 90876 88M 4872 S 0.0 2.1 0:01 1 httpd
11320 apache 15 0 89888 87M 5092 S 0.0 2.1 0:01 1 httpd
11321 apache 15 0 89848 87M 4832 S 0.0 2.1 0:01 1 httpd
23317 apache 15 0 86656 84M 2076 S 0.0 2.1 0:41 3 httpd
32420 root 16 0 75448 73M 2300 S 0.0 1.8 0:00 0 httpd
6045 root 15 0 75332 72M 2172 S 0.0 1.8 1:03 1 httpd
5657 tomcat4 25 0 71092 69M 9524 S 0.0 1.7 0:03 3 java
5669 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:02 0 java
5670 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:00 0 java
5671 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:00 3 java
5675 tomcat4 25 0 71092 69M 9524 S 0.0 1.7 0:00 2 java
5676 tomcat4 25 0 71092 69M 9524 S 0.0 1.7 0:00 2 java
5677 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:01 0 java
5678 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:32 1 java
5703 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:02 3 java
5704 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:00 0 java
5705 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:00 3 java
5706 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:00 2 java
5707 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:00 2 java
5708 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:00 2 java
5709 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:00 2 java
5710 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:02 1 java
5711 tomcat4 15 0 71092 69M 9524 S 0.0 1.7 0:00 2 java

syg00 12-19-2008 03:10 AM

Quote:

Originally Posted by hc_andy (Post 3380722)
Right now we are sitting with 52M of memory free.

[root@plesk2 myvws.netspace.net.au]# free -m
             total       used       free     shared    buffers     cached
Mem:          4015       3963         52          0         44       1759
-/+ buffers/cache:       2159       1856
Swap:          509        125        384

Wrong - you have 1856 MB (effectively) free. But that is possibly irrelevant.
You were asked to peruse the logs - the top listing is pointless. If you expect people to try to help, provide the info requested.
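
To spell out where the "effectively free" figure comes from: buffers and page cache can be reclaimed when applications need memory, so the usable amount is roughly free + buffers + cached. The sketch below does that sum on the Mem: row of the output quoted above. It assumes the older six-column layout of free shown in this thread (newer versions print different columns), and the function name is made up for the example.

```python
def effective_free_mb(free_output):
    """Return free + buffers + cached from the 'Mem:' row of `free -m`."""
    for line in free_output.splitlines():
        parts = line.split()
        if parts and parts[0] == "Mem:":
            # Column order assumed: total used free shared buffers cached
            total, used, free, shared, buffers, cached = map(int, parts[1:7])
            return free + buffers + cached
    raise ValueError("no 'Mem:' row found")


SAMPLE = """total used free shared buffers cached
Mem: 4015 3963 52 0 44 1759
-/+ buffers/cache: 2159 1856
Swap: 509 125 384"""

print(effective_free_mb(SAMPLE))  # 52 + 44 + 1759 = 1855
```

The result is 1 MB off the 1856 that free prints on the "-/+ buffers/cache" line, presumably because free rounds each column separately.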

AnanthaP 12-19-2008 03:28 AM

Most of the processes getting "killed" (as per the screenshot) seem to be the mysql daemon.

So it may be that mysql isn't releasing memory. Could it be some mysql setting?

End

hc_andy 12-21-2008 05:12 PM

Thanks for the replies, guys.

I will keep an eye on the free -m output. I read up a bit on it, and what syg00 says makes sense now.

lazlow 12-21-2008 05:40 PM

Since you are running RHEL and using RHEL requires a support license, why not ask RH what is going on?


All times are GMT -5.