Linux - Server This forum is for the discussion of Linux Software used in a server related context. |
|
|
03-30-2011, 12:25 PM
|
#1
|
LQ Newbie
Registered: Mar 2011
Posts: 3
|
System grinding to a halt (RHEL5 - Apache/Mysql/PHP/suPHP)
Hello all,
I have a system that keeps becoming unusable. After a time (could be 12 hours, could be a week) the system is unresponsive (pages won't load, can't connect via SSH, can't log in at the console). A full reboot fixes the issue for a time.
The system is shared hosting for students, faculty, and staff at a college. I use suPHP to make each user's pages run as their own user (instead of making everything executable by apache).
The system runs Apache 2.2.3, PHP 5.3.5, and MySQL 5.1.55. The non-static pages on the system are mostly WordPress/Drupal sites.
I'm not sure how to narrow this down and identify the root cause rather than the symptoms. I'm including some log excerpts below (from /var/log/messages) from when the problem starts to occur. Unfortunately I don't know if these are indicative of the cause or a secondary effect.
Any advice that can be given would be much appreciated.
from /var/log/messages:
Mar 30 05:15:48 shell kernel: httpd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
Mar 30 05:15:48 shell kernel:
Mar 30 05:15:48 shell kernel: Call Trace:
Mar 30 05:15:48 shell kernel: [<ffffffff800c82e8>] out_of_memory+0x8e/0x2f3
Mar 30 05:15:48 shell kernel: [<ffffffff8000f506>] __alloc_pages+0x27f/0x308
Mar 30 05:15:48 shell kernel: [<ffffffff80012ed7>] __do_page_cache_readahead+0x96/0x179
Mar 30 05:15:48 shell kernel: [<ffffffff8001386c>] filemap_nopage+0x14c/0x360
Mar 30 05:15:48 shell kernel: [<ffffffff8000898c>] __handle_mm_fault+0x1fa/0xfaa
Mar 30 05:15:48 shell kernel: [<ffffffff80067b55>] do_page_fault+0x4cb/0x874
Mar 30 05:15:48 shell kernel: [<ffffffff8005bf02>] del_timer_sync+0xc/0x16
Mar 30 05:15:48 shell kernel: [<ffffffff80098e91>] process_timeout+0x0/0x5
Mar 30 05:15:48 shell kernel: [<ffffffff800f921b>] sys_epoll_wait+0x3b8/0x3f9
Mar 30 05:15:48 shell kernel: [<ffffffff8005ede9>] error_exit+0x0/0x84
Mar 30 05:15:56 shell kernel:
Mar 30 05:15:56 shell kernel: Mem-info:
Mar 30 05:15:56 shell kernel: Node 0 DMA per-cpu:
Mar 30 05:15:56 shell kernel: cpu 0 hot: high 0, batch 1 used:0
Mar 30 05:15:56 shell kernel: cpu 0 cold: high 0, batch 1 used:0
Mar 30 05:15:56 shell kernel: Node 0 DMA32 per-cpu:
Mar 30 05:15:56 shell kernel: cpu 0 hot: high 186, batch 31 used:59
Mar 30 05:15:56 shell kernel: cpu 0 cold: high 62, batch 15 used:14
Mar 30 05:15:56 shell kernel: Node 0 Normal per-cpu: empty
Mar 30 05:15:56 shell kernel: Node 0 HighMem per-cpu: empty
Mar 30 05:15:56 shell kernel: Free pages: 7616kB (0kB HighMem)
Mar 30 05:15:56 shell kernel: Active:102134 inactive:99404 dirty:1 writeback:0 unstable:0 free:1904 slab:9588 mapped-file:985 mapped-anon:201290 pagetables:
Mar 30 05:15:57 shell kernel: Node 0 DMA free:1936kB min:28kB low:32kB high:40kB active:0kB inactive:0kB present:10508kB pages_scanned:0 all_unreclaimable?
Mar 30 05:15:57 shell kernel: lowmem_reserve[]: 0 2004 2004 2004
Mar 30 05:15:57 shell kernel: Node 0 DMA32 free:5680kB min:5712kB low:7140kB high:8568kB active:408536kB inactive:397616kB present:2052256kB pages_scanned:1
Mar 30 05:15:57 shell kernel: lowmem_reserve[]: 0 0 0 0
Mar 30 05:15:57 shell kernel: Node 0 Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 30 05:15:57 shell kernel: lowmem_reserve[]: 0 0 0 0
Mar 30 05:15:57 shell kernel: Node 0 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable?
Mar 30 05:15:57 shell kernel: lowmem_reserve[]: 0 0 0 0
Mar 30 05:15:57 shell kernel: Node 0 DMA: 4*4kB 4*8kB 4*16kB 3*32kB 5*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1936kB
Mar 30 05:15:57 shell kernel: Node 0 DMA32: 20*4kB 6*8kB 1*16kB 17*32kB 6*64kB 2*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 5680kB
Mar 30 05:15:57 shell kernel: Node 0 Normal: empty
Mar 30 05:15:57 shell kernel: Node 0 HighMem: empty
Mar 30 05:15:57 shell kernel: 1444 pagecache pages
Mar 30 05:15:57 shell kernel: Swap cache: add 2690440, delete 2690147, find 1723996/1893421, race 11+253
Mar 30 05:15:57 shell kernel: Free swap = 0kB
Mar 30 05:15:57 shell kernel: Total swap = 2031608kB
Mar 30 05:15:57 shell kernel: Free swap: 0kB
Mar 30 05:15:57 shell kernel: 524288 pages of RAM
Mar 30 05:15:57 shell kernel: 9473 reserved pages
Mar 30 05:15:57 shell kernel: 98209 pages shared
Mar 30 05:15:57 shell kernel: 293 pages swap cached
Mar 30 05:15:57 shell kernel: Out of memory: Killed process 11400, UID 48, (httpd).
Mar 30 05:21:55 shell kernel: gdm-rh-security invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
Mar 30 05:21:55 shell kernel:
Mar 30 05:21:55 shell kernel: Call Trace:
Mar 30 05:21:55 shell kernel: [<ffffffff800c82e8>] out_of_memory+0x8e/0x2f3
Mar 30 05:21:55 shell kernel: [<ffffffff8000f506>] __alloc_pages+0x27f/0x308
Mar 30 05:21:55 shell kernel: [<ffffffff80012ed7>] __do_page_cache_readahead+0x96/0x179
Mar 30 05:21:55 shell kernel: [<ffffffff8001386c>] filemap_nopage+0x14c/0x360
Mar 30 05:21:55 shell kernel: [<ffffffff8000898c>] __handle_mm_fault+0x1fa/0xfaa
Mar 30 05:21:55 shell kernel: [<ffffffff80067b55>] do_page_fault+0x4cb/0x874
Mar 30 05:21:55 shell kernel: [<ffffffff8005ede9>] error_exit+0x0/0x84
Mar 30 05:21:55 shell kernel:
Mar 30 05:21:55 shell kernel: Mem-info:
Mar 30 05:21:55 shell kernel: Node 0 DMA per-cpu:
Mar 30 05:21:55 shell kernel: cpu 0 hot: high 0, batch 1 used:0
Mar 30 05:21:56 shell kernel: cpu 0 cold: high 0, batch 1 used:0
Mar 30 05:21:56 shell kernel: Node 0 DMA32 per-cpu:
Mar 30 05:21:56 shell kernel: cpu 0 hot: high 186, batch 31 used:29
Mar 30 05:21:56 shell kernel: cpu 0 cold: high 62, batch 15 used:14
Mar 30 05:21:56 shell kernel: Node 0 Normal per-cpu: empty
Mar 30 05:21:56 shell kernel: Node 0 HighMem per-cpu: empty
Mar 30 05:21:56 shell kernel: Free pages: 7632kB (0kB HighMem)
Mar 30 05:21:56 shell kernel: Active:102061 inactive:101948 dirty:0 writeback:0 unstable:0 free:1908 slab:9500 mapped-file:977 mapped-anon:203213 pagetables
Mar 30 05:21:56 shell kernel: Node 0 DMA free:1936kB min:28kB low:32kB high:40kB active:0kB inactive:0kB present:10508kB pages_scanned:0 all_unreclaimable?
Mar 30 05:21:56 shell kernel: lowmem_reserve[]: 0 2004 2004 2004
Mar 30 05:21:56 shell kernel: Node 0 DMA32 free:5696kB min:5712kB low:7140kB high:8568kB active:408244kB inactive:407792kB present:2052256kB pages_scanned:1
Mar 30 05:21:56 shell kernel: lowmem_reserve[]: 0 0 0 0
Mar 30 05:21:56 shell kernel: Node 0 Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Mar 30 05:21:56 shell kernel: lowmem_reserve[]: 0 0 0 0
Mar 30 05:21:56 shell kernel: Node 0 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable?
Mar 30 05:21:56 shell kernel: lowmem_reserve[]: 0 0 0 0
Mar 30 05:21:56 shell kernel: Node 0 DMA: 4*4kB 4*8kB 4*16kB 3*32kB 5*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1936kB
Mar 30 05:21:56 shell kernel: Node 0 DMA32: 16*4kB 0*8kB 22*16kB 9*32kB 6*64kB 2*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 5696kB
Mar 30 05:21:56 shell kernel: Node 0 Normal: empty
Mar 30 05:21:56 shell kernel: Node 0 HighMem: empty
Mar 30 05:21:57 shell kernel: 1838 pagecache pages
Mar 30 05:21:57 shell kernel: Swap cache: add 2888867, delete 2888587, find 1743340/1935804, race 12+350
Mar 30 05:21:57 shell kernel: Free swap = 0kB
Mar 30 05:21:57 shell kernel: Total swap = 2031608kB
Mar 30 05:21:57 shell kernel: Free swap: 0kB
Mar 30 05:21:57 shell kernel: 524288 pages of RAM
Mar 30 05:21:57 shell kernel: 9473 reserved pages
Mar 30 05:21:57 shell kernel: 91074 pages shared
Mar 30 05:21:57 shell kernel: 280 pages swap cached
Mar 30 05:21:57 shell kernel: Out of memory: Killed process 11402, UID 48, (httpd).
Mar 30 05:48:29 shell kernel: INFO: task httpd:17918 blocked for more than 120 seconds.
Mar 30 05:48:29 shell kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 30 05:48:29 shell kernel: httpd D 0000000000000002 0 17918 3408 17919 17916 (NOTLB)
Mar 30 05:48:29 shell kernel: ffff81005d52bc88 0000000000000086 ffff81000a9a77e0 ffffffff8008d957
Mar 30 05:48:29 shell kernel: ffff81000a9a77e0 0000000000000007 ffff81000a9a77e0 ffff8100625ef820
Mar 30 05:48:29 shell kernel: 00003a1548add156 000000000000257b ffff81000a9a79c8 00000000680afbc0
Mar 30 05:48:29 shell kernel: Call Trace:
Mar 30 05:48:29 shell kernel: [<ffffffff8008d957>] dequeue_task+0x18/0x37
Mar 30 05:48:29 shell kernel: [<ffffffff8006f1f5>] do_gettimeofday+0x40/0x90
Mar 30 05:48:29 shell kernel: [<ffffffff8005ad5a>] getnstimeofday+0x10/0x28
Mar 30 05:48:29 shell kernel: [<ffffffff80028adc>] sync_page+0x0/0x43
Mar 30 05:48:29 shell kernel: [<ffffffff800647ea>] io_schedule+0x3f/0x67
Mar 30 05:48:29 shell kernel: [<ffffffff80028b1a>] sync_page+0x3e/0x43
Mar 30 05:48:29 shell kernel: [<ffffffff8006492e>] __wait_on_bit_lock+0x36/0x66
Mar 30 05:48:29 shell kernel: [<ffffffff8003ff92>] __lock_page+0x5e/0x64
Mar 30 05:48:30 shell kernel: [<ffffffff800a1bd2>] wake_bit_function+0x0/0x23
Mar 30 05:48:30 shell kernel: [<ffffffff80013988>] filemap_nopage+0x268/0x360
Mar 30 05:48:30 shell kernel: [<ffffffff8000898c>] __handle_mm_fault+0x1fa/0xfaa
Mar 30 05:48:30 shell kernel: [<ffffffff80067b55>] do_page_fault+0x4cb/0x874
Mar 30 05:48:30 shell kernel: [<ffffffff800a1ba4>] autoremove_wake_function+0x0/0x2e
Mar 30 05:48:30 shell kernel: [<ffffffff8005ede9>] error_exit+0x0/0x84
|
|
|
03-30-2011, 12:26 PM
|
#2
|
Senior Member
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278
|
You are running out of memory.
Quote:
gdm-rh-security invoked oom-killer
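For a rough sense of scale, the figures in the posted log can be converted from pages to megabytes. This is only a sketch of the arithmetic: the 4 kB page size is an assumption, although it is consistent with the log reporting `free:1904` pages alongside `Free pages: 7616kB`. The variable names are made up for illustration.

```python
# Values copied from the first oom-killer report in the posted log.
PAGE_KB = 4               # assumed page size; 1904 free pages * 4 kB = 7616 kB, as logged
ram_pages = 524288        # "524288 pages of RAM"
free_pages = 1904         # "free:1904"
anon_pages = 201290       # "mapped-anon:201290"
swap_total_kb = 2031608   # "Total swap = 2031608kB"
swap_free_kb = 0          # "Free swap = 0kB"


def kb_to_mib(kb):
    """Convert kilobytes to mebibytes."""
    return kb / 1024.0


ram_kb = ram_pages * PAGE_KB
free_kb = free_pages * PAGE_KB
anon_kb = anon_pages * PAGE_KB
swap_used_kb = swap_total_kb - swap_free_kb

print("RAM total:      %.1f MiB" % kb_to_mib(ram_kb))
print("RAM free:       %.1f MiB" % kb_to_mib(free_kb))
print("Anonymous RSS:  %.1f MiB" % kb_to_mib(anon_kb))
print("Swap in use:    %.1f MiB of %.1f MiB"
      % (kb_to_mib(swap_used_kb), kb_to_mib(swap_total_kb)))
```

In other words, under that assumption the box has about 2 GiB of RAM with only a few MiB free, and every byte of its roughly 2 GB of swap is in use at the moment the oom-killer fires.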
|
|
|
|
03-30-2011, 12:37 PM
|
#3
|
LQ Newbie
Registered: Mar 2011
Posts: 3
Original Poster
|
Sorry if I didn't make it clear. I understand that I'm running out of memory. My problem is that I'm trying to find out what is taking so much memory. Once the system becomes unresponsive I can't even run top or vmstat to see what is going on.
Aside from that, is there a way to prevent apache/php processes from taking up too much memory? I've thrown more memory at the system and this is FAR from our busiest system. I have systems with a quarter of the memory with four times the load. All giving more memory has done is delay the impact slightly (hours or minutes, not days).
Some process/thread is going out of control and consuming all memory that is available to it.
|
|
|
03-30-2011, 01:55 PM
|
#4
|
LQ Veteran
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,286
|
Given RHEL 5 you should have sysstat (i.e. sar) installed and running, so you should have plenty of historical data for overall consumption. Mark will be along soon to recommend his little baby collectl.
Install that and run it in daemon mode to get historical data per process that you can query later. Otherwise, just knock up a quick script to run ps at intervals - handy because you can select which columns you want to see, and sort them appropriately. Quick and easy.
Simply save to a file for later analysis if you lose the machine again.
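A minimal sketch of such a "run ps at intervals" script is shown here in Python. The function names, the column list, and the log path in the usage note are all hypothetical choices, not something from this thread. It also assumes a reasonably recent Python interpreter; the Python 2.4 that ships with RHEL 5 is too old for it, so a plain shell loop around the same `ps` command may be more practical on that box.

```python
import os
import subprocess
import time

# Columns picked for illustration; any field accepted by `ps -o` would do.
# --sort=-rss puts the largest resident set size first.
PS_COMMAND = ["ps", "-eo", "pid,user,rss,vsz,comm", "--sort=-rss"]


def top_by_rss(ps_output, limit=15):
    """Return the header line plus the first `limit` process lines."""
    lines = [line for line in ps_output.splitlines() if line.strip()]
    return lines[: limit + 1]


def format_snapshot(ps_output, stamp, limit=15):
    """Build one timestamped block of text ready to append to the log."""
    body = "\n".join(top_by_rss(ps_output, limit))
    return "=== %s ===\n%s\n\n" % (stamp, body)


def record_snapshots(path, interval_secs=60, count=None):
    """Append a snapshot every interval; loop forever if count is None."""
    taken = 0
    while count is None or taken < count:
        output = subprocess.check_output(PS_COMMAND, universal_newlines=True)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(path, "a") as log:
            log.write(format_snapshot(output, stamp))
            log.flush()
            # Push the data to disk so it survives a hard reboot.
            os.fsync(log.fileno())
        taken += 1
        if count is None or taken < count:
            time.sleep(interval_secs)
```

Something like `record_snapshots("/var/tmp/ps-rss.log", interval_secs=60)` left running in the background would then leave a record of the biggest processes, and the users owning them, right up to the point the machine stops responding.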
Last edited by syg00; 03-30-2011 at 01:57 PM.
Reason: last sentence
|
|
|
03-30-2011, 02:59 PM
|
#5
|
LQ Newbie
Registered: Mar 2011
Posts: 3
Original Poster
|
I've used SAR before (it's great for I/O monitoring) but I wasn't aware it could give info on a per-process basis. I will definitely have to look into that in more detail.
The collectl package you mentioned definitely sounds VERY useful. I'm giving that a try now and will use it to keep some process information (collectl -sZ). If I find anything in particular that is causing an issue, I'll report back here in case the issue has occurred or will occur for others.
Thanks much for the suggestion, syg00.
|
|
|
03-30-2011, 09:31 PM
|
#6
|
LQ Guru
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,415
|
You could also look at ulimit, but ultimately you need to find the memory leak. Check the PHP code.
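To illustrate what a ulimit-style cap does, here is a small sketch using Python's standard `resource` module to set an address-space limit (the same limit `ulimit -v` controls) on a child process. The helper name and the 1 GiB figure are hypothetical, it assumes Linux and a recent Python, and it says nothing about how such a limit would be wired into Apache or suPHP on this particular server.

```python
import resource
import subprocess
import sys


def run_with_address_space_cap(argv, max_bytes):
    """Run argv with RLIMIT_AS set to max_bytes, in the child only."""

    def apply_limit():
        # Runs in the child between fork() and exec(); the parent keeps
        # its own limits.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(
        argv,
        preexec_fn=apply_limit,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


# A stand-in for a runaway script: it tries to allocate 4 GiB at once.
HUNGRY_CHILD = [sys.executable, "-c", "data = bytearray(4 * 1024 ** 3)"]
```

Calling `run_with_address_space_cap(HUNGRY_CHILD, 1024 ** 3)` should make the child fail with its own out-of-memory error instead of pushing the whole machine into swap. A cap like this only contains the damage, though; the leaking code still has to be found.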
|
|
|
All times are GMT -5. The time now is 02:31 PM.