LinuxQuestions.org
Old 05-18-2012, 04:30 PM   #1
transient
LQ Newbie
 
Registered: Aug 2011
Posts: 17

Rep: Reputation: Disabled
Can you help decipher Apache2 server-status output? Are KeepAlives Misconfigured?


Hi all-

I have Apache 2.2.14 running on Ubuntu 10.04. The server has been running for almost a year without incident. It is a front-end web server that passes requests for dynamic content (Java) to Tomcat on other servers using mod_jk. On April 24th the web server started responding sluggishly to client requests (as reported by clients) before ultimately becoming unresponsive. It had to be hard rebooted to bring it back up. I found no evidence of what caused the crash after the server came back up, but I had noticed that there were a lot of Apache processes running (~146) right before everything crapped the bed.

The MaxClients directive was set to the default of 150, and this has been sufficient all this time. I changed it to 200 and almost immediately saw a jump in the number of child processes, like it couldn't wait to launch more. I didn't see anything in the error log about reaching the MaxClients limit, so I'm not sure if I'm chasing squirrels here. In a nutshell, what I'm looking for is help figuring out whether my MaxClients/KeepAlive settings are potential performance problems in my config, or whether I might be looking at some outside influence (compromised system).

The server is running mod_php so it's using the prefork MPM. Here are the relevant lines from apache2.conf:

Code:
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          200
    MaxRequestsPerChild   0
</IfModule>


KeepAlive On
MaxKeepAliveRequests 100

KeepAliveTimeout 15
I enabled the server-status page and briefly turned ExtendedStatus On (didn't want to leave it on if it could decrease performance). Here's what I saw:

Code:
KWKKKKKKKKKKKKKKKKKKKKKKKKKCKKKKKKWKCKKKKKKKKCKKKWSSSSS.........
.....K...................K.....W.......K........................
..K.....K....K..........W...G...KK...................W.......... 
.C.....K........................................................
I had a lot of KeepAlives that I recognized, like this:

Code:
3-3 886 6/6/6 K 0.01 3 1 7.1 0.01 0.01 9.9.9.9 clientname.com POST /RequestHandler HTTP/1.1
That seems like a lot of open connections not necessarily doing anything. I assume that for every one of these there are corresponding child processes hanging around. Should I decrease the KeepAlive Timeout number?
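
To put a number on "a lot", the scoreboard can be tallied by state. The following is a minimal Python sketch, not anything Apache ships: the `tally_scoreboard` helper is a made-up name, and the state descriptions are taken from the key printed on the mod_status page.

```python
from collections import Counter

# State key as printed on the mod_status page.
STATES = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive (read)",
    "D": "DNS lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup of worker",
    ".": "open slot with no current process",
}


def tally_scoreboard(scoreboard: str) -> Counter:
    """Count each state character, ignoring spaces and line breaks."""
    return Counter(ch for ch in scoreboard if not ch.isspace())


# The four scoreboard rows pasted above.
ROWS = """
KWKKKKKKKKKKKKKKKKKKKKKKKKKCKKKKKKWKCKKKKKKKKCKKKWSSSSS.........
.....K...................K.....W.......K........................
..K.....K....K..........W...G...KK...................W..........
.C.....K........................................................
"""

for state, count in tally_scoreboard(ROWS).most_common():
    print(f"{state} {count:4d}  {STATES.get(state, 'unknown')}")
```

Under prefork every character other than "." stands for one child process, so the K count is the number of processes that are parked waiting on an idle keepalive connection.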

The other bit is that along with the above lines in server-status I am seeing things like:

Code:
136-2 747 0/0/0 K 0.00 1335452011 0 0.0 0.00 0.00    
141-2 752 0/0/0 K 0.00 1335452011 0 0.0 0.00 0.00    
152-2 764 0/0/0 W 0.00 1335452011 0 0.0 0.00 0.00    
156-2 768 0/0/0 G 0.00 1335452011 0 0.0 0.00 0.00    
160-2 772 0/0/0 K 0.00 1335452011 0 0.0 0.00 0.00    
161-2 773 0/0/0 K 0.00 1335452011 0 0.0 0.00 0.00    
181-2 793 0/0/0 W 0.00 1335452011 0 0.0 0.00 0.00    
193-2 805 0/0/0 C 0.00 1335452011 0 0.0 0.00 0.00    
199-2 811 0/0/0 K 0.00 1335452011 0 0.0 0.00 0.00
If I understand the scoreboard key correctly, this means that there are child processes that have been in either a KeepAlive (mostly), closing, or replying state that have no connection on the other end (since there are no IP addresses and all other stats are 0). It's essentially like they're stuck or hanging. Is this an accurate understanding? I wouldn't think this is a by-product of too high of a KeepAlive value, which is what makes me wonder if this is something else malicious, or a memory leak or something like that.
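
For anyone who wants to pick these rows out automatically, here is a rough Python sketch. It assumes the column order shown on the ExtendedStatus page (Srv, PID, Acc, M, CPU, SS, Req, Conn, Child, Slot, then Client, VHost, Request when present). The `Slot` type and the `looks_stale` test are illustrative names and a guess at what "stuck" means, not anything defined by Apache.

```python
from typing import NamedTuple, Optional


class Slot(NamedTuple):
    srv: str               # slot number and generation, e.g. "136-2"
    pid: str               # child process ID as printed
    acc: str               # accesses: this connection / this child / this slot
    mode: str              # scoreboard state letter
    ss: int                # seconds since beginning of most recent request
    client: Optional[str]  # client address, or None when the column is empty


def parse_row(line: str) -> Slot:
    """Split one whitespace-separated status row into the fields used here."""
    parts = line.split()
    client = parts[10] if len(parts) > 10 else None
    return Slot(parts[0], parts[1], parts[2], parts[3], int(parts[5]), client)


def looks_stale(slot: Slot) -> bool:
    """Assumed heuristic: no client address and no accesses ever counted."""
    return slot.client is None and slot.acc == "0/0/0"


rows = [
    "3-3 886 6/6/6 K 0.01 3 1 7.1 0.01 0.01 9.9.9.9 clientname.com POST /RequestHandler HTTP/1.1",
    "136-2 747 0/0/0 K 0.00 1335452011 0 0.0 0.00 0.00",
    "152-2 764 0/0/0 W 0.00 1335452011 0 0.0 0.00 0.00",
]

for slot in map(parse_row, rows):
    flag = "STALE?" if looks_stale(slot) else "ok"
    print(slot.srv, slot.pid, slot.mode, slot.ss, flag)
```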

I also converted the seconds in the SS column to get an idea of how long they've been that way, and it comes out to something like 15,000 days! That's not right obviously. Am I doing something wrong in that calculation, or misunderstanding what it really means?
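
The arithmetic itself checks out, as this short Python sketch shows. It also reads the same number as a Unix timestamp, which is only a guess on my part: nothing in the thread confirms that is what the SS column is showing for these rows.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 60 * 60 * 24  # 86,400

ss = 1335452011  # SS value from the rows with no client address

# Reading 1: a plain duration in seconds, converted to days.
days = ss / SECONDS_PER_DAY

# Reading 2 (assumption): seconds since 1970-01-01 00:00:00 UTC.
as_date = datetime.fromtimestamp(ss, tz=timezone.utc)

print(f"{days:.1f} days")   # 15456.6 days
print(as_date.isoformat())  # 2012-04-26T14:53:31+00:00
```

The second reading lands in late April 2012, which is at least consistent with a slot whose "last request" time was never set, so that the counter is effectively measuring from the Unix epoch rather than from a real request.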

Thanks in advance,
SC
 
Old 05-19-2012, 10:31 AM   #2
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 2,866

Rep: Reputation: 698
My configuration looks like this (these are default values):
Code:
#
# Timeout: The number of seconds before receives and sends time out.
#
Timeout 300

#
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive On

#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 100

#
# KeepAliveTimeout: Number of seconds to wait for the next request from the
# same client on the same connection.
#
KeepAliveTimeout 5
And
Code:
# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>
The system has only been up for 42 days (usually a lot longer); this is what ps shows:
Code:
root      2215     1  0 Apr06 ?        00:01:01 /usr/sbin/httpd -k start
apache    2244  2215  0 Apr06 ?        00:00:03 /usr/sbin/httpd -k start
apache    2571  2215  0 Apr06 ?        00:00:04 /usr/sbin/httpd -k start
apache   25830  2215  0 Apr18 ?        00:00:21 /usr/sbin/httpd -k start
apache   27192  2215  0 Apr18 ?        00:00:01 /usr/sbin/httpd -k start
apache   27222  2215  0 Apr18 ?        00:00:02 /usr/sbin/httpd -k start
apache   30932  2215  0 Apr19 ?        00:00:03 /usr/sbin/httpd -k start
apache   30937  2215  0 Apr19 ?        00:00:02 /usr/sbin/httpd -k start
apache   30962  2215  0 Apr19 ?        00:00:01 /usr/sbin/httpd -k start
apache   31133  2215  0 Apr19 ?        00:00:00 /usr/sbin/httpd -k start
apache   31138  2215  0 Apr19 ?        00:00:00 /usr/sbin/httpd -k start
The system was rebooted 06 April.

BTW, there are 86,400 seconds in a day (60 * 60 * 24) if that helps.

There have been a couple of updates to httpd over the past few months; httpd should be at least httpd-2.2.22 and PHP at php-5.3.10 (if you're using PHP). If you are not up to that/those levels, you might want to upgrade.

Do you have ntop installed? Have you watched it? You might be getting whacked from China, Korea or some other lawless land; have a look-see in syslog for break-in attempts. If you're interested, go take a look at ntop: http://www.ntop.org.

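Enabling the basic server-status page doesn't noticeably impact performance. ExtendedStatus On does add some per-request overhead, so switching it on only briefly is reasonable.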

Your configuration looks all right (well, what you showed anyway). You might want to take it back to the defaults (which work just fine in almost every case). Might be worth your time to upgrade httpd and, perhaps, install ntop to see what's what. Also wouldn't hurt to take a look at Tomcat settings (don't know much about Tomcat) and at your web page source, perhaps using Bluefish which can point out problems with a plug-in or two installed in the program.

Hope this helps some.

Last edited by tronayne; 05-19-2012 at 10:33 AM. Reason: Can't type.
 
Old 05-22-2012, 08:09 AM   #3
transient
LQ Newbie
 
Registered: Aug 2011
Posts: 17

Original Poster
Rep: Reputation: Disabled
Thanks for the input tronayne. I already see that your default configuration has a smaller keepalive interval than mine. I didn't edit that value so I wonder if there are different defaults based on the distribution or depending on whether or not you install from source vs. repository. I'm gonna try reducing my timeout interval to 5. I will also take a look at ntop, which is not something I've used before. Thanks for the suggestion; I feel like there are a ton of utilities out there and I'll never know them all.

Also, I have definitely seen attempts from China and Japan to access our site (lots of fishing for phpmyadmin and various files under /var/www that I assume can be default on some systems). I've been blocking those ranges at the firewall level (ASA) in chunks. Does each of those requests open an Apache process (and keep it open for the keepalive time), even if there is no data being served? Could that also explain the server-status entries with no IP addresses and the long hold times (15456 days, which is clearly incorrect since the server hasn't been running that long)?
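
On the keepalive part of the question: under the prefork MPM each connection ties up one child process for as long as it stays open, including the idle wait up to KeepAliveTimeout. A rough way to see what that does to process counts is Little's law, sketched below in Python. The traffic figures are invented for illustration, and the model assumes every client lets the connection idle for the full timeout, which overstates things for clients that close early.

```python
def busy_slots(new_connections_per_sec: float,
               service_seconds: float,
               keepalive_timeout: float) -> float:
    """Little's law: average occupied slots = arrival rate * time each
    connection holds a slot (request handling plus idle keepalive wait)."""
    return new_connections_per_sec * (service_seconds + keepalive_timeout)


# Hypothetical load: 10 new connections per second, 0.2 s of real work each.
RATE = 10.0
SERVICE = 0.2

for timeout in (15, 5):
    slots = busy_slots(RATE, SERVICE, timeout)
    print(f"KeepAliveTimeout {timeout:2d}: about {slots:.0f} busy children")
```

With those made-up numbers the timeout, not the actual work, dominates how many children are held, which is why dropping KeepAliveTimeout from 15 to 5 can free up a large share of the MaxClients budget.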
 
Old 05-23-2012, 09:34 AM   #4
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 2,866

Rep: Reputation: 698
It usually turns out that the default values are pretty good at keeping things under control (which is why I don't ever mess with them unless there's a darned good reason for doing so, eh?). Can't hurt.

One thing that I've had running for years is DenyHosts (http://denyhosts.sourceforge.net/), "DenyHosts is a script intended to be run by Linux system administrators to help thwart SSH server attacks (also known as dictionary based attacks and brute force attacks)." There is truly a massive number of attacks happening all the time, many of which are aimed at things like phpMyAdmin.

DenyHosts is dynamic -- meaning you don't have to fool with it. It watches your logs and detects break-in attempts, making either IPTABLES or /etc/hosts.deny entries that will refuse any further connections from a "bad" IP address. It also, if you want, shares addresses with other DenyHosts sites around the world and records the addresses they report in your /etc/hosts.deny file, so you get the benefit of other users' experience (the effective and easy way).

Where country blocks are effective at blocking the entire country, DenyHosts is effective at blocking the bad actors from both a country IP address and any compromised Windows machines being used to hide behind (which also gets some dodo brain's PC address in Spokane being used in attacks). You can also set it up to send you mail of what's going on.

Might be worth your time to have a look-see.

When you're blocking with IPTABLES the attacker doesn't get to Apache (/etc/hosts.deny only covers services that use TCP wrappers, such as sshd, and a stock Apache is not one of them); check your access_log and error_log files (probably in /var/log/httpd?). If you see attacks in there, take a look at managing access with htaccess, but you're really better off with IPTABLES, which works at the network interface rather than the Apache interface.

Also, get the update for HTTPD -- that's one you really need to do.

And, take a look at your traffic analysis with NTOP, as well as the access_log and the error_log which will help identify problem areas.
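
As a starting point for going through access_log, here is a small Python sketch that counts probe-looking requests per client address. It assumes the log uses the Common or Combined Log Format. The sample lines, the addresses (documentation ranges) and the `PROBE_HINTS` list are all invented for illustration, so adjust them to what actually shows up in your logs.

```python
import re
from collections import Counter

# Start of a Common/Combined Log Format line:
# client identd user [timestamp] "METHOD path ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+)')

# Hypothetical substrings that mark a request as a probe.
PROBE_HINTS = ("phpmyadmin", "/pma/")


def probe_counts(lines, hints=PROBE_HINTS) -> Counter:
    """Count requests per client whose path contains any of the hints."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and any(h in match.group(2).lower() for h in hints):
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines; in practice iterate over the open log file instead.
sample = [
    '203.0.113.5 - - [22/May/2012:08:00:01 -0500] "GET /phpmyadmin/index.php HTTP/1.1" 404 345',
    '203.0.113.5 - - [22/May/2012:08:00:02 -0500] "GET /phpMyAdmin/scripts/setup.php HTTP/1.1" 404 345',
    '198.51.100.7 - - [22/May/2012:08:00:03 -0500] "POST /RequestHandler HTTP/1.1" 200 1024',
]

for client, hits in probe_counts(sample).most_common():
    print(client, hits)
```

Addresses that turn up repeatedly here are the candidates for a firewall block.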

Hope this helps some.
 
  



Tags
apache2, ubuntu

