LinuxQuestions.org


Dan666 09-21-2013 01:56 PM

Issues with mysqld, httpd, io r/w, cpu usage on CentOS 6.4
 
Hey all,

I'm having load time issues with a few sites I'm hosting on a dedicated server. I'll try to be as detailed as I can.

I'm hosting a few sites (a phpBB forum and some other simple sites) on CentOS 6.4, with 8GB of RAM and a quad-core processor. Two hard disks in software raid1.
A few days ago I had to move the machine from one DC to another (the new one is better :) and the issues have occurred ever since.
Currently I'm seeing slow loading times on almost all sites, mainly on the phpBB forum (the forum is quite big: ~10,000 registered users, 250,000 posts).
My PHP version is 5.4.16, MySQL is 5.5.32 and Apache is 2.4.0.
I've been looking into almost everything on the machine and I found a few... let's say symptoms that might be causing this issue, but I don't know how to proceed in investigating them.

I started monitoring memory usage, but there are no issues there.
I started receiving the following reports:

Code:

Time:                    Sat Sep 21 20:12:53 2013 +0300
1 Min Load Avg:          10.22
5 Min Load Avg:          6.17
15 Min Load Avg:        4.81
Running/Total Processes: 2/311
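
To put the report above in perspective, here is a minimal sketch that divides each load average by the number of cores. The core count of 4 comes from the "Quad Core" description in the post; the 1.0 threshold is only a rule of thumb, not a hard limit. Keep in mind that on Linux the load average also counts processes blocked in uninterruptible I/O wait, so a high load with mostly idle CPUs can point at the disks rather than the processor.

```python
def load_per_core(load_avg: float, cores: int) -> float:
    """Return the load average divided by the number of CPU cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


# Values copied from the report above.
REPORT = {"1 min": 10.22, "5 min": 6.17, "15 min": 4.81}
CORES = 4  # assumption: "Quad Core" means 4 logical CPUs

for window, load in REPORT.items():
    ratio = load_per_core(load, CORES)
    # Rule of thumb only: above 1.0 per core, work is queuing up.
    flag = "over capacity" if ratio > 1.0 else "ok"
    print(f"{window}: {ratio:.2f} per core ({flag})")
```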

CPU usage for one of the cores goes up to 100% for 1 or 2 seconds and then drops down again. Most of the time it's an httpd process.
I also monitored iotop and this caught my attention, although I have no idea what it means:

Code:

  958 be/3 root        0.00 B/s  31.26 K/s  0.00 % 99.99 % [jbd2/dm-2-8]
  457 be/3 root        0.00 B/s    0.00 B/s  0.00 % 99.99 % [jbd2/dm-1-8]
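
For readers unsure how to read those two lines: jbd2 is the kernel's journaling thread for ext4 filesystems, and the second percentage column in iotop is the share of time the thread spent waiting on I/O. The sketch below parses lines in the layout shown above and flags threads above a threshold. The function names and the 90% threshold are illustrative choices, and the parser assumes exactly the column layout pasted in the post.

```python
IOTOP_LINES = [
    "  958 be/3 root        0.00 B/s  31.26 K/s  0.00 % 99.99 % [jbd2/dm-2-8]",
    "  457 be/3 root        0.00 B/s    0.00 B/s  0.00 % 99.99 % [jbd2/dm-1-8]",
]


def parse_iotop_line(line: str) -> dict:
    """Split one iotop row: TID PRIO USER READ WRITE SWAPIN IO COMMAND."""
    parts = line.split()
    return {
        "tid": int(parts[0]),
        "prio": parts[1],
        "user": parts[2],
        "read": " ".join(parts[3:5]),    # e.g. "0.00 B/s"
        "write": " ".join(parts[5:7]),   # e.g. "31.26 K/s"
        "swapin_pct": float(parts[7]),
        "io_pct": float(parts[9]),       # time spent waiting on I/O
        "command": " ".join(parts[11:]),
    }


def io_bound_threads(lines, threshold=90.0):
    """Return the commands whose I/O wait percentage meets the threshold."""
    rows = [parse_iotop_line(line) for line in lines if line.strip()]
    return [row["command"] for row in rows if row["io_pct"] >= threshold]


print(io_bound_threads(IOTOP_LINES))
```

A journaling thread that waits on I/O almost all the time while writing only a few KB/s suggests the underlying device is slow to complete writes, which is worth checking before tuning the applications on top of it.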

These are my conf file configurations:

Code:

<IfModule prefork.c>
StartServers      8
MinSpareServers    5
MaxSpareServers  20
ServerLimit      256
MaxClients      256
MaxRequestsPerChild  20
</IfModule>

<IfModule worker.c>
StartServers        4
MaxClients        100
MinSpareThreads    20
MaxSpareThreads    50
ThreadsPerChild    25
MaxRequestsPerChild  0
</IfModule>
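
As a rough sanity check on the prefork settings above, the sketch below estimates the worst-case memory if all MaxClients children are running at once. The 30 MB per-child figure is a hypothetical placeholder, not a measurement from this server; the real resident size of an httpd child with PHP loaded should be measured on the machine itself. MaxRequestsPerChild is the number of requests a child serves before it exits, so the second function shows how often children would be replaced at a given (also hypothetical) request rate.

```python
def prefork_worst_case_mb(max_clients: int, rss_mb_per_child: float) -> float:
    """Memory used if every allowed prefork child is alive at once."""
    return max_clients * rss_mb_per_child


def children_recycled_per_second(requests_per_second: float,
                                 max_requests_per_child: int) -> float:
    """How many children exit (and get re-forked) per second."""
    if max_requests_per_child == 0:
        return 0.0  # 0 means children are never recycled
    return requests_per_second / max_requests_per_child


MAX_CLIENTS = 256             # from the prefork block above
MAX_REQUESTS_PER_CHILD = 20   # from the prefork block above
TOTAL_RAM_MB = 8 * 1024       # 8GB, as stated in the post
ASSUMED_RSS_MB = 30.0         # hypothetical; measure on the real server
ASSUMED_REQ_RATE = 50.0       # hypothetical requests per second

worst = prefork_worst_case_mb(MAX_CLIENTS, ASSUMED_RSS_MB)
print(f"worst case: {worst:.0f} MB of {TOTAL_RAM_MB} MB")
print("children recycled per second:",
      children_recycled_per_second(ASSUMED_REQ_RATE, MAX_REQUESTS_PER_CHILD))
```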

Code:

#Mysql Tunning
max_connections = 500
join_buffer_size = 512M
tmp_table_size = 24M
max_heap_table_size = 24M
query_cache_size = 1024M
query_cache_limit = 4M
#log_query_time = 2
key_buffer = 256M
key_buffer_size = 1332M
thread_cache_size = 16K
table_cache = 512K
table_definition_cache = 4K
open_files_limit = 3K
table_open_cache = 786
tmp_table_size = 256M
max_heap_table_size = 256M
innodb_buffer_pool_size = 768M
local-infile=0
low_priority_update=1
concurrent_insert=ALWAYS
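
The MySQL settings above set some variables more than once. A minimal sketch for spotting that is shown below; it works on a pasted copy of the settings, treats `-` and `_` in names as equivalent, and assumes that when an option is repeated the last occurrence is the one that takes effect. The helper names are illustrative, and the "global buffers" total only adds three server-wide caches; it ignores per-connection buffers such as join_buffer_size.

```python
from collections import defaultdict

MY_CNF = """
#Mysql Tunning
max_connections = 500
join_buffer_size = 512M
tmp_table_size = 24M
max_heap_table_size = 24M
query_cache_size = 1024M
query_cache_limit = 4M
key_buffer = 256M
key_buffer_size = 1332M
table_open_cache = 786
tmp_table_size = 256M
max_heap_table_size = 256M
innodb_buffer_pool_size = 768M
local-infile=0
"""

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_settings(text):
    """Map each option name to the list of values it was given, in order."""
    seen = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        seen[key.replace("-", "_")].append(value)
    return seen


def parse_size(value):
    """Convert '256M' style values to bytes; plain numbers pass through."""
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)


settings = parse_settings(MY_CNF)
duplicates = {k: v for k, v in settings.items() if len(v) > 1}
effective = {k: v[-1] for k, v in settings.items()}  # assumption: last wins

print("set more than once:", duplicates)
global_mb = sum(
    parse_size(effective[name])
    for name in ("query_cache_size", "key_buffer_size", "innodb_buffer_pool_size")
) // UNITS["M"]
print("global buffers (MB):", global_mb)
```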

The httpd access log is flooded with this kind of message, but from what I found on the net, this is normal:

Code:

::1 - - [21/Sep/2013:21:55:28 +0300] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.15 (CentOS) (internal dummy connection)"
::1 - - [21/Sep/2013:21:55:29 +0300] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.15 (CentOS) (internal dummy connection)"
::1 - - [21/Sep/2013:21:55:30 +0300] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.15 (CentOS) (internal dummy connection)"
::1 - - [21/Sep/2013:21:55:31 +0300] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.15 (CentOS) (internal dummy connection)"
::1 - - [21/Sep/2013:21:55:35 +0300] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.15 (CentOS) (internal dummy connection)"
::1 - - [21/Sep/2013:21:55:40 +0300] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.15 (CentOS) (internal dummy connection)"
::1 - - [21/Sep/2013:21:57:13 +0300] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.15 (CentOS) (internal dummy connection)"
::1 - - [21/Sep/2013:21:57:14 +0300] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.15 (CentOS) (internal dummy connection)"
::1 - - [21/Sep/2013:21:57:15 +0300] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.15 (CentOS) (internal dummy connection)"
::1 - - [21/Sep/2013:21:57:17 +0300] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.2.15 (CentOS) (internal dummy connection)"

Netstat shows a lot of this:

Code:

tcp        0      0 my.ip:40100        my.ip:80            TIME_WAIT 
tcp        0      0 my.ip:40101        my.ip:80            TIME_WAIT 
tcp        0      0 my.ip:40102        my.ip:80            TIME_WAIT 
tcp        0      0 my.ip:40103        my.ip:80            TIME_WAIT 
tcp        0      0 my.ip:40107        my.ip:80            TIME_WAIT 
tcp        0      0 my.ip:40108        my.ip:80            TIME_WAIT 
tcp        0      0 my.ip:40109        my.ip:80            TIME_WAIT

And

Code:

netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
     13 ESTABLISHED
      8 FIN_WAIT2
     19 LAST_ACK
      8 LISTEN
     97 TIME_WAIT
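
The same per-state tally can be sketched in Python for anyone who wants to log it over time. This version takes captured `netstat -an` text as input; the sample rows reuse the `my.ip` placeholder from the post plus one made-up LISTEN row, and the function name is illustrative. Unlike the awk filter, which matches any line containing "tcp", it only counts rows whose first field starts with "tcp".

```python
from collections import Counter

SAMPLE = """\
tcp        0      0 my.ip:40100        my.ip:80            TIME_WAIT
tcp        0      0 my.ip:40101        my.ip:80            TIME_WAIT
tcp        0      0 my.ip:40102        my.ip:80            TIME_WAIT
tcp        0      0 0.0.0.0:80         0.0.0.0:*           LISTEN
"""


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state from `netstat -an` style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


for state, total in sorted(count_tcp_states(SAMPLE).items()):
    print(f"{total:7d} {state}")
```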

I also noticed something strange that didn't happen before. I don't know if it is relevant or not. Sometimes the SSH session just hangs for a few seconds (1-4), for example while typing or moving up and down pages.

So I've been banging my head against this issue for the last 4-5 days and I'm at a dead end. I have no idea what to do or what is causing this, since no changes were made other than the migration.

KoopaTroopa 09-21-2013 04:51 PM

Do a netstat -nap to find out which process is making your server connect to itself, then kill it.

Dan666 09-22-2013 02:58 AM

A lot of these:

Code:

tcp        0      0 my.ip:41412        my.ip:http          TIME_WAIT  -                 
tcp        0      0 my.ip:41441        my.ip:http          TIME_WAIT  -                 
tcp        0      0 my.ip:53667        my.ip:http          TIME_WAIT  -

All http processes.

Edit: OK. So I can officially rule out an issue with the MySQL database. I moved the most heavily loaded DB to another server; loading times for the DB went from 7s to 0.36s, but the site (the forum in my case) still loads extremely slowly.

Dan666 09-24-2013 03:51 PM

Just to update.
A failing hard drive was the reason. I removed the old HDDs in raid1 and put in an SSD. Now everything's perfect.
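
For anyone landing here with similar symptoms on software RAID, one quick check is whether the kernel already considers an array degraded. The sketch below parses text in the format of /proc/mdstat and reports arrays whose status string contains `_` (a missing member). The sample text is a made-up illustration, not output from the server in this thread, and the function name is hypothetical. A disk that is failing slowly can still show `[UU]`, so SMART data (for example via smartctl) and the kernel log are worth checking as well.

```python
import re

# Hypothetical /proc/mdstat content, for illustration only.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[0]
      1048512 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text: str) -> list:
    """Return names of md arrays whose status shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status looks like [UU] (healthy) or [U_] (one member missing).
        status = re.search(r"\[([U_]+)\]", line)
        if current and status:
            if "_" in status.group(1):
                degraded.append(current)
            current = None
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real machine the input would come from reading /proc/mdstat, for example `open("/proc/mdstat").read()`.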


All times are GMT -5.