LinuxQuestions.org
Old 11-13-2010, 01:23 PM   #1
masterdead
LQ Newbie
 
Registered: Nov 2010
Posts: 8

High load per hour


Hi. Yesterday I installed Debian 5 on the server (before that it ran Ubuntu 10.10 with the same problem: http://ubuntuforums.org/showthread.php?t=1478946&page=1 ). Once an hour the load rises from 0.20-0.30 to 50-100, then drops back after 2-3 minutes. I have tried kernels 2.6.26-amd64 and 2.6.32-amd64 (backports). In htop all cores are in iowait, and iostat shows high utilization on the system hard disk.

Hardware: Intel Core i7, WD Raptor, 8 GB RAM. It is a LAMP server.
 
Old 11-13-2010, 03:34 PM   #2
stress_junkie
Senior Member
 
Registered: Dec 2005
Location: Massachusetts, USA
Distribution: Ubuntu 10.04 and CentOS 5.5
Posts: 3,873

Quote:
Originally Posted by masterdead
Hi. Yesterday I installed Debian 5 on the server (before that it ran Ubuntu 10.10 with the same problem
So the problem remains even when new software is installed. Of course the new software may have the same kernel so it is not completely new software BUT this suggests that the problem is with hardware.

Quote:
Originally Posted by masterdead
Once an hour the load rises from 0.20-0.30 to 50-100, then drops back after 2-3 minutes.
I don't understand the 50-100 part. I've never seen a workload report go much above 2 as reported in the uptime utility.

Quote:
Originally Posted by masterdead
In htop all cores are in iowait, and iostat shows high utilization on the system hard disk.
iowait means that a hardware device isn't responding fast enough. Either you are putting more workload on the disk than it can handle OR the disk is starting to fail.

If the disk drive is okay then you need to add more disk drives and spread the workload between them, probably using RAID 0. If you have a really high workload, and your post suggests that you don't, but if you did, then 100% hardware RAID would be the way to go. Not the cheap fake RAID cards but one of the very expensive hardware RAID cards.

It is more likely that the disk drive is starting to fail so you need to replace the disk drive. You can use a disk drive testing package to see if the disk drive is failing.
http://www.hitachigst.com/support/downloads
or
http://support.wdc.com/product/download.asp?lang=en

Quote:
Originally Posted by masterdead
Hardware: Intel Core i7, WD Raptor, 8 GB RAM. It is a LAMP server.
WD disks stink. Use Seagate Barracuda or Maxtor Atlas.

Last edited by stress_junkie; 11-13-2010 at 03:49 PM.
 
Old 11-13-2010, 04:53 PM   #3
masterdead
LQ Newbie
 
Registered: Nov 2010
Posts: 8

Original Poster
The server worked correctly on Ubuntu 9.10. In April I upgraded to Ubuntu 10.04, and that is when this problem appeared. I tried 10.10, but it is still the same.

[Attached image: load-day.png]
 
Old 11-13-2010, 05:03 PM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,392

I wouldn't be too concerned about the disks yet. Check cron to see what is being scheduled at those times.
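
When checking cron it helps to look beyond `crontab -l`, since jobs can also live in the system-wide files. A minimal sketch, assuming a Debian/Ubuntu-style layout (/etc/crontab, /etc/cron.d, /etc/cron.hourly, /var/spool/cron/crontabs). The function name `list_cron_sources` and its optional prefix argument are made up for illustration:

```shell
# Print every cron file that exists under the usual Debian/Ubuntu locations.
# $1 is an optional path prefix (hypothetical, e.g. a chroot); empty means
# the live system.
list_cron_sources() {
    root=${1:-}
    for f in "$root/etc/crontab" \
             "$root"/etc/cron.d/* \
             "$root"/etc/cron.hourly/* \
             "$root"/var/spool/cron/crontabs/*; do
        # Unmatched globs stay literal and are skipped by the -f test.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}
```

Run it as root so the per-user crontabs are readable, for example `for f in $(list_cron_sources); do echo "== $f"; cat "$f"; done`, and compare the schedule lines with the minute at which the load spikes.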
 
Old 11-14-2010, 06:16 AM   #5
masterdead
LQ Newbie
 
Registered: Nov 2010
Posts: 8

Original Poster
[Attached image: load-day.png]
Cron is clean; only apache2, PHP and MySQL are installed on the server.
How can I find out what is overloading it?
Once an hour the load rises to 50. iotop shows apache and mysql, but the CPU isn't overloaded; only the system load is too high.

Any ideas?
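
On Linux the load average counts tasks in uninterruptible sleep (state D, usually waiting on disk I/O) as well as runnable tasks, which is how the load can reach 50 while the CPUs sit mostly idle. A rough sketch for comparing the two groups during a spike. `count_states` is a hypothetical helper name, and the usage lines assume a `ps` that accepts `-e -o state=`:

```shell
# Tally process states read from stdin, one state per line (e.g. "R", "D", "Ss").
# Only the first character of each state is considered.
count_states() {
    awk '{ c[substr($1, 1, 1)]++ }
         END { printf "R=%d D=%d\n", c["R"], c["D"] }'
}

# Usage: compare the tally with the 1-minute load average.
#   ps -e -o state= | count_states
#   cut -d' ' -f1 /proc/loadavg
```

If D is large and R is small while the load is high, the load is coming from tasks blocked on I/O rather than from CPU work.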
 
Old 11-14-2010, 08:28 AM   #6
stress_junkie
Senior Member
 
Registered: Dec 2005
Location: Massachusetts, USA
Distribution: Ubuntu 10.04 and CentOS 5.5
Posts: 3,873

Quote:
Originally Posted by masterdead
Cron is clean; only apache2, PHP and MySQL are installed on the server.
How can I find out what is overloading it?
Once an hour the load rises to 50. iotop shows apache and mysql, but the CPU isn't overloaded; only the system load is too high.

Any ideas?
That looks completely different than I had imagined. I would say you don't have a problem. Relax. Be happy. Don't worry. Maybe the system is running updatedb or something similar. Why bother worrying about it? It's definitely NOT overloaded. (Of course that's a judgment call but I think most people would agree with me.) To me an overloaded system is one that cannot adequately service the workload that it is asked to perform. There is no indication that your system has too much work to perform adequately.

If the system is running updatedb then you can expect a little bit of iowait. The updatedb is scanning the entire file system as fast as it can. The same thing would be true if you were performing some kind of backups (tar, rsync, ...). Some automatic security software may be checking file permissions. I think SELinux and/or AppArmor have a scheduled file system check. I used to play with that when I was learning about system hardening.

Last edited by stress_junkie; 11-14-2010 at 08:40 AM.
 
Old 11-14-2010, 01:27 PM   #7
masterdead
LQ Newbie
 
Registered: Nov 2010
Posts: 8

Original Poster
This is the atop output while the load is rising:
Code:
ATOP - homer              2010/11/14  19:45:29               10 seconds elapsed
PRC | sys   0.23s | user   1.58s | #proc    635 | #zombie    0 | #exit     53 |
CPU | sys      5% | user     18% | irq       1% | idle    405% | wait    371% |
cpu | sys      2% | user      5% | irq       1% | idle     61% | cpu002 w 30% |
cpu | sys      1% | user      8% | irq       0% | idle     16% | cpu000 w 75% |
cpu | sys      0% | user      1% | irq       0% | idle     99% | cpu007 w  0% |
cpu | sys      0% | user      0% | irq       0% | idle     99% | cpu006 w  0% |
cpu | sys      0% | user      2% | irq       0% | idle     87% | cpu001 w 11% |
cpu | sys      1% | user      1% | irq       0% | idle      0% | cpu003 w 98% |
cpu | sys      0% | user      0% | irq       0% | idle      0% | cpu005 w 99% |
cpu | sys      0% | user      0% | irq       0% | idle     47% | cpu004 w 53% |
CPL | avg1   8.83 | avg5    5.66 | avg15   2.41 | csw    14249 | intr   18474 |
MEM | tot    7.8G | free    3.0G | cache   2.9G | buff  261.4M | slab  197.6M |
SWP | tot    5.7G | free    5.7G |              | vmcom   7.3G | vmlim   9.6G |
DSK |         sda | busy    100% | read       6 | write      4 | avio 1000 ms |
DSK |         sdc | busy      0% | read       0 | write      5 | avio    5 ms |
NET | transport   | tcpi    5282 | tcpo    3616 | udpi       0 | udpo       0 |
NET | network     | ipi     5340 | ipo     3761 | ipfrw      0 | deliv   5291 |
NET | eth0     8% | pcki    5458 | pcko    7777 | si  994 Kbps | so 8011 Kbps |
The system disk sda is 100% busy, but which process is using it?
 
Old 11-14-2010, 02:29 PM   #8
stress_junkie
Senior Member
 
Registered: Dec 2005
Location: Massachusetts, USA
Distribution: Ubuntu 10.04 and CentOS 5.5
Posts: 3,873

If you press d while atop is running then you should be able to see which process is creating the disk activity.
Code:
man atop
 
Old 11-15-2010, 03:37 AM   #9
masterdead
LQ Newbie
 
Registered: Nov 2010
Posts: 8

Original Poster
While the load is rising, atop shows mysql and apache using the disk at 100%.
The MySQL process list (SHOW PROCESSLIST) shows more and more sessions in lock and system lock states.
Does this reflect what is actually happening?

During a spike I killed the MySQL server, but the load kept rising! So it is not MySQL, not Apache, and not the cron service.

Last edited by masterdead; 11-15-2010 at 03:54 AM.
 
Old 11-15-2010, 02:28 PM   #10
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,392

Run this when the load is (very) high
Code:
top -b -n 1 | awk '{if (NR <=7) print; else if ($8 == "D") {print; count++} } END {print "Total status D: "count}'
 
Old 11-16-2010, 01:17 AM   #11
masterdead
LQ Newbie
 
Registered: Nov 2010
Posts: 8

Original Poster
Code:
top - 08:14:56 up 2 days, 13:04,  4 users,  load average: 11.72, 3.71, 1.34
Tasks: 311 total,   1 running, 310 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.1%us,  0.3%sy,  0.1%ni, 96.4%id,  2.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   8188432k total,  6428040k used,  1760392k free,   438044k buffers
Swap:  5992204k total,        0k used,  5992204k free,  4693016k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  702 root      20   0     0    0    0 D    0  0.0   0:12.38 kjournald
 1583 root      20   0     0    0    0 D    0  0.0   0:08.02 flush-8:0

top - 08:16:40 up 2 days, 13:05,  4 users,  load average: 64.40, 23.81, 8.75
Tasks: 352 total,   7 running, 344 sleeping,   0 stopped,   1 zombie
Cpu(s):  1.1%us,  0.3%sy,  0.1%ni, 96.4%id,  2.1%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   8188432k total,  6514480k used,  1673952k free,   438052k buffers
Swap:  5992204k total,        0k used,  5992204k free,  4693108k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
Total status D:
 
Old 11-16-2010, 02:50 AM   #12
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,392

Stick it in a loop, and append to a log file. Can't explain that.
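
Such a loop could look something like the sketch below. It is only one way to do it: it swaps `top` for `ps` so the filter does not depend on top's column layout, and the function names, log path and interval are assumptions rather than anything posted in this thread. Using `printf "%d"` makes the total print 0 when nothing matches, instead of the empty value awk prints for a counter that was never set.

```shell
# Keep only tasks in uninterruptible sleep (state D) and print a total.
# Expects ps-style lines on stdin with the state in the first column.
filter_d_state() {
    awk '$1 ~ /^D/ { print; n++ }
         END { printf "Total status D: %d\n", n }'
}

# One timestamped snapshot: load averages plus all D-state tasks.
snapshot_d_state() {
    printf '=== %s  load: %s\n' "$(date '+%F %T')" \
        "$(cut -d' ' -f1-3 /proc/loadavg)"
    ps -e -o state= -o pid= -o user= -o wchan= -o args= | filter_d_state
}

# Append a snapshot to a log file every few seconds until interrupted.
# Both defaults are hypothetical; pick a path on a disk other than the busy one.
log_d_state() {
    logfile=${1:-/var/tmp/dstate.log}
    interval=${2:-5}
    while :; do
        snapshot_d_state >> "$logfile"
        sleep "$interval"
    done
}
```

Start it with something like `log_d_state /var/tmp/dstate.log 5 &` shortly before the hourly spike, then read the log afterwards to see which tasks were blocked and in which kernel function (the wchan column).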
 
Old 11-17-2010, 04:31 AM   #13
masterdead
LQ Newbie
 
Registered: Nov 2010
Posts: 8

Original Poster
Code:
Nov 16 23:03:18 hojko kernel: [ 3833.584580] INFO: task mysqld:25098 blocked for more than 120 seconds.
Nov 16 23:03:18 hojko kernel: [ 3833.584639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 16 23:03:18 hojko kernel: [ 3833.584695] mysqld        D ffff880008d95780     0 25098   1920 0x00000000
Nov 16 23:03:18 hojko kernel: [ 3833.584698]  ffff8801ec03bf90 0000000000000086 0000000000003e00 00000000000044d0
Nov 16 23:03:18 hojko kernel: [ 3833.584701]  ffffffff810c7124 000000000000f9e0 ffff8801ee17ffd8 0000000000015780
Nov 16 23:03:18 hojko kernel: [ 3833.584703]  0000000000015780 ffff88021c66d4c0 ffff88021c66d7b8 0000000600000000
Nov 16 23:03:18 hojko kernel: [ 3833.584706] Call Trace:
Nov 16 23:03:18 hojko kernel: [ 3833.584712]  [<ffffffff810c7124>] ? zone_statistics+0x3c/0x5d
Nov 16 23:03:18 hojko kernel: [ 3833.584716]  [<ffffffff81016581>] ? read_tsc+0xa/0x20
Nov 16 23:03:18 hojko kernel: [ 3833.584719]  [<ffffffff8110d986>] ? sync_buffer+0x0/0x40
Nov 16 23:03:18 hojko kernel: [ 3833.584722]  [<ffffffff812f9dc8>] ? io_schedule+0x73/0xb7
Nov 16 23:03:18 hojko kernel: [ 3833.584724]  [<ffffffff8110d9c1>] ? sync_buffer+0x3b/0x40
Nov 16 23:03:18 hojko kernel: [ 3833.584726]  [<ffffffff812fa1d8>] ? __wait_on_bit_lock+0x3f/0x84
Nov 16 23:03:18 hojko kernel: [ 3833.584728]  [<ffffffff8110d986>] ? sync_buffer+0x0/0x40
Nov 16 23:03:18 hojko kernel: [ 3833.584730]  [<ffffffff812fa288>] ? out_of_line_wait_on_bit_lock+0x6b/0x77
Nov 16 23:03:18 hojko kernel: [ 3833.584733]  [<ffffffff81064af0>] ? wake_bit_function+0x0/0x23
Nov 16 23:03:18 hojko kernel: [ 3833.584735]  [<ffffffff8110ddb3>] ? sync_dirty_buffer+0x29/0x93
Nov 16 23:03:18 hojko kernel: [ 3833.584741]  [<ffffffffa00d8e00>] ? journal_dirty_data+0xd1/0x1b0 [jbd]
Nov 16 23:03:18 hojko kernel: [ 3833.584745]  [<ffffffffa00eef1f>] ? ext3_journal_dirty_data+0xf/0x34 [ext3]
Nov 16 23:03:18 hojko kernel: [ 3833.584749]  [<ffffffffa00ed3f9>] ? walk_page_buffers+0x65/0x8b [ext3]
Nov 16 23:03:18 hojko kernel: [ 3833.584752]  [<ffffffffa00eef44>] ? journal_dirty_data_fn+0x0/0x13 [ext3]
Nov 16 23:03:18 hojko kernel: [ 3833.584756]  [<ffffffffa00f0a42>] ? ext3_ordered_write_end+0x73/0x10f [ext3]
Nov 16 23:03:18 hojko kernel: [ 3833.584760]  [<ffffffff810b4901>] ? generic_file_buffered_write+0x18d/0x278
Nov 16 23:03:18 hojko kernel: [ 3833.584763]  [<ffffffff810b4d9d>] ? __generic_file_aio_write+0x25f/0x293
Nov 16 23:03:18 hojko kernel: [ 3833.584766]  [<ffffffff8123fb5c>] ? sock_aio_write+0xb1/0xbc
Nov 16 23:03:18 hojko kernel: [ 3833.584769]  [<ffffffff810b4e2a>] ? generic_file_aio_write+0x59/0x9f
Nov 16 23:03:18 hojko kernel: [ 3833.584772]  [<ffffffff810ee3a6>] ? do_sync_write+0xce/0x113
Nov 16 23:03:18 hojko kernel: [ 3833.584775]  [<ffffffff81102e15>] ? mntput_no_expire+0x23/0xee
Nov 16 23:03:18 hojko kernel: [ 3833.584777]  [<ffffffff81064ac2>] ? autoremove_wake_function+0x0/0x2e
Nov 16 23:03:18 hojko kernel: [ 3833.584780]  [<ffffffff810eed1e>] ? vfs_write+0xa9/0x102
Nov 16 23:03:18 hojko kernel: [ 3833.584782]  [<ffffffff810eedce>] ? sys_pwrite64+0x57/0x77
Nov 16 23:03:18 hojko kernel: [ 3833.584784]  [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
Nov 16 23:03:18 hojko kernel: [ 3833.584787] INFO: task mysqld:25409 blocked for more than 120 seconds.
Nov 16 23:03:18 hojko kernel: [ 3833.584819] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 16 23:03:18 hojko kernel: [ 3833.584868] mysqld        D ffff880008d95780     0 25409   1920 0x00000000
Nov 16 23:03:18 hojko kernel: [ 3833.584870]  ffff88021c66aa60 0000000000000086 0000000000002640 00000000000044d0
Nov 16 23:03:18 hojko kernel: [ 3833.584873]  ffffffff810c7124 000000000000f9e0 ffff8801f4afffd8 0000000000015780
Nov 16 23:03:18 hojko kernel: [ 3833.584875]  0000000000015780 ffff8801ede8b880 ffff8801ede8bb78 0000000600000000
Nov 16 23:03:18 hojko kernel: [ 3833.584877] Call Trace:
Nov 16 23:03:18 hojko kernel: [ 3833.584879]  [<ffffffff810c7124>] ? zone_statistics+0x3c/0x5d
Nov 16 23:03:18 hojko kernel: [ 3833.584882]  [<ffffffff81016581>] ? read_tsc+0xa/0x20
Nov 16 23:03:18 hojko kernel: [ 3833.584884]  [<ffffffff8110d986>] ? sync_buffer+0x0/0x40
Nov 16 23:03:18 hojko kernel: [ 3833.584886]  [<ffffffff812f9dc8>] ? io_schedule+0x73/0xb7
Nov 16 23:03:18 hojko kernel: [ 3833.584888]  [<ffffffff8110d9c1>] ? sync_buffer+0x3b/0x40
Nov 16 23:03:18 hojko kernel: [ 3833.584890]  [<ffffffff812fa1d8>] ? __wait_on_bit_lock+0x3f/0x84
Nov 16 23:03:18 hojko kernel: [ 3833.584892]  [<ffffffff8110d986>] ? sync_buffer+0x0/0x40
Nov 16 23:03:18 hojko kernel: [ 3833.584894]  [<ffffffff812fa288>] ? out_of_line_wait_on_bit_lock+0x6b/0x77
Nov 16 23:03:18 hojko kernel: [ 3833.584896]  [<ffffffff81064af0>] ? wake_bit_function+0x0/0x23
Nov 16 23:03:18 hojko kernel: [ 3833.584898]  [<ffffffff8110ddb3>] ? sync_dirty_buffer+0x29/0x93
Nov 16 23:03:18 hojko kernel: [ 3833.584901]  [<ffffffffa00d8e00>] ? journal_dirty_data+0xd1/0x1b0 [jbd]
Nov 16 23:03:18 hojko kernel: [ 3833.584905]  [<ffffffffa00eef1f>] ? ext3_journal_dirty_data+0xf/0x34 [ext3]
Nov 16 23:03:18 hojko kernel: [ 3833.584908]  [<ffffffffa00ed3f9>] ? walk_page_buffers+0x65/0x8b [ext3]
Nov 16 23:03:18 hojko kernel: [ 3833.584912]  [<ffffffffa00eef44>] ? journal_dirty_data_fn+0x0/0x13 [ext3]
Nov 16 23:03:18 hojko kernel: [ 3833.584915]  [<ffffffffa00f0a42>] ? ext3_ordered_write_end+0x73/0x10f [ext3]
Nov 16 23:03:18 hojko kernel: [ 3833.584918]  [<ffffffff810b4901>] ? generic_file_buffered_write+0x18d/0x278
Nov 16 23:03:18 hojko kernel: [ 3833.584921]  [<ffffffff810b4d9d>] ? __generic_file_aio_write+0x25f/0x293
Nov 16 23:03:18 hojko kernel: [ 3833.584925]  [<ffffffff81071774>] ? wake_futex+0x31/0x4e
Nov 16 23:03:18 hojko kernel: [ 3833.584927]  [<ffffffff810b4e2a>] ? generic_file_aio_write+0x59/0x9f
Nov 16 23:03:18 hojko kernel: [ 3833.584929]  [<ffffffff810ee3a6>] ? do_sync_write+0xce/0x113
Nov 16 23:03:18 hojko kernel: [ 3833.584932]  [<ffffffff8100f5e7>] ? __switch_to+0xd0/0x297
Nov 16 23:03:18 hojko kernel: [ 3833.584934]  [<ffffffff81064ac2>] ? autoremove_wake_function+0x0/0x2e
Nov 16 23:03:18 hojko kernel: [ 3833.584938]  [<ffffffff81047ee7>] ? finish_task_switch+0x3a/0xaf
Nov 16 23:03:18 hojko kernel: [ 3833.584940]  [<ffffffff810eed1e>] ? vfs_write+0xa9/0x102
Nov 16 23:03:18 hojko kernel: [ 3833.584942]  [<ffffffff810eedce>] ? sys_pwrite64+0x57/0x77
Nov 16 23:03:18 hojko kernel: [ 3833.584944]  [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
Nov 17 00:01:33 hojko kernel: [ 7321.590154] elevator: type noop [anticipat not found
Nov 17 00:01:40 hojko kernel: [ 7328.576287] elevator: type noop [anticipat not found
Nov 17 00:01:47 hojko kernel: [ 7336.114344] elevator: type noop [anticipat not found
Nov 17 00:02:05 hojko kernel: [ 7353.954835] elevator: type noop [anticipat not found
Nov 17 00:23:25 hojko kernel: [ 8630.870323] elevator: type cfg not found
Nov 17 00:23:31 hojko kernel: [ 8637.425223] elevator: type cfg not found
Nov 17 00:24:13 hojko kernel: [ 8678.591840] elevator: type cfg not found
Nov 17 00:27:33 hojko kernel: [ 8878.431323] CE: hpet increasing min_delta_ns to 15000 nsec
Nov 17 01:55:29 hojko kernel: [14144.121091] CE: hpet increasing min_delta_ns to 22500 nsec
I tried changing the block I/O scheduler, but it had no effect.
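
The `elevator: type ... not found` lines in the log suggest that these writes to the scheduler file were rejected: `cfg` is not a scheduler name (possibly `cfq` was meant), and `noop [anticipat` looks like the whole contents of the file being written back instead of a single name. If so, the scheduler may never have actually changed. A small sketch for checking before writing, assuming the usual sysfs format in which the file lists the available schedulers with the active one in brackets, e.g. `noop anticipatory deadline [cfq]`. Both helper names are hypothetical:

```shell
# Print the active scheduler from a line such as
# "noop anticipatory deadline [cfq]" read on stdin.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Succeed only if name $2 appears in the scheduler list given as $1.
scheduler_available() {
    for s in $(printf '%s\n' "$1" | sed 's/[][]//g'); do
        if [ "$s" = "$2" ]; then
            return 0
        fi
    done
    return 1
}

# Usage (as root), assuming the disk is sda:
#   f=/sys/block/sda/queue/scheduler
#   line=$(cat "$f")
#   printf '%s\n' "$line" | active_scheduler
#   scheduler_available "$line" deadline && echo deadline > "$f"
#   cat "$f"    # confirm that the brackets moved
```

Writing exactly one name and then re-reading the file is the quickest way to confirm the change took effect.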
 
Old 11-19-2010, 05:38 AM   #14
masterdead
LQ Newbie
 
Registered: Nov 2010
Posts: 8

Original Poster
Does anyone have an idea how to fix this?
 
  


All times are GMT -5.
