Help | High waiting I/O (%wa)
I have a high %wa (I/O wait) reading in top, and I can't find any information on how to work out which processes are responsible for it. Anyway, this is my top:

top - 18:58:01 up 20:05, 1 user, load average: 15.21, 16.36, 15.72
Tasks: 306 total, 18 running, 288 sleeping, 0 stopped, 0 zombie
Cpu(s): 16.7%us, 2.8%sy, 0.0%ni, 43.8%id, 35.9%wa, 0.1%hi, 0.7%si, 0.0%st
Mem: 4138236k total, 3961636k used, 176600k free, 2248k buffers
Swap: 18808816k total, 11966424k used, 6842392k free, 127800k cached

  PID USER     PR NI  VIRT   RES  SHR S %CPU %MEM   TIME+ COMMAND
 1214 meron    15  0  275m   33m 2304 R 11.9  0.8  15:02.58 srcds_i686
22199 yuval    15  0  306m   69m 2648 S 10.9  1.7  19:45.31 srcds_i686
11597 benkia   15  0  233m   20m 1352 R  6.6  0.5   4:47.40 srcds_i686
  237 root     10 -5     0     0    0 R  5.6  0.0   7:18.12 kswapd0
 5882 dormaor  15  0  138m   26m 3068 S  4.3  0.7  23:15.54 hlds_i686
24112 oren258  15  0  129m   71m 3308 R  4.0  1.8   7:37.56 hlds_i686
13199 oz14789  15  0  153m   29m 3064 R  3.3  0.7  17:05.94 hlds_i686
 5916 dormaor  15  0  128m   15m 2952 S  2.7  0.4   8:25.90 hlds_i686
 1702 emi1121  15  0  126m   10m 2704 R  2.0  0.3  13:40.35 hlds_i686
 3505 devilhen 15  0  139m   11m 2688 S  2.0  0.3  13:30.50 hlds_i686
 4956 oren258  15  0 1534m   19m 2960 S  2.0  0.5   7:11.25 hlds_i686
 4034 liormich 18  0  142m   51m 3268 D  1.7  1.3  16:11.55 hlds_i686
 4241 adir2612 15  0 81064  5516 2232 S  1.0  0.1   7:24.12 hlds_i686
 4541 champion 15  0  219m  4896  900 S  1.0  0.1  37:08.12 srcds_i686
 9906 adir2612 15  0 94168  8596 2464 S  1.0  0.2  46:25.03 hlds_i686
12456 omri1213 15  0  117m  4828 2316 R  1.0  0.1  33:47.80 hlds_i686
13358 doron    15  0  110m  5092 2148 S  1.0  0.1   5:21.02 hlds_i686
 2900 afekio   15  0  215m  9644  920 S  0.7  0.2   7:26.85 srcds_i686
 3380 devilhen 15  0  111m   51m 2648 S  0.7  1.3   6:47.65 hlds_i686
 3402 devilhen 15  0  110m  3280 1408 R  0.7  0.1   5:21.14 hlds_i686
 3504 devilhen 15  0  106m  4288 1496 S  0.7  0.1   7:18.31 hlds_i686
 3653 oz14789  15  0 1529m  8500 1688 S  0.7  0.2   7:24.48 hlds_i686
 3673 oz14789  15  0  119m   17m 2392 S  0.7  0.4  13:03.49 hlds_i686
 6068 adamste  15  0 82660  3764 2400 R  0.7  0.1  10:30.22 hlds_i686
 6411 boomday  15  0  211m  9856  964 S  0.7  0.2   8:29.73 srcds_i686
 6425 vrstop   15  0  115m  5492 2332 S  0.7  0.1   7:58.03 hlds_i686
10093 adir2612 15  0 93356  4532 2592 R  0.7  0.1  10:35.20 hlds_i686
11642 wiliam   15  0  256m  6188 1148 S  0.7  0.1   9:06.76 srcds_i686
12041 adir2612 15  0 98868   40m 3220 S  0.7  1.0  18:28.87 hlds_i686
12266 xapystyl 15  0  231m  7064  964 S  0.7  0.2  32:08.49 srcds_i686
12958 faintly  15  0  127m  7776 2788 R  0.7  0.2   8:40.29 hlds_i686
14282 oz14789  15  0  123m   11m 1928 R  0.7  0.3   8:13.15 hlds_i686
20176 amx321   15  0 1616m  118m 1608 R  0.7  2.9   1:42.29 hlds_i686
27981 devilhen 15  0  146m  6248 1684 S  0.7  0.2  27:24.39 hlds_i686
  400 devilhen 15  0 84580  6280 1604 S  0.3  0.2   1:12.43 hlds_i686
 1118 oren258  15  0 1525m  459m 2872 R  0.3 11.4   0:26.85 hlds_i686
 3384 devilhen 15  0 99560  3256 1596 S  0.3  0.1   5:00.40 hlds_i686
 3458 devilhen 15  0 99.9m  3772 1552 R  0.3  0.1   5:11.83 hlds_i686
 3482 devilhen 15  0  136m   57m 2760 S  0.3  1.4  12:50.52 hlds_i686
 3503 devilhen 15  0 99.1m  3364 1576 S  0.3  0.1   5:09.31 hlds_i686
 3854 liormich 15  0  137m   10m 2396 S  0.3  0.3  18:27.05 hlds_i686
 3996 oren258  15  0  110m  3308 1536 R  0.3  0.1   3:59.27 hlds_i686
 5829 pingless 15  0 24780  2120  940 S  0.3  0.1   3:24.55 pingless4708
 6301 devilhen 15  0 84816  4232 1564 S  0.3  0.1   1:10.20 hlds_i686
 9046 devilhen 15  0 1492m  1.4g 3392 S  0.3 36.2   0:27.11 hlds_i686
10336 root     15  0  2328  1144  792 R  0.3  0.0   0:00.28 top
10538 mixer    15  0 20336  1880 1000 S  0.3  0.0   0:39.21 mixer5063
11345 oren258  15  0 1531m  904m 2240 S  0.3 22.4   2:20.06 hlds_i686
12145 khtcvnkl 15  0 24656  2092  908 S  0.3  0.1   2:50.79 khtcvnkll5015
12692 multiser 15  0 1496m  3524 1564 S  0.3  0.1   4:07.34 hlds_i686
12800 adir2612 15  0 65036  2576 1516 S  0.3  0.1   1:31.11 hlds_i686
15773 oz14789  15  0  152m   15m 1820 R  0.3  0.4  39:53.28 hlds_i686
19146 liormich 15  0 1521m  6356 1508 S  0.3  0.2   4:41.38 hlds_i686

I am running CentOS 5. When %wa gets pretty high, like 60%+, the server becomes laggy, so I need a way to check which processes are causing this. Thanks! |
I've always wanted to try this but have never gotten around to it... how about iotop? http://guichaz.free.fr/iotop/
|
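A sketch of how iotop might be run for this kind of hunt, if it can be installed (note: iotop relies on kernel per-task I/O accounting, which I believe arrived after 2.6.18, so the stock CentOS 5 kernel may not support it; the guard below is just so the snippet degrades cleanly where the tool is absent):

```shell
# Batch mode: -o lists only processes currently doing I/O, -b makes it
# non-interactive (loggable), -n 3 takes three samples. Needs root.
if command -v iotop >/dev/null 2>&1; then
    iotop -o -b -n 3
else
    echo "iotop not installed"
fi
```

Run as root, each sample shows per-process disk read/write rates, which is exactly the "who is doing the I/O" answer being asked for.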
You need to investigate the I/O profile; try the sysstat package. How much I/O is there, and how is it spread over how many disks? Where is the swap space, and is it competing with the general I/O on the same disks?
Lots to look at. |
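To make the sysstat suggestion concrete, a minimal invocation might look like this (assuming the package's standard `iostat` command; guarded in case sysstat is not installed):

```shell
# Extended per-device statistics, three samples at 5-second intervals.
# The columns to watch are await (average I/O service time in ms) and
# %util (how busy the device is); a device pinned near 100% util is
# the bottleneck.
if command -v iostat >/dev/null 2>&1; then
    iostat -x 5 3
else
    echo "sysstat not installed"
fi
```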
I have one disk, and the swap space, including the extra swap space I made (16GB), is on this disk. Is it possible that this is causing the problem?

-bash-3.2# df -h
Filesystem                      Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00 451G 102G  326G  24% /
/dev/sda1                        99M  63M   32M  67% /boot
tmpfs                           2.0G    0  2.0G   0% /dev/shm

I installed the sysstat package and ran iostat with it:

-bash-3.2# iostat
Linux 2.6.18-128.2.1.el5PAE (server6.pingless.co.il) 07/28/2009

avg-cpu: %user %nice %system %iowait %steal %idle
         22.96  0.03   12.66   20.23   0.00 44.13

Device:  tps    Blk_read/s Blk_wrtn/s Blk_read  Blk_wrtn
sda      29.16  469.11     294.23     37110052  23275716
sda1     0.00   0.03       0.00       2266      20
sda2     29.16  469.08     294.23     37107514  23275696
sdb      111.10 2535.26    2016.93    200558402 159554808
sdb1     111.10 2535.26    2016.93    200558130 159554808
dm-0     437.80 2642.44    2029.84    209037098 160575432
dm-1     80.40  361.88     281.33     28627784  22255072

I have no idea what it means! |
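A hedged reading of those iostat numbers: tps is transfers per second, and the Blk_ columns are 512-byte blocks (dm-0/dm-1 are LVM device-mapper volumes layered over the physical disks). Dividing the block rates by 2 converts them to KB/s; note that iostat actually lists a second physical disk, sdb, and it is carrying most of the traffic:

```shell
# Convert the whole-disk Blk_read/s and Blk_wrtn/s figures from the
# iostat output above (512-byte blocks) into KB/s per device.
awk '{ printf "%-4s read %7.0f KB/s  write %7.0f KB/s\n", $1, $2 / 2, $3 / 2 }' <<'EOF'
sda 469.11 294.23
sdb 2535.26 2016.93
EOF
```

So sdb averages roughly 1.2 MB/s read and 1 MB/s written since boot, several times what sda is doing.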
Help
Another log:
top - 00:48:47 up 1 day, 1:56, 1 user, load average: 23.51, 23.21, 22.03
Tasks: 324 total, 7 running, 317 sleeping, 0 stopped, 0 zombie
Cpu(s): 7.4%us, 2.6%sy, 0.0%ni, 30.2%id, 59.2%wa, 0.0%hi, 0.6%si, 0.0%st
Mem: 4138236k total, 3966932k used, 171304k free, 576k buffers
Swap: 18808816k total, 8176976k used, 10631840k free, 84920k cached

  PID USER     PR NI  VIRT   RES  SHR S %CPU %MEM   TIME+ COMMAND
 5882 dormaor  15  0  166m   32m 3012 S  5.3  0.8  44:42.34 hlds_i686
 3854 liormich 15  0  173m   32m 2888 R  4.9  0.8  40:07.08 hlds_i686
25275 liormich 15  0  108m   34m 3060 S  3.0  0.9   3:43.81 hlds_i686
 6155 adir2612 16  0 69516   57m 7148 D  2.0  1.4   0:01.80 hlds_i686
22102 devilhen 18  0 1532m  431m 3128 D  1.6 10.7   3:18.32 hlds_i686
 6787 yuval    15  0  274m  9464 1632 S  1.0  0.2   3:55.20 srcds_i686
 4541 champion 15  0  219m  4892  848 S  0.7  0.1  40:37.09 srcds_i686
 4956 oren258  17  0  138m   43m 3104 D  0.7  1.1  18:28.79 hlds_i686
 6726 root     15  0  2328  1116  756 R  0.7  0.0   0:00.23 top
 7396 oz14789  15  0  138m   25m 3412 S  0.7  0.6   4:58.71 hlds_i686
 9600 oren258  15  0 1535m  1.1g 4304 S  0.7 28.2   5:45.78 hlds_i686
 9906 adir2612 15  0 94168  6684 2832 S  0.7  0.2  54:35.23 hlds_i686
11642 wiliam   15  0  256m  7300 1072 S  0.7  0.2  14:42.54 srcds_i686
  400 devilhen 15  0 84580  4256 1584 S  0.3  0.1   4:05.25 hlds_i686
  530 liormich 15  0 1505m  1.4g 3380 S  0.3 35.1   0:42.87 hlds_i686
 1214 meron    15  0  284m  8888 1464 S  0.3  0.2  25:24.45 srcds_i686
 3380 devilhen 15  0  111m   40m 2448 S  0.3  1.0  10:13.51 hlds_i686
 3384 devilhen 15  0 99824  3072 1520 S  0.3  0.1   8:00.98 hlds_i686
 3402 devilhen 15  0  111m  5620 1896 S  0.3  0.1   8:12.89 hlds_i686
 3458 devilhen 15  0 99.9m  5564 1912 S  0.3  0.1   7:53.86 hlds_i686
 3653 oz14789  15  0 1539m  5988 1724 S  0.3  0.1  10:53.96 hlds_i686
 3837 liormich 15  0  116m  6044 1560 S  0.3  0.1   7:24.82 hlds_i686
 6064 adir2612 15  0  4488  1192  976 S  0.3  0.0   0:00.02 hlds_run
 6301 devilhen 15  0 84816  4216 1564 S  0.3  0.1   4:05.23 hlds_i686
 6411 boomday  15  0  211m  7576 1048 S  0.3  0.2  12:31.73 srcds_i686
 9223 adir2612 15  0 75352  9172 2628 S  0.3  0.2   3:40.23 hlds_i686
11673 oz14789  15  0  114m   21m 3252 S  0.3  0.5   3:09.01 hlds_i686
11834 xapystyl 15  0  222m   15m 1032 S  0.3  0.4   2:14.39 srcds_i686
12164 omri1213 15  0  119m   10m 2544 S  0.3  0.3   3:01.40 hlds_i686
12320 benkia   15  0  201m  8188 1332 S  0.3  0.2   0:45.23 srcds_i686
12692 multiser 15  0 1496m  3752 1632 S  0.3  0.1   6:05.41 hlds_i686
15773 oz14789  15  0  159m   12m 2148 S  0.3  0.3  53:18.03 hlds_i686
19080 khtcvnkl 15  0 77816  2732 1392 S  0.3  0.1   2:16.71 hlds_i686
22327 liormich 15  0  112m   52m 3196 R  0.3  1.3   5:16.42 hlds_i686
24117 devilhen 15  0  101m   10m 2508 S  0.3  0.3   1:38.63 hlds_i686
25834 faintly  15  0  146m  9928 2692 S  0.3  0.2   7:42.57 hlds_i686
27981 devilhen 15  0  156m   59m 2876 S  0.3  1.5  49:51.39 hlds_i686
30321 oz14789  15  0 89756  8868 3420 R  0.3  0.2   0:13.76 hlds_i686
31350 devilhen 15  0  117m   53m 2904 S  0.3  1.3   2:02.69 hlds_i686
    1 root     15  0  2064   496  468 S  0.0  0.0   0:02.07 init
    2 root     RT -5     0     0    0 S  0.0  0.0   0:00.02 migration/0
    3 root     34 19     0     0    0 S  0.0  0.0   0:00.01 ksoftirqd/0
    4 root     RT -5     0     0    0 S  0.0  0.0   0:00.00 watchdog/0
    5 root     RT -5     0     0    0 S  0.0  0.0   0:00.04 migration/1
    6 root     34 19     0     0    0 S  0.0  0.0   0:00.02 ksoftirqd/1
    7 root     RT -5     0     0    0 S  0.0  0.0   0:00.00 watchdog/1
    8 root     RT -5     0     0    0 S  0.0  0.0   0:00.03 migration/2
    9 root     34 19     0     0    0 S  0.0  0.0   0:00.15 ksoftirqd/2
   10 root     RT -5     0     0    0 S  0.0  0.0   0:00.00 watchdog/2
   11 root     RT -5     0     0    0 S  0.0  0.0   0:00.02 migration/3
   12 root     34 19     0     0    0 S  0.0  0.0   0:00.00 ksoftirqd/3
   13 root     RT -5     0     0    0 S  0.0  0.0   0:00.00 watchdog/3
   14 root     10 -5     0     0    0 S  0.0  0.0   0:00.08 events/0

It's like %wa spikes for a second while %cpu drops. Weird... |
You have posted several things showing that I/O wait is on the high side; no question about that. The trouble is that just posting another log with more evidence of high I/O wait won't get anyone closer to understanding the cause in more depth, so we can't do much more than sympathise.
The output from iotop would show which process is doing it, which would probably help (although it is a bit tedious to watch and needs some patience). The other approach is to start from the question 'is it swapping because something is using up too much memory?' ('is it swapping at all?' is a question inside that one, and it hasn't been answered yet). vmstat helps here (but ignore its first line of results), combined with top, or you could try ksysguard. If any process looks to be running out of control, then you have to ask why, as that can take you further.

BTW, your posts are hard to read because the column formatting is falling all over the place. If you use code tags, this doesn't usually happen. |
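A minimal sketch of the vmstat check suggested above (assuming procps vmstat; the awk column positions match its standard output, where si/so are columns 7 and 8, and the first data row, a since-boot average, is skipped):

```shell
# si/so are KB/s swapped in from and out to disk. Sustained non-zero
# values in the live samples mean the box is actively swapping.
if command -v vmstat >/dev/null 2>&1; then
    vmstat 1 3 | awk 'NR > 3 { print "swap-in:", $7, "KB/s   swap-out:", $8, "KB/s" }'
else
    echo "vmstat (procps) not installed"
fi
```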
Quote:
of RAM swapped out. The machine is memory starved, and the I/O wait is almost certainly caused by swapping/paging. Get more RAM (or put less stuff on the box). Cheers, Tink |
I don't think it has anything to do with the server load on the box. I rebooted it and ran all the servers, and it seems to be connected to the kswapd0 process: every time I see it in top, the server gets laggy and %wa climbs. I decreased the swap from 16GB to 8GB, thinking it might fix the problem, but no luck. I've seen a lot of reports on Google of kswapd0 using high CPU; I think that might be my problem. Any idea how to solve it?

I HAVE:
Linux version 2.6.18-128.2.1.el5PAE (mockbuild@builder16.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Tue Jul 14 07:15:01 EDT 2009
CentOS release 5.3 (Final)
2.6.18-128.2.1.el5PAE i686 |
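Worth noting for this step: kswapd0 is the kernel thread that reclaims memory, so it being busy is a symptom of memory pressure, not the cause, and resizing the swap area does not change how much gets swapped. How deep into swap the box is can be read straight from /proc/meminfo; a small Linux-only sketch (field names per proc(5)):

```shell
# Summarise RAM and swap usage from /proc/meminfo (values are in KB,
# converted to MB here). Heavy swap usage with kswapd0 busy usually
# means the working set simply does not fit in RAM.
awk '/^(MemTotal|MemFree|SwapTotal|SwapFree):/ { v[$1] = $2 }
     END {
         printf "RAM used : %d MB of %d MB\n", (v["MemTotal:"] - v["MemFree:"]) / 1024, v["MemTotal:"] / 1024
         printf "Swap used: %d MB of %d MB\n", (v["SwapTotal:"] - v["SwapFree:"]) / 1024, v["SwapTotal:"] / 1024
     }' /proc/meminfo
```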
Hello again...
Mate ... with 4GB of RAM and 8GB swapped out ... why do you think kswapd might be active and your I/O subsystem busy? Quote:
hungry processes. I mean ... with one process eating (almost) half your RAM by itself... Code:
 3653 oz14789  15  0 1539m  5988 1724 S  0.3  0.1  10:53.96 hlds_i686
Just not without it. Do another top, sorted by memory usage, and post it in code tags. Cheers, Tink |
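Tink's "sort by memory usage" can also be done non-interactively; a sketch using the standard procps ps columns (the `--sort` option assumes GNU procps, not a busybox ps):

```shell
# The fifteen largest resident-memory consumers, biggest first.
# RSS is the RAM a process actually occupies (KB); VSZ is the address
# space it has mapped, which can be far larger (top's VIRT).
ps -eo pid,user,rss,vsz,comm --sort=-rss | head -15
```

In interactive top the same view is one keystroke: Shift+M sorts the process list by %MEM.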
I am just posting here to follow this thread, as I have some similar I/O problems that I have been unable to fix over the years. The difference between my problem and what is discussed here is that my system randomly goes to 99% wait for less than a minute and then gets back to normal. I wanted to post a link to a recent thread or post of mine, but I can't find the thread; I have to go to work now.
good luck Ron |
Look at this log:
PHP Code:
So basically what you recommend is to add memory? I believe the CPU is fine, because the overload is more of a memory problem; am I correct? I am asking my supplier to increase the memory to 6GB, and I'll increase the swap to 12GB. Just give me your opinion.

And a question: what exactly does the VIRT column mean? How much memory the process has used in total, or how much it is using at the moment?

Update: I killed all the processes with a VIRT of ~1500m, and my memory now reads:
Mem: 4147960k total, 1345740k used, 2802220k free, 8716k buffers
Buffers were freed, there is more free memory, and %wa has stabilized:
Cpu(s): 18.6%us, 1.4%sy, 0.0%ni, 77.2%id, 2.5%wa, 0.0%hi, 0.2%si, 0.0%st
So I can say the memory consumers were the cause. I might not need to add memory after all. Still want your opinion. THANKS |
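On the VIRT question: VIRT is the total virtual address space a process has mapped (its own memory, pages currently swapped out, mapped files, and allocations it has never touched), while RES is what actually sits in physical RAM at this moment; that is why a process can show ~1500m VIRT with only a few MB RES. A small sketch reading the same two figures from /proc for the current shell:

```shell
# VmSize corresponds to top's VIRT (mapped address space) and VmRSS to
# top's RES (resident in RAM right now); $$ is this shell's own PID.
grep -E '^(VmSize|VmRSS):' /proc/$$/status
```

VmRSS will normally be much smaller than VmSize, for the reason above.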
http://ubuntuforums.org/showthread.php?t=1221176
I found the other thread, but no fix; not sure if I should create a new thread for this here. |