Disk I/O Bottleneck. Need tuning advice
I wrote a web app that writes a lot of data to disk. I recently observed the system intermittently hiccupping, blocking all Apache processes for a short (though noticeable) period of time.
I'm running Debian sarge on a 2.4 kernel. I have virtually no experience tuning disk I/O on Linux, and I'm not sure whether elvtune or tweaking the /proc/sys options can help. The other two obvious fixes are 1. better hardware and 2. looking for code optimizations. Any advice would be *greatly* appreciated. Below is vmstat output showing the hiccup occurring every 3-4 seconds (when bo is 1000+). Code:
vmstat 1
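The vmstat output itself didn't survive the paste, but the bursts are easy to spot by filtering on the "bo" column. A minimal sketch; the field number below assumes the classic vmstat layout where "bo" is field 10, so check it against the header line your vmstat actually prints:

```shell
# Flag vmstat samples whose "bo" (blocks written out) column exceeds
# a threshold. "bo" is field 10 in the classic layout; verify against
# the header line printed by your vmstat version.
vmstat 1 | awk -v t=1000 '
    NR <= 2 { next }              # skip the two header lines
    $10 > t { print "burst:", $0 }
'
```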
My observations:
Each occurrence of bo greater than 1000 is followed immediately by another one. In each pair, the first sample also shows a large number of blocked processes (second column, heading "b"). Whenever many processes are blocked, the run queue (first column, heading "r") is zero, except for one sample showing twenty processes in the run queue, which is a lot. In every case the I/O wait is zero (rightmost column, heading "wa"), no swapping is occurring, and the CPU is idle much of the time.

Having more than one or two processes blocked is very unusual, and so is having more than one or two in the run queue. So I am wondering whether your web application spawns many child processes. If it spawned a lot of children that were all contending for access to a single resource, that would explain why so many processes are being blocked. That may be enough information to figure out the cause: you either have to reduce the number of processes accessing the disk or file simultaneously, or spread the I/O over more disks. The fact that I/O wait is zero makes me think more disks won't help. Maybe the resource is a log file or a data file that many processes are trying to read or write simultaneously. Often when many processes have to share a single file, a controller process manages that access; you may have to redesign your application, for example by putting the shared file behind a database server.

If I were you I would install the sar utility and run the sar data collector (sadc) every ten minutes. It produces binary files of resource usage which you can then read, and there is a wonderful application called KSar that makes graphs of sar data files.
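If the contended resource does turn out to be a single log or data file, one stopgap short of a full database server is to serialize the writers explicitly. A hedged sketch using flock(1) from util-linux; the file names here are made up:

```shell
# Serialize appends to a shared log file across many processes.
# shared.log and shared.log.lock are hypothetical names; flock(1)
# makes each writer block until it holds an exclusive lock on fd 9.
(
    flock -x 9                        # wait for the exclusive lock
    echo "one atomic log entry" >> shared.log
) 9>> shared.log.lock
```

Note this trades blocked-on-I/O for blocked-on-lock: it smooths the bursts and keeps entries intact, but it doesn't reduce the total I/O.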
You would then be able to see which resources are depleted when bo is greater than 1000 and more than two processes are blocked. The sar utility comes in the sysstat package. KSar can be found at http://sourceforge.net/search/?type_...oft&words=ksar and more information about sadc is available as a man page once the sysstat package is installed. You run the sadc utility via cron.
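To make that concrete, the collection setup might look like the following. The paths assume Debian's sysstat package (sa1 is the stock wrapper around sadc); verify them on your install:

```shell
# A sketch of the suggested sar setup, assuming Debian's sysstat package.
# Collection: add to /etc/cron.d/sysstat, one sample every ten minutes:
#   */10 * * * * root /usr/lib/sysstat/sa1 1 1
# Reading back today's binary data file:
sar -b      # I/O and transfer-rate statistics
sar -q      # run-queue length and load averages
# KSar can graph the same binary files (under /var/log/sysstat/ on Debian).
```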
I don't know 2.4 at all, but that I/O profile looks like the standard five-second write-back cycle. You just have too much I/O to get done in a second.
In 2.6 you could perhaps pick another I/O scheduler, and/or reduce the five-second lag (have a look at /proc/sys/vm/dirty_expire_centisecs). Less I/O would be the best objective. Next would be a better spread of I/O, which probably means more disks, on more, separate paths. You need to get those I/Os completed faster and more consistently.

That "b" column isn't really traditional blocked processes; it's processes in uninterruptible sleep, waiting on (physical) I/O in this case. That qualifies as "blocked", but not in the usual semaphore/mutex programming sense. Your processes stall at the five-second boundary because the physical I/O hasn't signalled completion. Have a look at top, reverse-sorted on process status, and I'll bet you'll see all those guys go to status "D" at the stall point(s).
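For reference, here is where those 2.6 knobs live. The device name "hda" is an assumption, and note that the five-second wakeup itself is dirty_writeback_centisecs; dirty_expire_centisecs controls how old dirty data may get before it must be flushed:

```shell
# Where the 2.6 tunables mentioned above live ("hda" is an assumption).
cat /proc/sys/vm/dirty_expire_centisecs        # max age of dirty data before flush
cat /proc/sys/vm/dirty_writeback_centisecs     # writeback wakeup interval (500 = 5 s)
echo 1000 > /proc/sys/vm/dirty_expire_centisecs    # flush dirty pages sooner (needs root)
cat /sys/block/hda/queue/scheduler             # current scheduler shown in brackets
echo deadline > /sys/block/hda/queue/scheduler
# Confirm the stall: list processes in uninterruptible sleep ("D").
ps -eo state,pid,comm | awk '$1 ~ /^D/ { print }'
```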