sar disk activity question - await svctm
Hi,
after two days of searching I could not find an answer on how to interpret the following figures. The output is from a server running Red Hat Enterprise Linux Server release 5.4 (64 bit). It hosts an Oracle database that runs big jobs at night, using lots of CPU and I/O resources. This is part of a "sar -d" output I'd like to understand in order to draw a conclusion about I/O saturation: Code:
As one can see, there are high await times (service time + queue time) compared to low svctm times (just the device service time). I'd like to understand what makes the I/Os wait in the queue for so long. The svctm time is on average under 1 ms, which should give the device the capability of doing roughly 1000 I/Os per second. So what could be the reason that so many I/Os are queued for such a long time? Any ideas welcome! best regards guenter |
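For anyone following along, the relationship between the sar -d columns can be checked numerically. The sketch below uses made-up values (the actual sar output from this machine is not reproduced in the thread) to show how the queue wait falls out of await minus svctm, and how Little's law ties tps and await to the average queue length:

```python
# Illustrative "sar -d" style figures. These are MADE-UP values, since the
# original output is not reproduced here:
# tps = transfers per second, await/svctm in milliseconds.
samples = [
    {"tps": 800.0, "await": 45.0, "svctm": 0.9},
    {"tps": 950.0, "await": 120.0, "svctm": 0.7},
]

for s in samples:
    queue_wait = s["await"] - s["svctm"]           # time spent queued (ms)
    avg_queue_len = s["tps"] * s["await"] / 1000   # Little's law: L = lambda * W
    utilisation = s["tps"] * s["svctm"] / 1000     # fraction of time device is busy
    print(f"tps={s['tps']:.0f}  queue_wait={queue_wait:.1f} ms  "
          f"avgqu-sz~{avg_queue_len:.1f}  util~{utilisation:.0%}")
```

Note that with these example numbers the device itself is well under 100% busy, yet requests still spend tens of milliseconds queued: the queue length comes from the arrival pattern, not from the device being slow.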
Quote:
Again, the best/easiest way to figure out performance problems on an Oracle/Red Hat system is to first contact Oracle support...they will run traces/dumps, and give you information as to where the bottleneck may be. From there, you can either fix it (if it's an Oracle/query issue), or take that info to Red Hat for help with how to spread things out and/or optimize your system. |
Hi,
the database was just background information about why there are so many I/Os. I'd like to understand why an I/O request can remain in the queue for so long when the disk subsystem seems pretty fast (see the values of await and svctm). This is not related to the database or Oracle-specific products. I suspect it's related to interrupts, but I'm still investigating. best regards |
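One way to see how a fast device can still produce long queue waits is a toy FIFO simulation (a sketch only, not tied to the actual workload; the 0.9 ms service time is a made-up figure in the spirit of the svctm discussed above). If 100 requests arrive in one burst, each one waits behind all the earlier ones even though the device handles each in under a millisecond; spread the same requests out and the waits vanish:

```python
def simulate(arrivals_ms, svctm_ms):
    """Single FIFO queue, single server: return per-request wait times in ms."""
    free_at = 0.0
    waits = []
    for t in arrivals_ms:
        start = max(t, free_at)   # request starts when the device is free
        waits.append(start - t)   # queue wait = start time - arrival time
        free_at = start + svctm_ms
    return waits

SVCTM = 0.9  # hypothetical per-request device service time, ms

# 100 requests arriving all at once vs the same 100 spread over 100 ms.
burst_waits = simulate([0.0] * 100, SVCTM)
even_waits = simulate([float(i) for i in range(100)], SVCTM)

print(f"burst: avg wait {sum(burst_waits) / len(burst_waits):.1f} ms")
print(f"even:  avg wait {sum(even_waits) / len(even_waits):.1f} ms")
```

This is exactly the await/svctm split: svctm stays at 0.9 ms in both runs, but await (wait + service) explodes when the database submits I/Os in bursts, as batch jobs typically do.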
the formatting looks better now (thanks for the hint) |
Hi,
Maybe my question was misleading, so I'd like to clarify. My question is not "help me make my system go faster" (like calling support); I'd like to understand how Linux works. I was curious about the time discrepancy between await and svctm. I already searched the internet and found some good information at http://tldp.org/LDP/tlk/tlk-toc.html, http://www.alexonlinux.com/smp-affin...dling-in-linux and http://honglus.blogspot.co.at...-affinity.html. Most of the information and articles I found say that a particular interrupt should not be spread among all CPUs, because the interrupt routine would have to be reloaded after each context switch. Most articles point in the direction that manually balancing IRQs is NOT needed, because /proc/irq/default_smp_affinity or the irqbalance daemon can distribute IRQ signals among CPUs automatically. Here is an example of the distribution on this machine: Code:
cat /proc/interrupts So how do IRQ/process affinity and the job of irqbalance fit together? I suspect the high await times are caused by heavy IRQ rates that are not distributed over all CPUs, but that is just a guess at the moment. If someone knows a good article or book that brings the topics of I/O waits and interrupts together, please let me know. best regards |
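As a side note for anyone checking their own box: the per-CPU counts from /proc/interrupts can be summarised programmatically to see whether an IRQ is pinned to one CPU. The excerpt below is made up for illustration (the real output from this machine is not shown in the thread, and the device names cciss0/eth0 are hypothetical):

```python
# MADE-UP, trimmed /proc/interrupts excerpt for illustration only.
sample = """\
           CPU0       CPU1       CPU2       CPU3
 16:    9000000        120        115        130   IO-APIC-fasteoi   cciss0
 17:        500     480000        510        495   IO-APIC-fasteoi   eth0
"""

header, *rows = sample.splitlines()
ncpus = len(header.split())  # number of CPU columns

for row in rows:
    parts = row.split()
    irq = parts[0].rstrip(":")
    counts = [int(x) for x in parts[1:1 + ncpus]]
    total = sum(counts)
    share = max(counts) / total  # fraction handled by the busiest CPU
    print(f"IRQ {irq}: {total} interrupts, most-loaded CPU took {share:.0%}")
```

A share near 100% for the disk controller's IRQ would mean one CPU is fielding essentially all of that device's interrupts, which is the kind of imbalance irqbalance (or a manual write to /proc/irq/N/smp_affinity) is meant to address.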