Hi,
after two days of searching I could not find an answer for how to interpret the following figures.
This output is from a server running Red Hat Enterprise Linux Server release 5.4, 64-bit.
It hosts an Oracle database which runs big jobs at night, using lots of CPU and I/O resources.
This is part of a "sar -d" output I'd like to understand and get a conclusion about IO saturation:
as one can see there are high await times ( service time + queue time ) in comparison to low svctm times ( just the device service times )
I'd like to understand what makes the IOs waiting in the queue so long. ?
The svctm time is on average under 1 ms that should give the system the possibility of doing 1000 IO per second. So what could be a reason that so many IOs are queued for such a long time ?
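The same await and svctm columns also show up in live "iostat -x" output, so for reference this is roughly how I compare the two while the jobs run. A minimal sketch only: the field positions assume the sysstat version shipped with RHEL 5, and the device-name filter is just illustrative.
Code:
# minimal sketch: estimate per-device queue time as (await - svctm)
# field numbers assume the iostat -x layout of RHEL 5-era sysstat
# (await = field 10, svctm = field 11); check the header line and adjust if yours differs
iostat -dx 5 3 | awk '
    $1 ~ /^(sd|hd|dm-|md)/ {
        printf "%-8s await=%7.2f ms  svctm=%6.2f ms  queued=%7.2f ms\n",
               $1, $10, $11, $10 - $11
    }'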
Quote:
Hi, after two days of searching I could not find an answer for how to interpret the following figures.
Since you're using Red Hat Enterprise with Oracle, you should just call Oracle and Red Hat support. That's part of what you're paying for: their tech support and diagnostic services. They can instruct you on how to run traces/dumps, and will analyze them to tell you what's going on.
Quote:
This output is from a server running Red Hat Enterprise Linux Server release 5.4, 64-bit. It hosts an Oracle database which runs big jobs at night, using lots of CPU and I/O resources. This is part of a "sar -d" output I'd like to understand in order to draw a conclusion about I/O saturation:
As one can see, there are high await times (service time + queue time) compared to low svctm times (just the device service time).
I'd like to understand what keeps the I/Os waiting in the queue for so long. The svctm time is on average under 1 ms, which should give the system the possibility of doing about 1000 I/Os per second per device. So what could be the reason that so many I/Os are queued for such a long time?
Without knowing anything about the system (What kind of disk(s)? Attached how? RAID level? Hardware or software RAID? Server memory? Number of users? Database size? Number of queries? Tables? Anything??), it's impossible to say. It could be that one table of the database is getting hammered...it could be that there is a poorly formed query somewhere. Without details, it's impossible to say.
Again, the best/easiest way to figure out performance problems on an Oracle/Red Hat system is to first contact Oracle support...they will run traces/dumps and give you information as to where the bottleneck may be. From there, you can either fix it (if it's an Oracle/query issue), or take that info to Red Hat for help with how to spread things out and/or optimize your system.
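For reference, the kind of baseline information being asked for here could be collected with commands along these lines. This is only a rough sketch (none of this output appears in the thread, and the software-RAID check only matters if md devices are in use):
Code:
# rough inventory sketch for a RHEL 5 box -- adjust to the actual storage stack
cat /proc/scsi/scsi                     # attached disks / LUNs
cat /proc/mdstat                        # software RAID status, if any
grep -c ^processor /proc/cpuinfo        # CPU count
free -m                                 # server memory
fdisk -l 2>/dev/null | grep '^Disk'     # disk sizes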
Hi,
the database was just background information about why there are so many I/Os.
I'd like to understand why an I/O request can remain in the queue for so long when the disk subsystem seems pretty fast (see the values of await and svctm). This is not related to the database or Oracle-specific products.
I suppose it's related to interrupts, but I'm still investigating.
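One way I try to picture it (the numbers below are purely illustrative, not from my sar output): await includes the time a request spends waiting behind everything already in the queue, so even with a sub-millisecond svctm, a deep queue pushes await into tens of milliseconds.
Code:
# back-of-the-envelope only -- svctm and queue depth here are made-up values
awk 'BEGIN { svctm = 0.8; queued = 50;
             printf "expected await ~ %.0f ms behind %d queued requests\n",
                    svctm * queued, queued }'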
Quote:
Hi, the database was just background information about why there are so many I/Os.
I'd like to understand why an I/O request can remain in the queue for so long when the disk subsystem seems pretty fast (see the values of await and svctm). This is not related to the database or Oracle-specific products.
I suppose it's related to interrupts, but I'm still investigating.
Again, without knowing what the system is doing, who can tell? You could have programs that aren't behaving correctly, or a slew of other reasons. Again, Oracle and Red Hat can tell you exactly why.
Maybe my question was misleading, so I'd like to clarify.
Actually, my question is not about "help me make my system go faster" (like calling support); I'd like to understand how Linux works. I was curious about the time discrepancy between await and svctm.
Most information and articles I found say that a particular interrupt should not be spread among all CPUs, because the interrupt routine would have to be loaded again each time the interrupt lands on a different CPU.
Most articles point in the direction that manually balancing IRQs is NOT needed, because /proc/irq/default_smp_affinity or the irqbalance daemon can distribute IRQ signals among the CPUs automatically.
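To check those two knobs on a given box, something like the following should do. Note that /proc/irq/default_smp_affinity may not exist on every kernel version, and whether irqbalance is installed as a service depends on the installation:
Code:
# default hex CPU mask handed to newly registered IRQs (if the file exists)
cat /proc/irq/default_smp_affinity
# is the irqbalance daemon actually running?
/sbin/service irqbalance status
ps -C irqbalance -o pid,args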
Here is an example of the distribution on this machine:
If you take a look at IRQ 51: it is distributed over all CPUs, but very unevenly. The irqbalance process should be distributing the IRQs over all CPUs, so why is it that skewed? Is ksoftirqd perhaps doing a bad job queuing the interrupts?
So how do IRQ CPU affinity and the job of irqbalance fit together?
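To poke at that question directly, this is roughly what I am looking at for a single IRQ. 51 is just the number from my output above; the mask in the last line, which pins it to CPU0, is purely illustrative, and irqbalance may rewrite it again if it is running:
Code:
# per-CPU delivery counts for IRQ 51 -- sample it twice to see the rate
grep '^ *51:' /proc/interrupts
# current affinity: hex bitmask of CPUs allowed to service IRQ 51
cat /proc/irq/51/smp_affinity
# illustrative only: pin IRQ 51 to CPU0 (needs root; irqbalance may undo it)
echo 1 > /proc/irq/51/smp_affinity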
I suspect the high await times are caused by heavy IRQ rates that are not distributed over all CPUs, but that is just a guess at the moment.
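To test that guess I plan to watch per-CPU interrupt and iowait load side by side, roughly like this (the column names are those of the sysstat shipped with RHEL 5; other versions may differ):
Code:
# per-CPU view: %irq and %soft show interrupt work, %iowait shows stalled I/O
mpstat -P ALL 5 3
# raw interrupt deltas per CPU over ~5 seconds
cat /proc/interrupts > /tmp/irq.1; sleep 5; cat /proc/interrupts > /tmp/irq.2
diff /tmp/irq.1 /tmp/irq.2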
If someone knows a good article or book that brings the topics of I/O waits and interrupts together, please let me know.