How do I identify which processes are making up the WA statistic?
1. Is that WA% only disk I/O or can it include network IO?
2. How do I identify which processes are making up that WA statistic?
3. Why does iostat report 100% utilization when the drive read/write speeds are under 1MB/s?
The reason I'm asking:
My Oracle stats say the drives are running at about 5MB/s right now, the WA is 55%, and iostat -m -x reports the drives at 100% utilization with a read/write rate under 1MB/s. The sequential read speed of the drive is 98MB/s (tested with hdparm -tT when I installed the drives).
I know I have a few queries using full table scans, which I'm working on, but I'd think that 100% utilization should mean running at the full 98MB/s.
I want to identify which process (most likely an Oracle one) is responsible for the 100% utilization/55% WA.
WA% is I/O wait: time the CPU sits idle while I/O is outstanding. It is mostly disk I/O (including swap), but network-backed storage such as NFS can contribute as well.
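To check the WA figure independently of top, here is a rough sketch that derives it from two samples of the aggregate "cpu" line in /proc/stat. The function name iowait_pct is made up for this example, and it assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
# iowait_pct BEFORE AFTER   (hypothetical helper, not a standard tool)
# Each argument is one aggregate "cpu ..." line taken from /proc/stat.
# Assumed field order after the label: user nice system idle iowait irq softirq steal
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Sum user..steal only; later guest fields are already counted in user/nice.
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            t[NR] = total
            w[NR] = $6          # iowait ticks
        }
        END {
            dt = t[2] - t[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (w[2] - w[1]) / dt
        }'
}

# Live use on Linux: take two samples five seconds apart.
# a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
# iowait_pct "$a" "$b"
```

This only gives the system-wide number, the same one top shows; it does not say which process is waiting.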
Most of the time, in my experience, it comes from swapping. Since the drive is apparently moving very little data, you should look at the network as well, however.
Code:
tcpdump -i eth0
will show you the network activity; look for recurring patterns.
Code:
lsof -i
might help you find out which processes are using the network.
Code:
netstat -tp
is also pretty good, if you know what is doing what with the network on your computer.
Sorry, I forgot, it may not be network related. Are you on ext3? How much free space do you have on the drives? ext3 performance can drop off badly once the drives start to fill up.
Can't you create indexes in the Oracle database? Full table scans are expensive and should be avoided wherever possible, especially if the table is large...
How much swap is used?
You should see that in
Code:
top
,
Code:
swapon -s
or
Code:
free
?
Which process is using the most swap:
In top, press O and then o; this orders the processes by virtual memory usage (VIRT).
Oh, damn, I forgot the most important one here: vmstat. Read its man page; it's not too long and very informative!
To see how much physical memory each application is actually using, look at the RES column in top.
You should watch out with Oracle processes, however, since the RES of each one includes the SGA (System Global Area), which is shared memory. You will have to deduct the size of that from the values in RES. Once you have deducted it, the process using the most RES may turn out to be one further down the list, so watch out for that...
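As a rough way to do that deduction, the sketch below subtracts the shared pages from the resident pages reported in /proc/<pid>/statm. The helper name private_kb is made up, and it is only an approximation: the "shared" figure covers all shared and file-backed pages the process has touched, not just the SGA, and an SGA placed in huge pages would not show up in RES at all.

```shell
# private_kb STATM_LINE PAGE_SIZE_BYTES   (hypothetical helper)
# /proc/<pid>/statm fields, all in pages: size resident shared text lib data dt
# Prints (resident - shared) in KB, i.e. roughly RES minus SHR from top.
private_kb() {
    echo "$1" | awk -v page="$2" '{ print ($2 - $3) * page / 1024 }'
}

# Live use on Linux (1234 is a placeholder PID, not a real one):
# private_kb "$(cat /proc/1234/statm)" "$(getconf PAGESIZE)"
```

Running it over all the Oracle PIDs and sorting the results should show which process really holds the most private memory.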
I had a theory that it might be LVM2 slowing me down, so I cleared a 250GB (61 MB/s) drive. I'm in the process of re-balancing things to the new config. (This way the DB is on its own drive.)
What I'm seeing is that OraASM_01-OraASM_03 are running combined reads at 25MB/s max, and iostat is showing 100% drive utilization. Oracle is writing to the new 250GB drive at about 10MB/s, but it is only showing 16% utilization.
I'm guessing here that LVM is doing something which is effectively throttling the Oracle ASM logical volumes. If this is the case, it explains a lot of what I've been seeing. It would explain why none of my stats make sense.
Does anyone know what could be causing the throttling effect of LVM?
Also, thanks for the network commands. I'll try them out.
Last edited by MikeyCarter; 12-23-2008 at 10:18 PM.
First, the command I was looking for is iotop. However, it seems to work only with newer kernels and a newer Python.
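Where iotop won't run, a crude substitute is to look for processes in uninterruptible sleep (state D in ps), which is usually where a process sits while it waits on I/O. This is only a sketch: the filter name blocked_procs is made up, and a single snapshot can easily miss short waits, so it is worth running a few times.

```shell
# blocked_procs   (hypothetical helper)
# Reads "STATE PID COMMAND" lines on stdin and prints PID and command for
# every process whose state starts with D (uninterruptible sleep).
# Only the first word of the command is printed.
blocked_procs() {
    awk '$1 ~ /^D/ { print $2, $3 }'
}

# Live use: the trailing "=" on each column suppresses the ps header line.
# ps -eo state=,pid=,comm= | blocked_procs
```

Processes that show up in that list again and again are the likely contributors to WA.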
Now to my problem. iostat has been screaming my problem at me, and I was ignoring it because I didn't understand it. It seems that my system/drives can only handle ~100 I/O requests per second. This is why I'm seeing my drives work at only .01 MB/s with 100% utilization. I thought it was LVM because it was doing large reads/writes while re-balancing the disks before.
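To put numbers on that: throughput is just requests per second times the size of each request, so a drive that is fully busy with small requests moves very little data. A quick sketch, where the 8 KB request size is an assumption (a common Oracle block size, not something measured on this system) and throughput_mb is a made-up helper name:

```shell
# throughput_mb IOPS REQUEST_KB   (hypothetical helper)
# Prints the resulting throughput in MB/s.
throughput_mb() {
    awk -v iops="$1" -v kb="$2" 'BEGIN { printf "%.2f\n", iops * kb / 1024 }'
}

throughput_mb 100 8      # 100 requests/s of an assumed 8 KB each  -> 0.78
throughput_mb 100 1024   # the same request rate with 1 MB requests -> 100.00
```

So a drive can be 100% utilized at under 1MB/s with small scattered requests and still reach its hdparm figure on large sequential ones.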
So my first option is to create 4 x 75GB LVM partitions across the 4 hard drives. This should give me ~400 requests per second. The second option is to get Oracle to send fewer I/O requests.
My new question is: what is limiting the I/O requests? Is it the controller, the drive, the drivers, or the kernel? How do I find out?
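One way to start narrowing it down is to work out how long the device is busy per request, from two samples of its line in /proc/diskstats. The sketch below is an assumption-laden example: disk_rates is a made-up name, sda is a placeholder device, and it relies on the standard field layout (reads completed in field 4, sectors read in 6, writes completed in 8, sectors written in 10, milliseconds spent doing I/O in 13, with 512-byte sectors).

```shell
# disk_rates BEFORE AFTER SECONDS   (hypothetical helper)
# BEFORE and AFTER are the same device's line from /proc/diskstats,
# taken SECONDS apart.
disk_rates() {
    printf '%s\n%s\n' "$1" "$2" | awk -v secs="$3" '
        { ios[NR] = $4 + $8; sec[NR] = $6 + $10; busy[NR] = $13 }
        END {
            n = ios[2] - ios[1]
            if (n <= 0 || secs <= 0) { print "no completed requests in interval"; exit }
            printf "iops=%.0f kb_per_req=%.1f util=%.0f%% ms_per_req=%.1f\n",
                n / secs,
                (sec[2] - sec[1]) * 512 / 1024 / n,
                100 * (busy[2] - busy[1]) / (secs * 1000),
                (busy[2] - busy[1]) / n
        }'
}

# Live use on Linux (sda is a placeholder device name):
# a=$(grep ' sda ' /proc/diskstats); sleep 5; b=$(grep ' sda ' /proc/diskstats)
# disk_rates "$a" "$b" 5
```

If ms_per_req comes out at around 10 ms while util is near 100%, that is in the range of a seek plus rotational delay on a typical mechanical drive, which would point at the drive itself doing random I/O rather than at the controller, drivers or kernel.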