LinuxQuestions.org


MikeyCarter 12-23-2008 01:54 PM

How do I identify which processes are making up the WA statistic?
 
I have a WA% between 50-80%.

So I have a few questions.

1. Is that WA% only disk I/O or can it include network IO?
2. How do I identify which processes are making up that WA statistic?
3. Why does iostat report 100% utilization when the drive read/write speeds are under 1MB/s?

The reason I'm asking.

My Oracle stats say the drives are running at about 5MB/s right now, the WA is 55%, and iostat -m -x reports the drives at 100% utilization with a read/write rate under 1MB/s. The read speed of the drive is 98MB/s (tested with hdparm -tT when I installed the drives).

I know I have a few queries using full table scans, which I'm working on, but I'd think that 100% utilization should mean running at the full 98MB/s.

I want to identify which process (most likely an Oracle one) is responsible for the 100% utilization/55% WA.

thecarpy 12-23-2008 04:15 PM

WA% is IO, it can be disk and/or network IO. Anything the CPU is waiting on can cause increased WA%.
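
For reference, the WA figure that top shows comes from the iowait counter on the aggregate cpu line of /proc/stat. Below is a minimal sketch of that calculation from two samples; the function name iowait_pct is made up for illustration, and the sketch assumes the usual field order (user nice system idle iowait irq softirq steal).

```shell
#!/bin/sh
# iowait_pct LINE1 LINE2 (hypothetical helper name)
# LINE1 and LINE2 are the aggregate "cpu ..." lines from /proc/stat, taken a
# few seconds apart. Prints the share of elapsed CPU time counted as iowait.
# Assumed field order after the "cpu" label:
#   user nice system idle iowait irq softirq steal
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Sum user..steal only (fields 2-9); later guest fields are
            # already included in user, so they are skipped.
            last = (NF < 9) ? NF : 9
            for (i = 2; i <= last; i++) total += $i
        }
        NR == 1 { t1 = total; w1 = $6 }
        NR == 2 {
            if (total > t1) printf "%.1f\n", 100 * ($6 - w1) / (total - t1)
            else print "0.0"
        }'
}

# Example use on a live system:
#   a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
#   iowait_pct "$a" "$b"
```

The result should roughly match the wa value that top or vmstat reports over the same interval.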

Most of the time, from what I have experienced, it occurs with swapping. Since the drive is apparently more or less idle, however, you should look at the network.

Code:

tcpdump -i eth0
will show you the network activity; look for recurring patterns.
Code:

lsof -i
might help you find out which processes are using the network.
Code:

netstat -t
(which lists the TCP connections) is also pretty good, if you know what is doing what with the network on your computer.

thecarpy 12-23-2008 04:32 PM

Sorry, I forgot, it may not be network related. Have you got ext3? How much free space do you have on the drives? ext3 performance drops off badly once the drives start to fill up.

Can you create indexes in the Oracle database? Full table scans are expensive and should be avoided wherever possible, especially if the table is large.

How much swap is used?

You should see that in
Code:

top
,
Code:

swapon -s
or
Code:

free
?

Which process is using most swap:

In top, press O and then o; this will order processes by virtual memory usage.


Oh, damn, I forgot the most important one here: vmstat. Read its man page, it is not too long and very informative!
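
To make the vmstat suggestion concrete, here is a small sketch that pulls out the columns relevant to this thread: b (processes blocked), si/so (swap in/out) and wa. The function name vmstat_summary is only illustrative, and the sketch assumes the usual procps layout with a column-name header line starting with "r b".

```shell
#!/bin/sh
# vmstat_summary (hypothetical helper name)
# Reads vmstat output on stdin and prints only the blocked-process count,
# swap-in/swap-out rates and the iowait percentage for each sample.
# Columns are located by name from the header, not by fixed position.
vmstat_summary() {
    awk '
        $1 == "r" && $2 == "b" {              # column-name header line
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("wa" in col) && $1 ~ /^[0-9]+$/ {    # numeric data line after a header
            printf "blocked=%s si=%s so=%s wa=%s\n",
                   $(col["b"]), $(col["si"]), $(col["so"]), $(col["wa"])
        }'
}

# Example use on a live system (three samples, five seconds apart):
#   vmstat 5 3 | vmstat_summary
```

Note that the first line vmstat prints is an average since boot, so judge by the later samples. If si and so stay near zero while wa is high, swapping is probably not what the CPU is waiting on.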

thecarpy 12-23-2008 05:06 PM

To see how much physical memory each application actually has resident, look at the RES column in top.

You should watch out with Oracle processes, however, since they all include the SGA (system global area). You will have to deduct the size of that from the values in RES. It can be that once you have deducted that, the process using most RES will be further down the list, so watch out for that one....

Sorry for the many posts ...

MikeyCarter 12-23-2008 09:16 PM

I think I may have found the main culprit.


my setup is: two 98MB/s and two 61MB/s drives.
Code:

+==================+     +==================+
|  Dom0 - F8       |     |  DomU - OEL      |
|                  |     |                  |
|  VG- OraASM_01   |-----|  Raw: ASM Part 1 |
|  VG- OraASM_02   |     |  Raw: ASM Part 2 |
|  VG- OraASM_03   |     |  Raw: ASM Part 3 |
|                  |     |                  |
+==================+     +==================+

I had a theory that it might be LVM2 slowing me down. So I cleared a 250GB (61 MB/s) drive. I'm in the process of re-balancing things to the new config. (This way the DB is on its own drive.)

What I'm seeing is that OraASM_01-OraASM_03 are running combined reads at 25MB/s max. iostat is showing 100% drive utilization. Oracle is writing to the new 250GB drive at about 10MB/s, but that drive is only showing 16% utilization.

I'm guessing here that LVM is doing something which is effectively throttling the Oracle ASM logical volumes. If this is the case it explains a lot of what I've been seeing. It would explain why none of my stats make sense.

Does anyone know what could be causing the throttling effect of LVM?

Also thanks for the network commands I'll try them out.

MikeyCarter 12-24-2008 06:11 AM

Ok I'm such a moron!

First, the command I was looking for is iotop. However, it seems to work only with newer kernels and a newer Python.
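
Where iotop is not an option, a rough substitute is to sample which processes are in uninterruptible sleep (state D in ps), since that is usually where a process sits while waiting on I/O. This is only a sketch: the helper name blocked_procs is made up, and a D state is a hint rather than proof of disk wait.

```shell
#!/bin/sh
# blocked_procs (hypothetical helper name)
# Reads "STATE PID COMMAND" lines on stdin (as produced by
# `ps -eo state,pid,comm`) and prints the PID and command of every process
# whose state starts with D (uninterruptible sleep).
blocked_procs() {
    awk '$1 ~ /^D/ { print $2, $3 }'
}

# Example use on a live system: sample once a second for ten seconds and
# count how often each process shows up as blocked.
#   for i in 1 2 3 4 5 6 7 8 9 10; do
#       ps -eo state,pid,comm | blocked_procs
#       sleep 1
#   done | sort | uniq -c | sort -rn
```

Processes that show up in most of the samples are the likeliest contributors to the WA figure.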


Now to my problem. iostat has been screaming my problem at me and I was ignoring it because I didn't understand it. It seems that my system/drives can only handle ~100 I/O requests per second. This is why I'm seeing my drives working at only .01 MB/s at 100% utilization. I thought it was LVM because it was doing large reads/writes while re-balancing the disks before.

So my first option is to create four 75G LVM partitions, one on each of the 4 hard drives. This should give me ~400 requests per second. My second option is to get Oracle to send fewer I/O requests.

My new question is, what is limiting the I/O requests? Is it the controller, the drive, the drivers, or the kernel? How do I find out?
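
The arithmetic behind this can be sketched as throughput = requests per second x request size. The 8 KB and 1024 KB request sizes in the comments below are assumptions chosen for illustration (the thread does not give the actual request size), and throughput_mb is a made-up helper name.

```shell
#!/bin/sh
# throughput_mb IOPS REQUEST_KB (hypothetical helper name)
# Prints the MB/s a drive moves when it completes IOPS requests per second
# of REQUEST_KB kilobytes each.
throughput_mb() {
    awk -v iops="$1" -v kb="$2" 'BEGIN { printf "%.2f\n", iops * kb / 1024 }'
}

# Assumed request sizes, for illustration only:
#   throughput_mb 100 8      # small random requests      -> 0.78
#   throughput_mb 100 1024   # large sequential requests  -> 100.00
```

With the same ~100 requests per second, small requests keep the drive fully busy while moving well under 1 MB/s, whereas large sequential requests get close to the hdparm figure.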
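
One common explanation, offered here only as a rough estimate, is the drive mechanics themselves: each random request on a spinning disk costs about one average seek plus half a rotation. The sketch below turns that into a requests-per-second ceiling. The RPM and seek-time values in the comments are assumptions (the thread does not name the drive models), and max_random_iops is a made-up helper name.

```shell
#!/bin/sh
# max_random_iops RPM AVG_SEEK_MS (hypothetical helper name)
# Rough ceiling on random requests per second for a single spinning disk:
#   1000 ms / (average seek time + time for half a rotation)
# Ignores caching, command queueing and transfer time.
max_random_iops() {
    awk -v rpm="$1" -v seek="$2" 'BEGIN {
        half_rotation_ms = 60000 / rpm / 2
        printf "%.0f\n", 1000 / (seek + half_rotation_ms)
    }'
}

# Assumed drive parameters, for illustration only:
#   max_random_iops 7200 8.5    # typical desktop drive -> 79
#   max_random_iops 15000 3.5   # fast server drive     -> 182
```

If the drives are ordinary 7200 RPM disks, a limit of roughly 100 random requests per second per drive is about what the hardware can do, which would point at the drives rather than the controller, drivers or kernel.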


All times are GMT -5.