Iowait
Hi all,
below is the output of top on my server:

14:49:51 up 39 days, 17 min, 3 users, load average: 1.34, 1.27, 1.63
326 processes: 325 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system     irq  softirq  iowait    idle
           total    2.8%    0.2%    0.9%    0.1%     0.6%   51.6%   43.5%
           cpu00    5.4%    0.4%    1.4%    0.4%     1.0%   61.0%   30.3%
           cpu01    4.4%    0.0%    1.4%    0.0%     0.8%   61.6%   31.6%
           cpu02    1.2%    0.4%    0.6%    0.0%     0.6%   41.6%   55.6%
           cpu03    0.2%    0.0%    0.4%    0.0%     0.2%   42.6%   56.6%

As you can see, iowait is about 50%. What does that tell you? I noticed that the server is running a bit slow, and I am new to Linux. thanks, |
It tells us that you forgot to post the next ten lines of top's output, and that you failed to give any information about the machine's physical details, e.g. HDD subsystem, network cards, ... Cheers, Tink |
Sorry Tink
16:00:21 up 39 days, 1:27, 5 users, load average: 9.10, 8.74, 8.16
366 processes: 364 sleeping, 2 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system     irq  softirq  iowait    idle
           total   16.6%    0.0%    3.3%    0.0%     0.0%   79.7%    0.0%
           cpu00   15.4%    0.2%    3.2%    0.0%     0.0%   81.2%    0.0%
           cpu01   17.1%    0.0%    3.3%    0.3%     0.1%   78.8%    0.0%
           cpu02   13.9%    0.0%    3.7%    0.0%     0.0%   82.2%    0.0%
           cpu03   19.9%    0.1%    2.9%    0.0%     0.1%   76.6%    0.0%
Mem:  4091528k av, 4072000k used,   19528k free,       0k shrd,   23380k buff
      2835480k actv, 744224k in_d,   76576k in_c
Swap: 3068372k av,  698784k used, 2369588k free                 3297556k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
13377 oracle    16   0 51640  49M 48368 S     4.5  1.2   0:01   2 oracle
13399 oracle    16   0 89196  85M 82816 S     3.8  2.1   0:02   2 oracle
13394 oracle    15   0 84820  81M 80488 D     3.7  2.0   0:01   0 oracle
12885 oracle    15   0  145M 123M 98868 D     1.4  3.0   1:56   1 oracle
13413 oracle    23   0 26888  24M 20604 R     0.3  0.6   0:00   3 oracle
13388 oracle    15   0 27940  23M 22684 S     0.2  0.5   0:00   1 oracle
   11 root      15   0     0    0     0 SW    0.1  0.0  38:27   3 kswapd
 5024 oracle    25  10 10556 8312  2784 S N   0.1  0.2 363:18   3 rhn-applet-gui
13261 oracle    16   0  1408 1408   904 R     0.1  0.0   0:02   0 top
13403 oracle    23   0 10872  10M  9300 S     0.1  0.2   0:00   2 oracle
13405 oracle    23   0 10904  10M  9332 S     0.1  0.2   0:00   0 oracle
13409 oracle    23   0 10936  10M  9364 S     0.1  0.2   0:00   2 oracle
13411 oracle    23   0 10880  10M  9308 S     0.1  0.2   0:00   0 oracle
    1 root      15   0   300  264   236 S     0.0  0.0   0:36   0 init
    2 root      RT   0     0    0     0 SW    0.0  0.0   0:00   0 migration/0
    3 root      RT   0     0    0     0 SW    0.0  0.0   0:00   1 migration/1
    4 root      RT   0     0    0     0 SW    0.0  0.0   0:00   2 migration/2
    5 root      RT   0     0    0     0 SW    0.0  0.0   0:00   3 migration/3
    6 root      15   0     0    0     0 SW    0.0  0.0   0:00   2 keventd
    7 root      34  19     0    0     0 SWN   0.0  0.0   0:00   0 ksoftirqd/0
    8 root      34  19     0    0     0 SWN   0.0  0.0   0:00   1 ksoftirqd/1
    9 root      34  19     0    0     0 SWN   0.0  0.0   0:00   2 ksoftirqd/2
   10 root      34  19     0    0     0 SWN   0.0  0.0   0:00   3 ksoftirqd/3 |
And the disk-subsystem on the machine is ... ?
Who or what is it talking to in terms of networking, are you running an (or several) app-servers against it? More than one network card? What speeds? Cheers, Tink |
I don't have any apps running on this server, just 15 databases, and a couple of users connecting to them concurrently. This machine is fairly new, with dual Xeon processors. IOWAIT is at its peak almost all the time, so I am trying to find out what is happening.
Please throw out some advice |
And the disk-subsystem is?
And what Oracle version are you running? How do the users connect, running plain old sqlplus against it? And why do you always only ever answer one of my questions ignoring the rest? How am I supposed to help when you don't answer the others? :} Cheers, Tink |
Sorry, I didn't mean to ignore your questions; I just didn't understand what you meant, since I am not very proficient with Linux.
And the disk-subsystem is? How do you find out about this?
And what Oracle version are you running? Oracle 10g Release 2.
How do the users connect, running plain old sqlplus against it? I don't know at this point. I am still looking for which user or database is dragging the system down.
If you don't mind letting me know how to narrow down and troubleshoot the problem, that would be greatly appreciated. Right now it is outside business hours and the iowait is still high. |
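One hedged way to start narrowing down which processes are stuck on disk: look for process state D (uninterruptible sleep, usually waiting on I/O). In the top listing posted earlier in the thread, two oracle processes show that state. The sketch below parses text in the shape printed by `ps -eo pid,stat,comm`; the function name and the sample text are illustrative assumptions, not output captured from this server.

```python
def d_state_processes(ps_text):
    """Return (pid, command) pairs for processes whose state starts with D.

    Expects text shaped like the output of `ps -eo pid,stat,comm`:
    one header line, then one process per line.
    """
    blocked = []
    for line in ps_text.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        pid, stat, command = fields
        if stat.startswith("D"):
            blocked.append((int(pid), command.strip()))
    return blocked


# Hypothetical sample, modelled on the top listing in this thread.
sample = """\
  PID STAT COMMAND
13377 S    oracle
13394 D    oracle
12885 D    oracle
   11 SW   kswapd
"""

print(d_state_processes(sample))
```

Running this repeatedly while iowait is high and noting which PIDs keep showing up gives a first list of candidates; mapping an Oracle server process back to a database or session has to be done on the Oracle side.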
Have a look at the output of fdisk -l and we'll (try to) tackle it from there.
Hope they're not too big/busy. What kind of app is Oracle being used for? Have you configured opmn and emctl for those databases?
Cheers, Tink |
Tink,
thanks so much for your advice. I found out that I have neither fdisk nor iostat on my server, so the next question is: what packages do I need to install to get those two utilities? thanks a bunch |
The machine is a DeadRat box?
sysstat and util-linux |
thanks Tink,
I got fdisk and iostat installed. Now the next question is: does the gzip command really take a lot of resources? Please see the output below, taken while gzip was running:

avg-cpu:  %user   %nice    %sys %iowait   %idle
          21.00    0.00    1.13   60.30   17.57

Device:     tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda      182.05         1.60      5361.69         16      53456
sda1       0.00         0.00         0.00          0          0
sda2       0.00         0.00         0.00          0          0
sda3     182.05         1.60      5361.69         16      53456
sda4       0.00         0.00         0.00          0          0
sda5       0.00         0.00         0.00          0          0
sda6       0.00         0.00         0.00          0          0
sdb      350.95      9098.50      3792.98      90712      37816
sdb1     350.95      9098.50      3792.98      90712      37816

After gzip completed:

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.18    0.00    0.18    9.05   90.60

Device:     tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda       33.30         3.21       516.75         32       5152
sda1       0.00         0.00         0.00          0          0
sda2       0.00         0.00         0.00          0          0
sda3      33.30         3.21       516.75         32       5152
sda4       0.00         0.00         0.00          0          0
sda5       0.00         0.00         0.00          0          0
sda6       0.00         0.00         0.00          0          0
sdb        1.40         1.60        28.08         16        280
sdb1       1.40         1.60        28.08         16        280

Is it normal for gzip to behave this way? Would you recommend using some other utility to compress? Or do you think I have an I/O issue on my server? thanks |
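To put the iostat figures above into more familiar units, here is a small sketch that converts blocks per second into MB/s and estimates the average request size per device. It assumes one iostat block is a 512-byte sector, which is what the sysstat documentation states for 2.4 and later kernels; the function names are made up for illustration.

```python
BLOCK_BYTES = 512  # assumption: one iostat "block" is a 512-byte sector


def throughput_mb_s(blocks_per_s):
    """Convert an iostat Blk_read/s or Blk_wrtn/s value to MB/s (10**6 bytes)."""
    return blocks_per_s * BLOCK_BYTES / 1e6


def avg_request_kb(tps, blk_read_s, blk_wrtn_s):
    """Average size of one transfer in KiB, from tps and the two block rates."""
    if tps == 0:
        return 0.0
    return (blk_read_s + blk_wrtn_s) / tps * BLOCK_BYTES / 1024


# (tps, Blk_read/s, Blk_wrtn/s) copied from the "gzip running" sample above.
during_gzip = {
    "sda": (182.05, 1.60, 5361.69),
    "sdb": (350.95, 9098.50, 3792.98),
}

for device, (tps, read_s, write_s) in sorted(during_gzip.items()):
    print(
        device,
        "read %.2f MB/s" % throughput_mb_s(read_s),
        "write %.2f MB/s" % throughput_mb_s(write_s),
        "avg request %.1f KiB" % avg_request_kb(tps, read_s, write_s),
    )
```

Under that assumption the busiest figure, reads from sdb, works out to under 5 MB/s, so the raw transfer rate is modest; the number of requests per second may matter more here than the volume of data moved.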
Could you please put code-tags around the stuff you copy & paste to make it more readable?
Zip should be fine; other compression methods would a) use more CPU or b) be inefficient at compression. There's just a lot of I/O going on. Can you give me details about the hardware we're looking at? Server type, CPU speed, controller type ...
And again: if the box is a RedHat machine make sure you have turned auditing off, otherwise it will be happily playing with itself. The auditing is something you may want on a file-server, but definitely NOT on a database machine.
And once again: it would be nice if you provided more detail in the first place, e.g. which distro and version you're using ;) For future reference, and to make both your and our lives with helping you easier, read this, please
Cheers, Tink |
iowait is really just CPU idle time: time the CPU spent idle while at least one I/O request was still outstanding.
Switching to a faster algorithm probably wouldn't speed the process up significantly, as this gzip operation seems to be I/O bound. sda3 and sdb1 are busy, which looks fine to me, unless the I/Os are not on behalf of the gzip operation but a consequence of memory shortage and paging. |
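To make the "iowait is idle time" point concrete, here is a minimal sketch of how a percentage breakdown like the one top prints can be derived from two readings of the aggregate `cpu` line in /proc/stat. It assumes the field order user, nice, system, idle, iowait, irq, softirq; older kernels may expose fewer fields, and the sample counter values are invented so that the result mirrors the second top listing in this thread.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    if len(parts) - 1 < len(FIELDS):
        raise ValueError("this kernel does not report all assumed fields")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def percentages(before, after):
    """Share of elapsed ticks spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Invented counters: 1000 ticks elapse between the two samples.
first = parse_cpu_line("cpu 1000 10 300 5000 4000 5 20")
second = parse_cpu_line("cpu 1166 10 333 5000 4797 5 24")

shares = percentages(first, second)
print("iowait %.1f%%  idle %.1f%%  user %.1f%%"
      % (shares["iowait"], shares["idle"], shares["user"]))
```

In this accounting iowait and idle are both ticks in which no process was running on the CPU; the only difference is whether I/O was pending at the time.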
tink,
thanks so much for your help, and sorry for any inconvenience my posts may have caused. You mentioned auditing: how do you check whether auditing is turned on on the server? And if it's turned on, how can you turn it off? thanks, |
Before you go to turn it off, check whether it's running in the first place ;) Have a look in /var/log/audit.d .. if you got many (and, more importantly, recent) huge files in there, auditing is on...
You still didn't tell us whether it's RedHat, and if it is, which version it is ... assuming that you're using something similar to us (AS3) for Oracle I'll give you these two commands:
Code:
service audit stop
Cheers, Tink |
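The check described above (many recent, large files in /var/log/audit.d) can be scripted. The following is only a sketch: the function name and the size and age thresholds are arbitrary assumptions, and the demo runs against a throwaway directory rather than the real audit directory.

```python
import os
import tempfile
import time


def recent_large_files(directory, min_bytes, max_age_hours, now=None):
    """List (name, size) of regular files that are both large and recently modified."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    hits = []
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue
        info = entry.stat()
        if info.st_size >= min_bytes and info.st_mtime >= cutoff:
            hits.append((entry.name, info.st_size))
    return sorted(hits)


# Demo: one fresh file and one file back-dated by a week.
with tempfile.TemporaryDirectory() as demo_dir:
    fresh = os.path.join(demo_dir, "fresh.log")
    stale = os.path.join(demo_dir, "stale.log")
    for path in (fresh, stale):
        with open(path, "wb") as handle:
            handle.write(b"x" * 4096)
    week_ago = time.time() - 7 * 24 * 3600
    os.utime(stale, (week_ago, week_ago))
    demo_hits = recent_large_files(demo_dir, min_bytes=1024, max_age_hours=24)

print(demo_hits)
```

On the server itself the call would look something like `recent_large_files("/var/log/audit.d", 10 * 1024 * 1024, 24)`, run as a user allowed to read that directory; what counts as "huge" is a judgment call.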
I am on Red Hat Linux AS 3.0.
thanks again. I think audit is already off: when I tried to shut it down it failed, and I also ran chkconfig audit off. Are there any other services or anything else on the OS side that I can turn off on a database server? This is purely a database server, nothing else runs on it. Have a great weekend, Tink |
Please post the output of fdisk -l and lspci -v here ... I *still* don't know anything about the hardware (except for the amount of physical RAM and that it's a machine with 4 or 2 CPUs, the latter if hyper-threading is enabled). Cheers, Tink |
for fdisk:
Code:
Disk /dev/sda: 146.6 GB, 146695782400 bytes

for lspci -v:
Code:
00:00.0 Host bridge: Intel Corporation E7520 Memory Controller Hub (rev 09)
    Subsystem: Dell: Unknown device 016d
    Flags: bus master, fast devsel, latency 0
    Capabilities: [40] #09 [4105]

00:02.0 PCI bridge: Intel Corporation E7525/E7520/E7320 PCI Express Port A (rev 09) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=00, secondary=01, subordinate=03, sec-latency=0
    Memory behind bridge: dfc00000-dfefffff
    Prefetchable memory behind bridge: 00000000d8000000-00000000d8000000
    Secondary status: SERR
    Capabilities: [50] Power Management version 2
    Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable-
    Capabilities: [64] #10 [0041]

00:04.0 PCI bridge: Intel Corporation E7525/E7520 PCI Express Port B (rev 09) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
    Capabilities: [50] Power Management version 2
    Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable-
    Capabilities: [64] #10 [0041]

00:05.0 PCI bridge: Intel Corporation E7520 PCI Express Port B1 (rev 09) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=00, secondary=05, subordinate=07, sec-latency=0
    I/O behind bridge: 0000d000-0000efff
    Memory behind bridge: df700000-dfbfffff
    Secondary status: SERR
    Capabilities: [50] Power Management version 2
    Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable-
    Capabilities: [64] #10 [0041]

00:06.0 PCI bridge: Intel Corporation E7520 PCI Express Port C (rev 09) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=00, secondary=08, subordinate=0a, sec-latency=0
    Memory behind bridge: df600000-df6fffff
    Secondary status: SERR
    Capabilities: [50] Power Management version 2
    Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable-
    Capabilities: [64] #10 [0041]

00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02) (prog-if 00 [UHCI])
    Subsystem: Dell: Unknown device 016d
    Flags: bus master, medium devsel, latency 0, IRQ 16
    I/O ports at bce0 [size=32]

00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02) (prog-if 00 [UHCI])
    Subsystem: Dell: Unknown device 016d
    Flags: bus master, medium devsel, latency 0, IRQ 19
    I/O ports at bcc0 [size=32]

00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI #3 (rev 02) (prog-if 00 [UHCI])
    Subsystem: Dell: Unknown device 016d
    Flags: bus master, medium devsel, latency 0, IRQ 18
    I/O ports at bca0 [size=32]

00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02) (prog-if 20 [EHCI])
    Subsystem: Dell: Unknown device 016d
    Flags: bus master, medium devsel, latency 0, IRQ 23
    Memory at dff00000 (32-bit, non-prefetchable) [size=1K]
    Capabilities: [50] Power Management version 2
    Capabilities: [58] #0a [20a0]

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=00, secondary=0b, subordinate=0b, sec-latency=32
    I/O behind bridge: 0000c000-0000cfff
    Memory behind bridge: df400000-df5fffff
    Prefetchable memory behind bridge: d0000000-d7ffffff

00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
    Flags: bus master, medium devsel, latency 0

00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02) (prog-if 8a [Master SecP PriP])
    Subsystem: Dell: Unknown device 016d
    Flags: bus master, medium devsel, latency 0
    I/O ports at <unassigned>
    I/O ports at <unassigned>
    I/O ports at <unassigned>
    I/O ports at <unassigned>
    I/O ports at fc00 [size=16]
    Memory at cffff000 (32-bit, non-prefetchable) [size=1K]

01:00.0 PCI bridge: Intel Corporation 80332 [Dobson] I/O processor (rev 06) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=01, secondary=02, subordinate=02, sec-latency=64
    Memory behind bridge: dfd00000-dfefffff
    Prefetchable memory behind bridge: 00000000d8000000-00000000d8000000
    Capabilities: [44] #10 [0071]
    Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
    Capabilities: [6c] Power Management version 2
    Capabilities: [d8] PCI-X non-bridge device.

01:00.2 PCI bridge: Intel Corporation 80332 [Dobson] I/O processor (rev 06) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=01, secondary=03, subordinate=03, sec-latency=64
    Capabilities: [44] #10 [0071]
    Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
    Capabilities: [6c] Power Management version 2
    Capabilities: [d8] PCI-X non-bridge device.

02:0e.0 RAID bus controller: Dell PowerEdge Expandable RAID controller 4 (rev 06)
    Subsystem: Dell PowerEdge Expandable RAID Controller 4e/Di
    Flags: bus master, stepping, 66Mhz, medium devsel, latency 64, IRQ 38
    Memory at d80f0000 (32-bit, prefetchable) [size=64K]
    Memory at dfdc0000 (32-bit, non-prefetchable) [size=256K]
    Expansion ROM at dfe00000 [disabled] [size=128K]
    Capabilities: [c0] Power Management version 2
    Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
    Capabilities: [e0] PCI-X non-bridge device.

05:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge A (rev 09) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=05, secondary=06, subordinate=06, sec-latency=32
    I/O behind bridge: 0000e000-0000efff
    Memory behind bridge: dfa00000-dfbfffff
    Capabilities: [44] #10 [0071]
    Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
    Capabilities: [6c] Power Management version 2
    Capabilities: [d8] PCI-X non-bridge device.

05:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge B (rev 09) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=05, secondary=07, subordinate=07, sec-latency=32
    I/O behind bridge: 0000d000-0000dfff
    Memory behind bridge: df800000-df9fffff
    Capabilities: [44] #10 [0071]
    Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
    Capabilities: [6c] Power Management version 2
    Capabilities: [d8] PCI-X non-bridge device.

06:07.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller (rev 05)
    Subsystem: Dell: Unknown device 016d
    Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 48
    Memory at dfae0000 (32-bit, non-prefetchable) [size=128K]
    I/O ports at ecc0 [size=64]
    Capabilities: [dc] Power Management version 2
    Capabilities: [e4] PCI-X non-bridge device.

07:08.0 Ethernet controller: Intel Corporation 82541GI/PI Gigabit Ethernet Controller (rev 05)
    Subsystem: Dell: Unknown device 016d
    Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 49
    Memory at df8e0000 (32-bit, non-prefetchable) [size=128K]
    I/O ports at dcc0 [size=64]
    Capabilities: [dc] Power Management version 2
    Capabilities: [e4] PCI-X non-bridge device.

08:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge A (rev 09) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=08, secondary=09, subordinate=09, sec-latency=64
    Capabilities: [44] #10 [0071]
    Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
    Capabilities: [6c] Power Management version 2
    Capabilities: [d8] PCI-X non-bridge device.

08:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge B (rev 09) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0
    Bus: primary=08, secondary=0a, subordinate=0a, sec-latency=64
    Capabilities: [44] #10 [0071]
    Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
    Capabilities: [6c] Power Management version 2
    Capabilities: [d8] PCI-X non-bridge device.

0b:0d.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE] (prog-if 00 [VGA])
    Subsystem: Dell: Unknown device 016d
    Flags: bus master, VGA palette snoop, stepping, medium devsel, latency 32, IRQ 18
    Memory at d0000000 (32-bit, prefetchable) [size=128M]
    I/O ports at cc00 [size=256]
    Memory at df4f0000 (32-bit, non-prefetchable) [size=64K]
    Expansion ROM at <unassigned> [disabled] [size=128K]
    Capabilities: [50] Power Management version 2

About the hardware: it's a Xeon machine with 2 hyper-threaded CPUs, so it looks like 4, and the processors run at 3 GHz. |
Cool ... now, what can you tell me about the HDDs?
They must both (sda and sdb) be hanging off the RAID controller? Are they on separate channels, or on the same? And for sdb at least I assume that you're using some RAID setup, can you give me details? One thing I found surprising in the output of lspci is that it says the Ethernet cards are unknown - which kernel are you running on the machine? Cheers, Tink |
thanks Tink,
sda is on RAID 1; that's where I put the Linux software, the Oracle home and the Oracle files that are used frequently. sdb is on RAID 5; that's where all of my database datafiles are. Below is my kernel info: 2.4.21-32.0.1.ELsmp. I am very new to Linux, so I am afraid to mess with the kernel; any advice will be greatly appreciated. |
Are they on the same SCSI channel? And how many disks is the RAID5 comprised of?
Another question I forgot to ask earlier: the gzip you were running in that test - which filesystem did you run it on, and how often do you use it?
In terms of layout for Oracle: RAID-5 may not be the best solution in terms of performance; if money weren't an issue I'd suggest RAID-10. It would also make sense to have the data, the indices and the archive-logs on separate physical drives, especially if you have many busy databases. Cheers, Tink |
Another tuning aspect that may be quite important: what's
the data-block size you've chosen for Oracle, and which chunk size does the RAID use? Ideally you'd want them to be the same, I did a bit of reading up on your controller, and it appears that it has a default of 32K which may be a bit bigger than what your Oracle install uses. Also: how big is the cache on the controller? Cheers, Tink |
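As a rough illustration of the size comparison described above, the sketch below relates a RAID chunk size to a database block size. The 32K figure is the controller default mentioned in the post; the 8K database block size is only an assumed example value (a common Oracle default) and has not been confirmed for this installation. The function names are hypothetical.

```python
def blocks_per_chunk(chunk_bytes, db_block_bytes):
    """How many database blocks fit into one RAID chunk."""
    if chunk_bytes <= 0 or db_block_bytes <= 0:
        raise ValueError("sizes must be positive")
    return chunk_bytes / db_block_bytes


def extra_bytes_per_block(chunk_bytes, db_block_bytes):
    """Bytes beyond the block itself, IF a whole chunk is handled per block request."""
    return max(chunk_bytes - db_block_bytes, 0)


raid_chunk = 32 * 1024  # controller default quoted in the post
db_block = 8 * 1024     # assumed database block size, not confirmed

ratio = blocks_per_chunk(raid_chunk, db_block)
extra = extra_bytes_per_block(raid_chunk, db_block)
print("chunk is %.0fx the block size, up to %d extra bytes per block" % (ratio, extra))
```

Whether the controller really handles a full chunk for every small request depends on the controller and its cache settings, so the second figure is an upper bound under that assumption rather than a measurement.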
Hi Tink,
Everything you mentioned about the database layout has already been implemented, since I am aware of those things. Redo logs, controlfiles, temp and system are on RAID 1; only the database files are on RAID 5, and the same goes for the indexes, since I don't have many options here. RAID 5 is the way the company will go, so I have to live with it.
You asked: Ideally you'd want them to be the same, I did a bit of reading up on your controller, and it appears that it has a default of 32K which may be a bit bigger than what your Oracle install uses.
Answer: sorry, I don't know what you are looking for.
You asked: Are they on the same SCSI channel? And how many disks is the RAID5 comprised of?
Answer: yes, and I think it has 7-8 disks |
Have a look as to what size its smallest allocation unit is. If it's e.g. 4 times the size of the DB's block size that may slow things down quite significantly.
Looks like all disk I/O is going over one pipe. As for the number - not too shabby. Do you have a chance to run bonnie against both sda and sdb at a time when the DB is idle? Cheers, Tink |