LinuxQuestions.org

exscape 08-18-2007 09:12 AM

Bad RAID5 performance - horrible when reading over the network
 
Hey everybody.
I've been running a RAID5 setup on my home server for almost a year now, and it's been working fine, albeit not that fast. The numbers I get when running local benchmarks seem decent (probably a bit lower than they should be, though), but networked performance is truly horrible at times!

I've narrowed the problem down to my RAID array: if I copy from a non-RAIDed partition, the transfer rate is OK (7-8MB/sec over a 100Mbps network), but from the RAID array I average 2-4MB/sec, with the rate jumping between 0 and 7MB/sec.
The poor performance is not due to the network (I've run iperf), and as stated above, performance is better when not copying from the array.

I've used three protocols when transferring over the network: Samba, AFP (Apple Filing Protocol) and FTP. They all share the same problem, so I doubt they're at fault. OK, down to the dirty details. The computer:

Athlon XP 1700+ on a VIA KT133A motherboard (previously a Duron 700 on a KT133 MB, same problem there)
512MB SDRAM
2x Western Digital WD2000JB 200GB drives (7200rpm, 8MB cache, on the motherboard controller)
1x Seagate Barracuda 7200.7 250GB SATA drive (7200rpm, 8MB cache, on a separate PCI controller card)

mdadm -D:
Quote:

/dev/md0:
Version : 00.90.03
Creation Time : Fri Nov 10 10:50:36 2006
Raid Level : raid5
Array Size : 390716672 (372.62 GiB 400.09 GB)
Used Dev Size : 195358336 (186.31 GiB 200.05 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Sat Aug 18 15:40:56 2007
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 128K

UUID : 4b7ea7ec:489680a4:4a1ed173:f23870ae
Events : 0.167238

Number   Major   Minor   RaidDevice   State
   0       8       4         0        active sync   /dev/sda4
   1       3       1         1        active sync   /dev/hda1
   2      22       1         2        active sync   /dev/hdc1

50GB from the 250GB drive is used for / and such.

Misc crap:
Quote:

exscape ~ # dmesg | grep -i ide
...
VP_IDE: IDE controller at PCI slot 0000:00:07.1
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci0000:00:07.1
ide0: BM-DMA at 0xc000-0xc007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xc008-0xc00f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
ide1 at 0x170-0x177,0x376 on irq 15

exscape ~ # dmesg | grep -i raid
raid6: int32x1 504 MB/s
raid6: int32x2 613 MB/s
raid6: int32x4 585 MB/s
raid6: int32x8 384 MB/s
raid6: mmxx1 1203 MB/s
raid6: mmxx2 1916 MB/s
raid6: sse1x1 1108 MB/s
raid6: sse1x2 1578 MB/s
raid6: using algorithm sse1x2 (1578 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid5: automatically using best checksumming function: pIII_sse
raid5: using function: pIII_sse (3366.000 MB/sec)
md: Autodetecting RAID arrays.
raid5: device sda4 operational as raid disk 0
raid5: device hdc1 operational as raid disk 2
raid5: device hda1 operational as raid disk 1
raid5: allocated 3162kB for md0
raid5: raid level 5 set md0 active with 3 out of 3 devices, algorithm 2
RAID5 conf printout:

exscape ~ # lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:07.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1a)
00:07.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1a)
00:07.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
00:08.0 RAID bus controller: Silicon Image, Inc. SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)
00:09.0 Ethernet controller: Davicom Semiconductor, Inc. 21x4x DEC-Tulip compatible 10/100 Ethernet (rev 31)
00:0a.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
00:0c.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX/MX 400] (rev a1)

Some numbers and such (taken from another computer, moving stuff from the server):
Copying from the array (1.5-8% CPU usage)
http://static.pici.se/pictures/sLrKXVBCH.png

Copying from outside the array (~15% CPU usage)
http://static.pici.se/pictures/uZGbClUbs.png

Quote:

Write performance
$ time dd if=/dev/zero of=/Array/test.tmp bs=1024k count=4000
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 160.479 s, 26.1 MB/s

real 2m40.548s
user 0m0.036s
sys 0m40.731s

Read performance
$ time dd if=/Array/test.tmp of=/dev/null bs=1024k count=4000
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 86.2431 s, 48.6 MB/s

real 1m26.264s
user 0m0.052s
sys 0m30.890s

--- Some reference numbers, my home folder is NOT on the RAID array, but on /dev/sda2

Write performance
$ time dd if=/dev/zero of=~/test.tmp bs=1024k count=4000
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 111.169 s, 37.7 MB/s

real 1m51.217s
user 0m0.016s
sys 0m35.586s

Read performance
$ time dd if=~/test.tmp of=/dev/null bs=1024k count=4000
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 75.5147 s, 55.5 MB/s

real 1m15.548s
user 0m0.040s
sys 0m29.190s

As you can see, both read performance (48.6MB/sec RAID vs 55.5MB/sec non-RAID) and write performance (26.1MB/sec RAID vs 37.7MB/sec non-RAID) are better outside the RAID array.
However, the main problem is the slow network performance (see images above).

Any help is much appreciated!
BTW, I'd rather not recreate the array or do anything else destructive; I don't have enough space to back everything up at the moment.
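
For anyone checking the arithmetic: the rates dd prints above are decimal megabytes per second, i.e. bytes copied divided by elapsed seconds divided by 1,000,000. A minimal sketch that recomputes them (mb_per_s is a name made up for illustration, not part of any tool):

```shell
# mb_per_s BYTES SECONDS
# Hypothetical helper: prints throughput in decimal MB/s (1 MB = 1,000,000 bytes)
# with one decimal place, the same unit GNU dd uses in its summary line.
mb_per_s() {
    LC_ALL=C awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}
```

Fed the byte count and times from the four runs above (for example `mb_per_s 4194304000 160.479`), it gives 26.1 and 48.6 for the array and 37.7 and 55.5 for /dev/sda2, matching what dd reported.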

macemoneta 08-18-2007 09:32 AM

What I/O scheduler are you using?

cat /sys/block/hda/queue/scheduler
cat /sys/block/sda/queue/scheduler
cat /sys/block/hdc/queue/scheduler

If you are not using deadline, what happens when you switch it?

echo "deadline" > /sys/block/hda/queue/scheduler
echo "deadline" > /sys/block/sda/queue/scheduler
echo "deadline" > /sys/block/hdc/queue/scheduler
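
The cat/echo pairs above can be wrapped in two small helpers so all three array members are switched together. This is only a sketch: active_sched, set_sched and the SYSFS variable are names invented here, and it assumes the scheduler file marks the active scheduler with square brackets.

```shell
# Root of sysfs. Overridable only so the helpers can be tried against a copy;
# this variable is an assumption of the sketch, not a kernel convention.
SYSFS=${SYSFS:-/sys}

# active_sched DISK
# Prints the scheduler shown in [brackets] in the disk's scheduler file.
active_sched() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$SYSFS/block/$1/queue/scheduler"
}

# set_sched NAME DISK...
# Writes NAME to the scheduler file of every listed disk (needs root on a real system).
set_sched() {
    name=$1
    shift
    for disk in "$@"; do
        echo "$name" > "$SYSFS/block/$disk/queue/scheduler"
    done
}
```

Typical use, as root: `set_sched deadline hda sda hdc`, then `active_sched hda` to confirm. The setting is not kept across reboots.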

exscape 08-18-2007 09:46 AM

I take it the change is immediate? (cat:ing again shows that it changed)
Using the same parameters to dd as above, I get 25.1MB/sec write and 46.1MB/sec read, so slightly worse. FYI I was using the anticipatory scheduler before.

Thanks for the reply :)

Edit: Using CFQ, I get 24.8MB/s write and 50.5MB/sec read.

syg00 08-18-2007 10:46 AM

I have severe reservations about software RAID - if you want performance, go hardware RAID.
Regardless, there have to be conflicts between what the IO schedulers are trying to do by sorting (and "optimizing" - FSVO optimizing) the IO before handing off to the VFS driver, and what the RAID is actually doing by splitting up the IO again.

I'd be inclined to try NOOP as the IO scheduler - might compromise your non-RAID IO performance, but that is a cost you'd have to prioritize.
Might be worth trying as a test.

exscape 08-18-2007 11:09 AM

NOOP gave me better read performance (55.8MB/sec) but the same write (26.1MB/sec).

Still, this is not the main problem. 55MB/sec is more than enough for my purposes; the real problem is that it can't even keep up with a 100Mbps networking card when reading from the array!

macemoneta 08-18-2007 12:30 PM

What was the performance over the network when using the deadline scheduler (not local)? That's where you are having an issue, right?

exscape 08-18-2007 12:57 PM

Using deadline:
5.8MB/sec from the array (three big files)
7.32MB/sec from /dev/sda2 (on many smaller files, MP3s)
The difference doesn't look that big, perhaps, but the graphs are way different (see my first post): when moving from the single disk there appears to be a steady stream of data, as opposed to the array, which locks up and stops transmitting every now and then. I'm looking to get a network upgrade, in which case 6MB/sec would be way too low.

BTW, both were measured using size/time (800MB and 1GB) so they should be accurate, average numbers.
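
Since the symptom is a rate that swings between 0 and full speed rather than a steady low rate, a per-second log of the interface counter on the server can show how long each stall lasts. A rough sketch, assuming the usual /proc/net/dev layout where transmitted bytes is the ninth counter after the interface name. tx_bytes and watch_tx are hypothetical helper names.

```shell
# tx_bytes IFACE [FILE]
# Prints the cumulative transmitted-bytes counter for IFACE.
# FILE defaults to /proc/net/dev; another path is accepted only for testing.
tx_bytes() {
    awk -v ifc="$1" '
        { sub(/:/, " ") }          # handles both "eth1:123" and "eth1: 123"
        $1 == ifc { print $10 }    # $2..$9 are receive counters, $10 is TX bytes
    ' "${2:-/proc/net/dev}"
}

# watch_tx IFACE [SAMPLES]
# Prints one decimal MB/s figure per second; stalls show up as runs of 0.00.
watch_tx() {
    n=${2:-30}
    prev=$(tx_bytes "$1")
    while [ "$n" -gt 0 ]; do
        sleep 1
        cur=$(tx_bytes "$1")
        LC_ALL=C awk -v a="$prev" -v b="$cur" \
            'BEGIN { printf "%.2f MB/s\n", (b - a) / 1000000 }'
        prev=$cur
        n=$((n - 1))
    done
}
```

Running something like `watch_tx eth1 60` on the server during a copy from the array, and again during a copy from /dev/sda2, would give two logs to compare. The interface name is an assumption; use whichever NIC faces the LAN.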

ajg 08-18-2007 12:59 PM

I smell a hardware rat with this one ... your RAID and non-RAID numbers look ok given that your system has additional workload doing RAID5 checksumming and stuff - you can expect some loss in performance. What's concerning me is not your average throughput from the RAID over the network, but the lack of consistency. You say it's jumping from 0MB/s to about 7MB/s. You've already said that 7MB/s is the sustained transfer rate from the non-RAID volume, and that looks awfully similar to the maximum rate you are getting from the RAID (and is a reasonable speed for fast ethernet).

I'm wondering if there's an intermittent problem with one of the members of the raidset, or possibly some kind of bus conflict going on between the drive controllers and the NIC - either IRQ sharing or some DMA problem. It could be worth moving your cards around in their slots and see if the problem goes away, or the symptoms change.

All this is of course assuming that your NIC, VIA ATA controller and Silicon Image SATA controller aren't all on-board!
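
One way to test the "intermittent problem with one of the members" idea is to read each member partition on its own and compare the results. A minimal sketch, assuming GNU dd and the member names from the mdadm -D output (/dev/sda4, /dev/hda1, /dev/hdc1). read_member is a made-up helper name. It only reads, so nothing on the array is modified, though it will compete with other I/O while it runs.

```shell
# read_member DEVICE [MEGABYTES]
# Hypothetical helper: sequentially reads from DEVICE into /dev/null and prints
# the last line of dd's report (the throughput summary with GNU dd).
# Read-only: nothing is written to DEVICE.
read_member() {
    echo "== $1"
    dd if="$1" of=/dev/null bs=1M count="${2:-1000}" 2>&1 | tail -n 1
}
```

Usage, as root: `for m in /dev/sda4 /dev/hda1 /dev/hdc1; do read_member "$m"; done`. A member that is much slower than the others, or whose rate varies a lot between runs, would be the one to look at first.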

exscape 08-18-2007 02:03 PM

I tried moving the cards around a bit (the LAN-side NIC and the SI SATA controller), no effect unfortunately.
I also tried changing a bunch of PCI/IDE related options (PCI Dynamic Bursting, PCI Delay Transaction, PCI #2 Access #1 Retry and IDE Prefetch Mode), all from disabled to enabled. No effects changing PnP OS from No to Yes either, I was hoping that might change some IRQs around.

exscape 08-18-2007 02:58 PM

Like a girlish band once sang... What's going on!?!
This makes ZERO sense to me!
OK. I tried using the other NICs for my LAN, that is, moving the LAN cable to eth0 (from eth1) and setting it up. Hmm. SSH seems slow... So I tried a file transfer. I got 587kbps!! Yes, that's right, five hundred eighty-seven kilobits per second. What the heck, that NIC downloads at ~800kB/sec (that's kilobytes) from the internet daily!
OK, so I tried the OTHER NIC (eth2), started a file transfer... WHOA, 18MB/sec! Over a 100Mbps NIC, this can't be! And it wasn't, either. I did the usual time-a-file and the average I got was... 3.2MB/sec. nload on my Linux box AND on my laptop were both reporting the ultra-high speeds!

I'm going to keep looking for a cheap replacement box, while doubting my sanity...
Help is still appreciated though, I'm not giving up that easy!
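
The units in the post above are easy to mix up, so here is the conversion spelled out (8 bits per byte, decimal prefixes). The helper names are invented for this sketch.

```shell
# bits_to_bytes X: converts a rate in (k/M)bit/s to the same prefix in bytes/s.
bits_to_bytes() {
    LC_ALL=C awk -v x="$1" 'BEGIN { printf "%.1f\n", x / 8 }'
}

# bytes_to_bits X: converts a rate in (k/M)bytes/s to the same prefix in bit/s.
bytes_to_bits() {
    LC_ALL=C awk -v x="$1" 'BEGIN { printf "%.0f\n", x * 8 }'
}
```

`bits_to_bytes 587` gives 73.4, so 587kbps is roughly 73kB/sec, about a tenth of the ~800kB/sec that NIC manages from the internet. `bytes_to_bits 18` gives 144, i.e. 18MB/sec would need 144Mbps, more than a 100Mbps link can carry, which fits the timed average of 3.2MB/sec being the real figure. `bits_to_bytes 100` gives 12.5, the theoretical ceiling in MB/sec for Fast Ethernet before protocol overhead.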

ajg 08-18-2007 02:58 PM

Code:

cat /proc/interrupts
will confirm that nothing's sharing IRQs and should completely eliminate anything in that area.

I've never used the 3c905B with Linux, but it's a decent branded card and I wouldn't expect it to cause problems. The Silicon Image card is my favourite SATA controller. All my own servers and a good percentage of my customers' Linux servers use them, and I've not had any issues with it.

What I've never done is mix ATA and SATA drives in a single RAID set. Now that config issues are eliminated, I would suspect there's something going on there, although diagnosing it with live drives is not going to be easy. It appears that something is getting deadlocked somewhere but then releasing, which could indicate some kind of buffering issue. Are you running SMART monitoring? It works with ATA drives, but not SATA drives - could be something there. Reading through this thread, I'm somewhat clutching at straws.

ajg 08-18-2007 03:06 PM

That is very weird. Could be like some kind of PCI bus mastering thing.

What I would do, is make sure the SATA controller and the primary NIC that you need the performance through are in PCI bus mastering slots - if there are 5 slots on the board, there may only be a couple that support this.

You seem to be heading down the right path. Are all the NICs the same 3c905B? If so, you could try a different brand of NIC, and see if this has any effect.

exscape 08-18-2007 04:05 PM

Hmm. eth0 and eth1 are sharing IRQ 5 as it stands right now.
Code:

$ cat /proc/interrupts
          CPU0     
  0:    145673    XT-PIC-XT        timer
  1:          8    XT-PIC-XT        i8042
  2:          0    XT-PIC-XT        cascade
  3:          0    XT-PIC-XT        acpi
  5:    106261    XT-PIC-XT        eth0, eth1
  7:          0    XT-PIC-XT        uhci_hcd:usb1, uhci_hcd:usb2
  8:          2    XT-PIC-XT        rtc
 10:      3947    XT-PIC-XT        sata_sil
 11:    110956    XT-PIC-XT        eth2
 12:          4    XT-PIC-XT        i8042
 14:        47    XT-PIC-XT        ide0
 15:        46    XT-PIC-XT        ide1
NMI:          0
ERR:          0
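
Spotting shared lines by eye gets harder on longer listings. A small awk sketch that prints only the IRQs with more than one handler, assuming the /proc/interrupts layout shown above, where handlers on a shared line are separated by commas. shared_irqs is a hypothetical name.

```shell
# shared_irqs [FILE]
# Lists IRQ lines that carry more than one handler.
# FILE defaults to /proc/interrupts; another path is accepted only for testing.
shared_irqs() {
    awk '
        $1 ~ /^[0-9]+:$/ {
            # The handler list starts at the first field that ends in a comma.
            first = 0
            for (i = 2; i <= NF; i++) if ($i ~ /,$/) { first = i; break }
            if (!first) next          # single handler, nothing shared
            irq = $1
            sub(/:$/, "", irq)
            list = $first
            for (i = first + 1; i <= NF; i++) list = list " " $i
            print "IRQ " irq " shared by: " list
        }
    ' "${1:-/proc/interrupts}"
}
```

Run against the listing above, it reports IRQ 5 (eth0, eth1) and IRQ 7 (the two USB controllers).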

Two of the three NICs are 3c905b's. Currently, the non-3c905b is the one used for the LAN, but the weird results (one uber-slow, one seemingly uber-fast) were both from the different 905 NICs! Very odd.

In any case, I mentioned looking for a computer to a friend who offered me an irresistible deal, so I'll simply get that and hope that the problems are gone (the HDDs will be the only thing the computers have in common). Don't get me wrong though, this thread isn't the only reason why I'm upgrading, as I said I'm not giving up this easily. :)

Also... Yes, I am in fact running smartmontools (using the -d ata option for the SATA drive, works great). I'm pretty certain the performance was this bad waay before I even installed that, but not 100%.

Edit: According to MSI "Five 32-bit Master PCI bus slots."
I'm guessing that means bus master.

ajg 08-19-2007 03:38 AM

Quote:

Originally Posted by exscape (Post 2863238)
Two of the three NICs are 3c905b's. Currently, the non-3c905b is the one used for the LAN, but the weird results (one uber-slow, one seemingly uber-fast) were both from the different 905 NICs! Very odd.

Weirdness - it does look like there is some kind of deadlock condition happening between the NICs and one of the ATA controllers (I think we can discount the SATA controller as you've tested that separately with no problems). Could be something hogging the PCI bus, could be a buffering issue.

Quote:

Originally Posted by exscape (Post 2863238)
In any case, I mentioned looking for a computer to a friend who offered me an irresistible deal, so I'll simply get that and hope that the problems are gone (the HDDs will be the only thing the computers have in common). Don't get me wrong though, this thread isn't the only reason why I'm upgrading, as I said I'm not giving up this easily. :)

Yes - it's annoying when you can't quite figure what's happening. Never give up without a fight, but there comes a time where for the sake of your sanity, you just have to accept the weirdness, back up the data, flatten the box and start over. The new system will allow you to figure it out without the risk of losing your data.

Quote:

Originally Posted by exscape (Post 2863238)
Also... Yes, I am in fact running smartmontools (using the -d ata option for the SATA drive, works great). I'm pretty certain the performance was this bad waay before I even installed that, but not 100%.

You could try temporarily disabling it on startup, just to eliminate it as a suspect.

Quote:

Originally Posted by exscape (Post 2863238)
Edit: According to MSI "Five 32-bit Master PCI bus slots."
I'm guessing that means bus master.

Some of those old VIA KT133 boards were great! Really feature packed for the money! :cool:


All times are GMT -5.