Poor RAID-0 performance. mdadm, SATA, CentOS 5
We have been using a software RAID setup for years as a cheap way to get higher disk IO during some of our experiments. The old setup used two 140 GB SCSI disks attached to an Adaptec Ultra160 card. With this setup we achieved a constant write speed of about 40 MB/s.
We have a new 8-core Tyan (S5393) machine and are attempting to create a new software RAID-0 setup with three 1 TB Seagate (model ST31000340AS) drives. On paper, the Seagate drives have more cache and are capable of much higher sustained IO speeds. However, we have found this setup to be very unreliable. We would like to be able to write at a constant 120 MB/s, and sometimes this setup achieves this. But other times it seems to choke and hiccup, leaving us with 100 MB/s or less. Sometimes we can't even get a constant 40 MB/s. It's very inconsistent, unlike the old SCSI setup. There are no errors in the system log, and we are careful to monitor the CPUs and the 16 GB of RAM.
What could we be running into?
What's a good way to test my software RAID's disk IO? I'm using a tool called fio now.
Are there any tweaks I can make to the RAID setup to improve speed? We are basically writing one large multi-gig file to the array.
What kind of performance should be expected?
Any other tips would be greatly appreciated.
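For anyone who wants to reproduce the hiccups without fio, here is a minimal sketch in Python of the workload described above: one large file written sequentially, with throughput reported per window so that stalls show up as low figures. The function name, the chunk and window sizes, and the choice to fsync at the end of each window are all assumptions made for illustration; this is not a replacement for fio and it does not bypass the page cache between fsync calls.

```python
import os
import time

MIB = 1024 * 1024


def timed_sequential_write(path, total_bytes, chunk_bytes=MIB, window_bytes=64 * MIB):
    """Write total_bytes of zeros to path and return MB/s for each window.

    Each window ends with an fsync so the figure reflects data that has
    actually been pushed to the device, not just to the page cache.
    """
    chunk = b"\0" * chunk_bytes
    rates = []
    written = 0
    window_written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        window_start = time.perf_counter()
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            view = memoryview(chunk)[:n]
            # os.write may write fewer bytes than requested, so loop until done.
            while len(view) > 0:
                done = os.write(fd, view)
                view = view[done:]
            written += n
            window_written += n
            if window_written >= window_bytes or written == total_bytes:
                os.fsync(fd)
                elapsed = max(time.perf_counter() - window_start, 1e-9)
                rates.append(window_written / MIB / elapsed)
                window_written = 0
                window_start = time.perf_counter()
    finally:
        os.close(fd)
    return rates
```

Calling something like `timed_sequential_write("/mnt/raid/test.bin", 4 * 1024 * MIB)` (the mount point is hypothetical) returns one figure per 64 MiB window. A steady array should give a fairly flat list; an array that chokes should show windows far below the rest.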
UPDATE: Even with hardware RAID, performance still sucks. Lots of IO wait.
I would like to see some more input from others on this topic.
I too have seen a massive difference in performance while switching from CentOS 4 to CentOS 5.
I have tried changing the hardware RAID controller from an IBM ServeRAID 8k to a MegaRAID SAS, but this made no difference.
Given the very same work load, CentOS 4 seems to be much more consistent and efficient in disk writes.
We are testing by transferring 4.5GB of data across the network (Gigabit Ethernet) to the disk. I have also tested switching from onboard Broadcom NIC to an Intel DualPort PCIe Gigabit adapter - this made no difference even with the Intel DMA accelerator installed.
I run sar during the transfer to monitor bandwidth, the 1-minute load average, and the average IO wait percentage.
Tests were done on an IBM x3500 server: 2 x quad-core 3.0GHz Xeon CPUs, 32GB RAM, 8 x 300GB SAS HDs in a RAID 10 configuration.
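For readers without sar handy, a rough equivalent of its IO wait percentage can be computed from two readings of the aggregate `cpu` line in `/proc/stat`. The sketch below is in Python and assumes the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up for illustration, and the result is only an approximation of what sar reports.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before_text, after_text):
    """Percentage of CPU time spent in iowait between two /proc/stat readings."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    # Use at most the first 8 counters; later fields (guest time) are
    # already included in 'user' on kernels that report them.
    deltas = [a - b for a, b in zip(after[:8], before[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def sample_iowait(interval_seconds=1.0):
    """Read /proc/stat twice, interval_seconds apart (Linux only)."""
    with open("/proc/stat") as f:
        before = f.read()
    time.sleep(interval_seconds)
    with open("/proc/stat") as f:
        after = f.read()
    return iowait_percent(before, after)
```

Calling `sample_iowait(60)` repeatedly during a transfer gives per-minute figures that can be averaged across runs in the same way as the sar output.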
Test 1 - CentOS 4.6 i386 - kernel 2.6.9-67.0.22
Test 2 - CentOS 5.3 i386 - kernel 2.6.18-128.1.10.el5PAE
This may not seem like much at first, but we develop medical practice management and electronic medical records software. If I install a CentOS 5.3 loaded server at one of our big clients, I could have 150 people logged in using the system at once. This difference in load grows out of control when the system gets heavily loaded.
I have tested using a 30-day eval copy of RedHat 5.4 and get the same results. Running multiple tests, I find that while CentOS 4 is extremely consistent in its performance, CentOS 5 test results jump around like crazy. One run of the above test will give a 0.95 LoadAvg result and the next run will give a 2.26 for the very same test! I run multiple tests and average them for the above numbers.
Something low-level seems to be happening here, as changing the I/O subsystem completely makes no difference in performance.
I certainly agree with spankbot that IOwait times in CentOS5/RedHat5 are ridiculous. Anyone have some input on this??