
Orangutanklaus 01-16-2011 12:13 PM

[solved] Problems with Raid5 performance

I've built a new home server based on Ubuntu Desktop 10.10 64-bit. The server is powered by an AMD X2 450e with 2GB DDR3/ECC, a 60GB 2.5" IDE system hard disk drive, and three Samsung HD204UI 2TB hard disk drives for the RAID set (4k sectors; the firmware is already updated).

Curiously, I don't have the write performance problem one might expect; the issue is with read performance.

  • each drive got a partition (aligned - verified!)
  • file system is XFS / 128k chunks
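For readers who want to repeat the alignment check, here is a minimal sketch. It assumes the partition start is given in 512-byte logical sectors (as sysfs reports it) and that the drives use 4 KiB physical sectors; `is_4k_aligned` is a hypothetical helper name, and the sample values are illustrative, not taken from this system.

```shell
#!/bin/sh
# Hypothetical helper: report whether a partition start sector (counted in
# 512-byte units) falls on a 4 KiB physical sector boundary.
is_4k_aligned() {
    start_sector=$1
    # 4096 bytes / 512 bytes per logical sector = 8 logical sectors
    if [ $((start_sector % 8)) -eq 0 ]; then
        echo aligned
    else
        echo misaligned
    fi
}

# Illustrative values only: 63 is the classic DOS-era default start,
# 2048 is the common 1 MiB default of newer partitioning tools.
for s in 63 2048; do
    printf '%s -> %s\n' "$s" "$(is_4k_aligned "$s")"
done

# On a real system the start sector can be read from sysfs, for example:
#   is_4k_aligned "$(cat /sys/block/sdb/sdb1/start)"
```

This only checks the partition start. Whether the data area inside the md array is also aligned depends on the metadata layout.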


DEVICE /dev/sd[bcd]1

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts

# definitions of existing MD arrays
ARRAY /dev/md0 level=raid5 num-devices=3 metadata=0.90 UUID=40e71bb8:4407168d:3e33e59e:db24ae34


meta-data=/dev/md0               isize=256    agcount=32, agsize=30523616 blks
         =                       sectsz=4096  attr=2
data     =                       bsize=4096   blocks=976755712, imaxpct=5
         =                       sunit=32     swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=476931, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
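As a quick sanity check of the stripe geometry, the sunit and swidth values above can be converted from filesystem blocks to KiB and compared with the md chunk size. This is only an arithmetic sketch using the numbers shown in this post (bsize=4096, sunit=32, swidth=64, 128k chunk, three drives); the variable names are made up for illustration.

```shell
#!/bin/sh
# Values copied from the xfs_info output and the array description in this post.
bsize=4096        # filesystem block size in bytes
sunit_blks=32     # stripe unit in filesystem blocks
swidth_blks=64    # stripe width in filesystem blocks
chunk_kib=128     # md chunk size in KiB
disks=3           # member drives in the RAID5 set

sunit_kib=$((sunit_blks * bsize / 1024))
swidth_kib=$((swidth_blks * bsize / 1024))
data_disks=$((disks - 1))   # RAID5 spends one drive's worth of space on parity

echo "sunit  = ${sunit_kib} KiB (md chunk: ${chunk_kib} KiB)"
echo "swidth = ${swidth_kib} KiB (chunk x data disks: $((chunk_kib * data_disks)) KiB)"
```

Both pairs come out equal (128 KiB and 256 KiB), so the filesystem geometry matches the array layout.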

mount options:

/dev/md0 on /mnt/raid5 type xfs (rw,noatime)

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdd1[0] sdb1[2] sdc1[1]
      3907026688 blocks level 5, 128k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>

Write benchmark (the RAID volume already holds about 2TB of data):

dd if=/dev/zero of=/mnt/raid5/bonnie/4GB bs=1M count=4000
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 26.908 s, 156 MB/s

Read benchmark:

time cp /mnt/raid5/bonnie/4GB /dev/null

real    1m3.519s  => ~65MB/s
user    0m0.040s
sys    0m7.320s
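The MB/s figure next to the cp timing can be reproduced from the file size and the elapsed wall-clock time. A small sketch, assuming decimal megabytes (as dd reports them); `throughput_mbs` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: throughput in MB/s (1 MB = 1000000 bytes)
# from a byte count and an elapsed time in seconds.
throughput_mbs() {
    # LC_ALL=C keeps the decimal point independent of the system locale.
    LC_ALL=C awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

# 4194304000 bytes read in 1m3.519s (the cp run above)
throughput_mbs 4194304000 63.519   # prints 66.0
# The same file written in 26.908 s (the dd run above)
throughput_mbs 4194304000 26.908   # prints 155.9
```

Neither run flushed or dropped the page cache first, so the write figure in particular is probably optimistic.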

The benchmark results match what I see with SMB transfers to and from my Windows 7 desktop PC. When I write to the server I can easily saturate the GbE connection with an average throughput of ~117MB/s (although the RAID volume is already filled with ~2TB!). But when I transfer files from the server to my desktop I can't exceed ~80MB/s, even with files that are physically located on the outer sectors. The CPU is never close to full utilization during the transfers. I've already tried several mount options but I can't get better read results.

Any suggestion is welcome!


Nominal Animal 01-16-2011 01:21 PM

Very interesting, but writing zeros to XFS is not a benchmark. And you need to use conv=fdatasync for dd to sync the data to the disk; otherwise it may just be cached.

To check the real-life performance, change to a directory on the filesystem to be tested, then create a file of random data, say 4GB of it:

dd if=/dev/urandom of=4GB bs=1048576 count=4096
That will take some time. /dev/urandom is usually the bottleneck, so this is not a write speed test.

Then, check the read performance (in 256k chunks) via

sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
dd if=4GB of=/dev/null bs=262144

and copy performance (read + write on the same filesystem) via

sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
dd if=4GB of=4GB.copy bs=262144 conv=fdatasync

The sync and echo commands make sure your disk caches, including inode and dentry caches, are empty; this is a "cold cache" test, i.e. minimum performance.
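The cold-cache steps can be wrapped in a small function so the read test is easy to repeat. This is only a sketch: `cold_read` is a hypothetical name, the 262144-byte default mirrors the 256k block size used above, and the function merely warns (instead of failing) when it lacks permission to drop the caches.

```shell
#!/bin/sh
# Hypothetical wrapper: flush dirty data, drop the page, dentry and inode
# caches if permitted, then read the file once and let dd report the speed.
cold_read() {
    file=$1
    bs=${2:-262144}   # block size in bytes, 256k by default
    sync
    # Writing to drop_caches needs root; without it the read may be cached.
    if [ -w /proc/sys/vm/drop_caches ]; then
        echo 3 > /proc/sys/vm/drop_caches
    else
        echo "warning: cannot drop caches (not root?), result may be cached" >&2
    fi
    dd if="$file" of=/dev/null bs="$bs"
}

# Usage, as root, from the directory holding the test file:
#   cold_read 4GB
```

Run as an ordinary user, the function still performs the read, but the number it reports is then not a cold-cache figure.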

A good write performance test would require you to put the random data, say 1GB of it, onto a ramdisk:

sudo mkdir -m 0700 /tmp/ramdisk
sudo mount -t tmpfs -o rw,noexec,nosuid,nodev,size=1100M,mode=01777 none /tmp/ramdisk/
dd if=4GB of=/tmp/ramdisk/1GB bs=1048576 count=1024

Then run the write-only test (still from your normal working directory, not on the ramdisk!):

sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
dd if=/tmp/ramdisk/1GB of=1GB.copy bs=262144 conv=fdatasync

To tear down the ramdisk, run

sudo umount /tmp/ramdisk/
sudo rmdir /tmp/ramdisk

I'd be very interested to see the real-life speeds the dd commands report.
Nominal Animal

Nominal Animal 01-16-2011 01:50 PM

For comparison, here are my results on software RAID0 over two Samsung F1 1TB (HD103UJ) drives, on a 94% full ext4 filesystem, run just now using the exact commands above.

My leftover extents are all at the very end of the disks, so I'm getting the very worst performance possible out of any new files.
Other than normal desktop usage, the drives were idle. Kernel is vanilla + autogroup patches.

If you run the copy or write tests repeatedly, remember to remove the output file from the old run first.

Read performance using 256k chunks:

4294967296 bytes (4.3 GB) copied, 35.9747 s, 119 MB/s
When it was pristine and empty, I got 230 MB/s.

Copy performance using 256k chunks:

4294967296 bytes (4.3 GB) copied, 105.057 s, 40.9 MB/s
That's really bad.

Write performance using 256k chunks, reading from a ramdisk:

1073741824 bytes (1.1 GB) copied, 11.6993 s, 91.8 MB/s
About what I'd expect when compared to the read speed.

It is possible there is some kind of a RAID0 performance regression in the kernel; I'd have to run the tests on an older kernel to check.
Nominal Animal

Orangutanklaus 01-16-2011 02:14 PM

Thanks for your reply. I'll run the tests tomorrow and post the results.

Orangutanklaus 01-17-2011 02:18 PM

Here's the output for further analysis (just copied and pasted; I'm too tired for more today):

tpm@ubuntu-amd64:~$ ls /proc/sys/vm/drop_caches
tpm@ubuntu-amd64:~$ cat /proc/sys/vm/drop_caches
tpm@ubuntu-amd64:~$ sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
tpm@ubuntu-amd64:~$ sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
tpm@ubuntu-amd64:~$ sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
tpm@ubuntu-amd64:~$ dd if=/mnt/raid5/bonnie/4GB of=/dev/null bs=262144
16384+0 records in
16384+0 records out
4294967296 bytes (4.3 GB) copied, 54.5493 s, 78.7 MB/s
tpm@ubuntu-amd64:~$ sudo mkdir -m 0700 /tmp/ramdisk
tpm@ubuntu-amd64:~$ sudo mount -t tmpfs -o rw,noexec,nosuid,nodev,size=1100M,mode=01777 none /tmp/ramdisk/
tpm@ubuntu-amd64:~$ dd if=/mnt/raid5/bonnie/4GB of=/tmp/ramdisk/1GB bs=1048576 count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 15.2155 s, 70.6 MB/s
tpm@ubuntu-amd64:~$ sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
tpm@ubuntu-amd64:~$ sudo sh -c 'sync ; echo 3 > /proc/sys/vm/drop_caches'
tpm@ubuntu-amd64:~$ dd if=/tmp/ramdisk/1GB of=1GB.copy bs=262144 conv=fdatasync
4096+0 records in
4096+0 records out
1073741824 bytes (1.1 GB) copied, 35.6317 s, 30.1 MB/s

Hm... after all that, I still don't know the reason for the poor read performance.

Orangutanklaus 01-19-2011 07:38 AM

In addition: when I run a read speed test through the disk manager that comes with GNOME, I get an average read speed of 200MB/s!

Orangutanklaus 01-29-2011 06:34 AM

The problem still exists and I can't figure out the reason. I would be grateful for any suggestion.

PS: I forgot to mention that I've switched to ext4, with nearly the same results.

Orangutanklaus 03-13-2011 04:10 AM

It seems that the mdadm version that comes with Ubuntu 10.10 has a negative impact on the partition alignment. After I compiled version 3.1.4, I got the results below:

Info: option -b, 20GB Volume @ Raid5 @ 3x2TB HDDs @ 1TB Offset


Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ubuntu-amd64  7416M  1502  97 112349 18 42032  14  4099  96 127247 21 152.2   8
Latency                9843us    1146ms   90032us   35988us   57843us    2967ms
Version  1.96       ------Sequential Create------ --------Random Create--------
ubuntu-amd64        -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16    75   2 +++++ +++    74   2    69   2 +++++ +++    65   1
Latency                 177ms     247us    1490ms    1796ms      40us    2202ms
