LinuxQuestions.org
Old 08-19-2010, 08:11 PM   #1
Daemo
LQ Newbie
 
Registered: Feb 2009
Distribution: Fedora 12
Posts: 7

Rep: Reputation: 0
Why is JFS faster than a raw device for 1.5TB disk?


I have a 4 * 1.5TB RAID5 disk array (software linux RAID, formatted with jfs) on my Fedora 12 system and want to expand it by adding another 1.5TB disk.

I have added a drive to the system and conducted a simple performance check on it to make sure it was functioning properly:

Code:
# dd if=/tmp/bigfile.dat of=/dev/sdg1
5478774+1 records in
5478774+1 records out
2805132609 bytes (2.8 GB) copied, 168.77 s, 16.6 MB/s
But 16.6 MB/s is lousy. I ran an iostat -dmx 2 on this drive at the time of this lousy performance, and typical output was:
Code:
Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               1.50     0.00  258.50    0.00    16.25     0.00   128.74     0.17    0.66   0.36   9.30
sdg               0.00  3556.00 4160.00   32.00    16.25    16.00    15.76   135.73   35.11   0.24 100.05
(note that sda and sdb are a linux raid mirror set for the / filesystem that holds /tmp). I formatted the new drive (/dev/sdg1) with jfs and mounted it under /mnt2:
Code:
# jfs_mkfs /dev/sdg1
jfs_mkfs version 1.1.13, 17-Jul-2008
Warning!  All data on device /dev/sdg1 will be lost!

Continue? (Y/N) y
   \

Format completed successfully.

1465138552 kilobytes total disk space.
# mount /dev/sdg1 /mnt2
and ran a similar test, this time to the filesystem:
Code:
# dd if=/tmp/bigfile.dat  of=/mnt2/bigfile.dat
5478774+1 records in
5478774+1 records out
2805132609 bytes (2.8 GB) copied, 25.6558 s, 109 MB/s
109 MB/s is awesome. An iostat -dmx 2 typically looked like this during this better performance:
Code:
Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               9.00     0.00 1777.00    0.00   111.62     0.00   128.65     1.43    0.78   0.39  69.75
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdg               0.00   128.50    3.50  230.50     0.01   109.26   956.36    96.04  394.79   4.27 100.00
My question is this: if I add this new disk to the existing 4-disk RAID5 array, will it perform badly (around the 16.6 MB/s mark) or will it perform better (closer to the 109 MB/s mark)?

I would like to know what the performance will be like before I add the disk to the array because I don't want to wait for the whole array to be rebuilt before finding out my array is performing badly. The array is used as part of a mythtv system and has up to 6 simultaneous recordings running on it, so it needs to perform well.

I'm confused!

Thanks in advance for any help!

-Daemo
 
Old 08-19-2010, 08:44 PM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,140

Rep: Reputation: 4123
Never believe numbers from a rerun unless you can absolutely eliminate cache effects.
echo "3" > /proc/sys/vm/drop_caches
Will generally give you some (better) idea of the numbers. Personally I reboot before every run.
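
As a rough sketch of that routine: drop_caches only releases clean pages, so a sync beforehand helps make sure dirty data is written out first. The function name flush_caches and the DROP_CACHES_FILE override are made up for illustration (the override only exists so the function can be tried without root); the real target is the /proc entry above.

```shell
#!/bin/sh
# Flush dirty pages to disk, then ask the kernel to drop the page cache,
# dentries and inodes so the next benchmark run starts cold.
# DROP_CACHES_FILE is a hypothetical override for dry runs; by default
# the real /proc entry is written, which requires root.
flush_caches() {
    sync
    echo 3 > "${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}"
}

# Usage (as root), before each timed run:
#   flush_caches && dd if=/tmp/bigfile.dat of=/mnt2/bigfile.dat
```

This does not empty any cache inside the drive itself, so a reboot or a larger test file is still the more thorough option.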
 
Old 08-19-2010, 09:05 PM   #3
Daemo
LQ Newbie
 
Registered: Feb 2009
Distribution: Fedora 12
Posts: 7

Original Poster
Rep: Reputation: 0
Re-runs

Thanks syg00

I re-ran the tests after flushing the caches each time as you recommended. This time I got 17 MB/s to the raw device partition, and 42 MB/s to the filesystem. Less severe but still raises the question.
 
Old 08-19-2010, 10:30 PM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,140

Rep: Reputation: 4123
Give the first run a blocksize - 4096 would be a reasonable start. Increase as appropriate.

Update: that would be "obs" - you want to see the effect on the target disk. The filesystem will handle things appropriately on the input.
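
Something along these lines would step through output block sizes. It is only a sketch: sweep_obs is a made-up helper name, and the file and device names are the ones from the first post. Without any block size dd uses 512-byte blocks, which matches the 5478774+1 records reported for the 2.8 GB file.

```shell
#!/bin/sh
# Write SRC to DST once per output block size and print dd's summary line.
# WARNING: when DST is a raw partition such as /dev/sdg1, its contents are
# overwritten. sweep_obs is a hypothetical helper name, not a standard tool.
sweep_obs() {
    src=$1
    dst=$2
    shift 2
    for obs in "$@"; do
        # For real measurements, flush the page cache here first:
        #   sync; echo 3 > /proc/sys/vm/drop_caches
        printf 'obs=%s: ' "$obs"
        # dd prints its statistics on stderr; the last line is the summary.
        dd if="$src" of="$dst" obs="$obs" 2>&1 | tail -n 1
    done
}

# Usage (as root, destructive for the target):
#   sweep_obs /tmp/bigfile.dat /dev/sdg1 512 1024 2048 4096 8192 16384 32768
```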

Last edited by syg00; 08-19-2010 at 11:31 PM.
 
Old 08-20-2010, 12:25 AM   #5
Daemo
LQ Newbie
 
Registered: Feb 2009
Distribution: Fedora 12
Posts: 7

Original Poster
Rep: Reputation: 0
Thanks very much syg00

Okay, now I have some consistency. After using dd with a 4096 byte blocksize, both the /dev/sdg1 raw partition and cooked jfs filesystem were similar: about 92-96 MB/s. I ran both of these tests several times, clearing the cache between tests.

Here's how it went:
bs=512... 16 MB/s
bs=1024... 17 MB/s
bs=2048... 19 MB/s
bs=4096... 94 MB/s
bs=8192... 99 MB/s
bs=16384... 98 MB/s
bs=32768... 99 MB/s

Note the jump at 4K blocks. My theory is that, because this disk has native 4K sectors (and the partition starts at sector 64), writes of this block size or multiples of it line up with whole physical sectors.

Am I correct in saying: the raid array uses a 32K chunk size, so the md driver should read and write in 32K chunks and therefore this disk should perform optimally?
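
The arithmetic behind that theory is easy to check. A minimal sketch, assuming that "sector 64" is counted in 512-byte logical sectors (the usual fdisk convention, not confirmed in this thread) and that the physical sector size is 4096 bytes; is_4k_multiple is a made-up helper name.

```shell
#!/bin/sh
# Return success when the byte count in $1 is a whole multiple of the
# 4096-byte physical sector size described in the post.
is_4k_multiple() {
    [ $(( $1 % 4096 )) -eq 0 ]
}

# Partition start: sector 64, assumed to be 512-byte logical sectors.
start_bytes=$(( 64 * 512 ))
# md chunk size: 32K.
chunk_bytes=$(( 32 * 1024 ))

for n in "$start_bytes" "$chunk_bytes" 512 2048 4096; do
    if is_4k_multiple "$n"; then
        echo "$n bytes: whole 4K sectors"
    else
        echo "$n bytes: partial 4K sector"
    fi
done
```

Under those assumptions the partition start and the 32K chunk both come out as whole multiples of 4K, while 512, 1024 and 2048 byte writes each cover only part of a physical sector, which fits the slow results at the small block sizes.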
 
Old 08-20-2010, 01:37 AM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,140

Rep: Reputation: 4123
Optimally? I would never go that far - too much software in between. The aims of the (various) authors are unlikely to correspond directly to what an end-user might want.

I would expect 32k is a reasonable read size - strace should confirm what the filesystem is trying to do at least. Getting that close to the block device layers (md and the real device layer below VFS) is probably way too difficult for any buy-back you might get.
 
Old 08-21-2010, 09:16 PM   #7
Daemo
LQ Newbie
 
Registered: Feb 2009
Distribution: Fedora 12
Posts: 7

Original Poster
Rep: Reputation: 0
Completed!

Update: I have added the disk to the array, and the reshape rate was around 20 MB/s, which is about right (each chunk remap is 4 reads and 5 writes). The reshape started at about midnight Friday morning and completed sometime this morning (Sunday). While mythtv was recording, the reshape rate went down to about 1 - 1.5 MB/s, as is usual.

After the array was reshaped and I expanded the jfs filesystem, I was getting about 90-99 MB/s write rates from my clear cache/dd test, which is exactly what I need.
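
For reference, the usual command sequence for this kind of expansion looks roughly like the sketch below. The thread does not show the exact commands that were run, so treat this as an outline only: /dev/md0 and /mnt/array are placeholder names, and print_grow_plan is a made-up helper that prints the commands rather than running them.

```shell
#!/bin/sh
# Print (not run) a typical command sequence for growing an md RAID5 array
# by one member and then enlarging a mounted JFS filesystem on it.
# The array device, member count and mount point are assumptions; the
# thread names only the new partition, /dev/sdg1.
print_grow_plan() {
    md=$1
    new=$2
    members=$3
    mnt=$4
    # 1. Add the new partition to the array (it joins as a spare).
    echo "mdadm --add $md $new"
    # 2. Reshape the array across the larger number of member devices.
    echo "mdadm --grow $md --raid-devices=$members"
    # 3. Watch reshape progress and speed.
    echo "cat /proc/mdstat"
    # 4. Once the reshape is done, grow JFS online to fill the array.
    echo "mount -o remount,resize $mnt"
}

print_grow_plan /dev/md0 /dev/sdg1 5 /mnt/array
```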

Thanks syg00!
 
  


Tags
disk, jfs, performance, raw


