LinuxQuestions.org


thebeerbaron 10-25-2011 03:26 PM

XFS stripe width on top of LVM over hardware RAID
 
I have two hardware RAID 6 arrays concatenated via LVM. I have a single XFS file system on top of this. The RAID 6 arrays are 11 and 12 disks wide.

I'd like to provide XFS with a stripe width (sw) and stripe unit (su) at mount time for enhanced performance.

I know my stripe unit size (64k), but what stripe width do I provide? My best guess is 19 (9 data spindles from array0, 10 data spindles from array1), but that's just a guess.
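
For concreteness, this is the sort of thing I have in mind, using my guessed sw=19 (the device path and mount point are just placeholders; the sunit/swidth mount options express the same geometry in 512-byte sectors):

# at mkfs time (not an option for an existing filesystem):
mkfs.xfs -d su=64k,sw=19 /dev/vg/lv
# at mount time: sunit = 64k / 512 = 128 sectors, swidth = 128 * 19 = 2432 sectors
mount -o sunit=128,swidth=2432 /dev/vg/lv /mnt/data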

Anyone have experience with this? The OS is CentOS 5. Thanks in advance...

_bsd 10-27-2011 05:33 AM

I'm not sure what you mean by "concatenated" via LVM, but I presume you mean enclosed in a single Volume Group.
Each RAID 6 array is a single Physical Volume as far as the VG is concerned.
Did you set a specific physical extent size when creating the VG?
Did you set the number of stripes when you created the LV? If so, how many? If not, then each RAID unit now appears to the VG as a single linear device; the stripes inside each RAID set are hidden from LVM.
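
If you don't remember, the existing layout can be checked (the VG name below is a placeholder for yours):

vgdisplay vg | grep 'PE Size'                         # physical extent size the VG was created with
lvs -o lv_name,stripes,stripe_size,seg_pe_ranges vg   # stripe count per LV segment and where it sits on the PVs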

Regardless, if you've created a VG and LV, the result is a single logical device, and the default mkfs.xfs settings should suffice.

Optimizing the performance of RAID and LVM is very detailed and specific to both the usage and the underlying hardware,
e.g. database indexes on 15,000 RPM SAS drives.

What kind of RAID: motherboard/OS-based or a dedicated controller?
What parameters were used when creating the RAID, the Volume Group and the Logical Volume?
What is the intended use of the volume?

The short answer, according to the XFS FAQ, is to let mkfs.xfs figure things out, and only optimize when performance is (or becomes) an issue.
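
If you want to see what it chose, xfs_info will show the geometry the filesystem is actually using (the mount point is a placeholder):

xfs_info /mnt/data
# look for sunit= and swidth= in the data section; they are reported in
# filesystem blocks here, and 0 means no stripe geometry was set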

I had to duplicate my RAID so that I could rebuild it differently, then use the original as a backup.
YMMV

thebeerbaron 10-27-2011 10:57 AM

Quote:

Originally Posted by _bsd (Post 4509659)
I'm not sure what you mean by "concatenated" via LVM, but I presume you mean enclosed in a single Volume Group.

Concatenated (linear), as opposed to LVM striping.

Quote:

Did you set a specific physical extent size when creating the VG?
No.

Quote:

Did you set the number of stripes when you created the LV? If so, how many?
No, I did not use LVM striping.


Quote:

If not, then each RAID unit is now logically one strip. The underlying strips within the RAID are hidden by the VG.
I agree with you to this point. My question has more to do with the effect that by specifying a stripe unit and stripe width when mounting an XFS volume, it's more likely that writes will be spread across multiple spindles instead of concentrated on fewer (or possibly one) spindle. If I had a single PV on top of a single RAID device, I would specify the stripe size and width that my array uses. XFS would ostensibly write to LVM with the correct stride size for the underlying device and the data would be distributed optimally across multiple spindles.

But this is more complicated: I have two PVs and they do not have the same stripe width. To optimize write distribution, I need to know whether LVM will be writing to one PV until it fills up, whether it chooses a PV to write to at random (on a per-write basis?), or whether it distributes data across PVs (even though they are not striped). In the first case, I can test with the two different stripe widths and figure out which PV I am writing to -right now-. Then when that PV fills up and LVM starts writing to the second PV, I can switch to the appropriate stripe width. In the second case, I will never know the optimal stripe width. In the third case, the stripe width might be related to the total width of the two RAID devices, or it might not.
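
One thing I can at least do is look at where the LV's extents currently sit, e.g. (the VG and PV names are placeholders for mine):

lvs -o lv_name,seg_start_pe,seg_size_pe,devices vg
# or, per physical volume:
pvdisplay -m /dev/sda /dev/sdb
# with a plain linear LV I'd expect the segments to fill one PV's extents before
# moving on to the next, but that's exactly what I'm trying to confirm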

Does this make sense?

The specific section of the XFS FAQ that I'm screwing around with is here.

_bsd 10-27-2011 02:41 PM

Quote:

Originally Posted by thebeerbaron (Post 4509903)
I agree with you to this point. My question has more to do with the effect that by specifying a stripe unit and stripe width when mounting an XFS volume, it's more likely that writes will be spread across multiple spindles instead of concentrated on fewer (or possibly one) spindle. If I had a single PV on top of a single RAID device, I would specify the stripe size and width that my array uses. XFS would ostensibly write to LVM with the correct stride size for the underlying device and the data would be distributed optimally across multiple spindles.

But this is more complicated: I have two PVs and they do not have the same stripe width. To optimize write distribution, I need to know whether LVM will be writing to one PV until it fills up, whether it chooses a PV to write to at random (on a per-write basis?), or whether it distributes data across PVs (even though they are not striped). In the first case, I can test with the two different stripe widths and figure out which PV I am writing to -right now-. Then when that PV fills up and LVM starts writing to the second PV, I can switch to the appropriate stripe width. In the second case, I will never know the optimal stripe width. In the third case, the stripe width might be related to the total width of the two RAID devices, or it might not.

Does this make sense?

The specific section of the XFS FAQ that I'm screwing around with is here.

Since you've created two PVs, when you created your VG you presumably did something like the following:
pvcreate /dev/sda              # label each RAID array as an LVM physical volume
pvcreate /dev/sdb
vgcreate vg /dev/sda /dev/sdb  # put both PVs into one volume group
lvcreate -n lv -l 100%FREE vg  # linear (non-striped) LV; lvcreate needs a size (-l or -L), 100%FREE here is just illustrative

You can really only have two stripes, one on each PV, but as you say, you created a non-striped LV.
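
For comparison, a two-way striped LV across both PVs would have been created with something like this (the LV name and size are just examples):

lvcreate -n lv_striped -i 2 -I 64 -L 500G vg   # -i 2: stripe across both PVs, -I 64: 64 KiB LVM stripe size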

I don't believe those XFS parameters will make any difference (not speaking with authority; I have not contributed to XFS code).
Read the LVM HOWTO, sections 8.1 through 8.4 (Hardware RAID), together with the XFS FAQ section you mentioned.

Could you post a copy of the LVM metadata from /etc/lvm/backup? (You can trim out the unnecessary bits if you want to.)
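
(If the copy under /etc/lvm/backup looks stale, a fresh one can be written out with vgcfgbackup; the VG name is a placeholder:)

vgcfgbackup -f /tmp/vg-metadata.txt vg   # dumps the current metadata for the named VG to a text file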

