LinuxQuestions.org
Old 07-22-2012, 03:03 PM   #1
pisti
Member
 
Registered: Jul 2006
Location: Montréal, Canada
Distribution: Slackware
Posts: 246

Rep: Reputation: 29
How to mirror file systems and keep inodes with filename links consistent?


Dear Linux gurus,

Is there a way to mirror a given filesystem from one server to another and keep the inode numbers and the associated hard-linked filenames identical on both systems?

Specifically, I started using rsync with the --link-dest option, which works great for backing up server A to server B, using XFS throughout on Slackware 13.37 with kernel 2.6.37.6. Now I also have a server C configured identically to A and B. I would like to copy B to C, but I don't want to lose the daily incremental snapshot information about A that is created on B. Currently I use plain rsync to copy only the freshest snapshot from B to C, because one cannot really rsync between partitions that are hard-linked all over the place.

We are talking about three 24-disk 3ware RAID6 file systems, each some 50 TB large.

I was wondering whether there is a 'server-grade', reliable method on the GNU market to copy one partition to another while keeping all inode/linked filenames consistent. I am not sure whether this is technically feasible at all nowadays.

Thanks much in advance, bye, pisti
 
Old 07-23-2012, 12:45 PM   #2
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 20,241

Rep: Reputation: 6836
I do not think you can mirror inode numbers; to do that you would need to save/duplicate the whole filesystem. Otherwise, rsync is able to copy hard links, but it will take a really long time (to detect all of them). That is the -H option.
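
For reference, a minimal sketch of the rsync route pan64 describes, with hypothetical host and path names (hard links within the copied tree are preserved, but the destination filesystem still assigns its own inode numbers):

Code:
# hypothetical source path and destination host
# -a  archive mode (recursion, permissions, times, ownership, symlinks)
# -H  preserve hard links between files inside the transferred tree
rsync -aH --delete /backup/ serverC:/backup/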
 
Old 07-24-2012, 08:07 AM   #3
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 4,716

Rep: Reputation: 2192
You need to clone the entire filesystem. Clonezilla and partclone can do that, as can the dd command. In all cases, the source filesystem needs to be unmounted (mounted read-only is OK). Using dd is simplest, but has the disadvantage of pointlessly copying the contents of the free space in the filesystem (not really an issue if the filesystem is, say, >90% full).

Regardless of which method you use, I recommend assigning a new UUID to the destination. UUID is, after all, intended to be unique. For ext2/3/4, you can do that with tune2fs. The destination partition needs to be at least as large as the source. If larger, you can resize the filesystem to fill the partition afterward.
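
A minimal sketch of the dd route, with hypothetical device names, assuming the source is unmounted or mounted read-only; since the original poster is on XFS, the UUID change would be done with xfs_admin rather than tune2fs:

Code:
# hypothetical devices; destination partition must be at least as large as the source
dd if=/dev/sdb1 of=/dev/sdc1 bs=16M
# give the copy its own UUID (tune2fs -U random would be the ext2/3/4 equivalent)
xfs_admin -U generate /dev/sdc1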
 
Old 07-24-2012, 08:49 AM   #4
pisti
Member
 
Registered: Jul 2006
Location: Montréal, Canada
Distribution: Slackware
Posts: 246

Original Poster
Rep: Reputation: 29
How to mirror file systems and keep inodes with filename links consistent?

Dear rknichols and pan64,

Thank you for your replies; this gives us a whole new list of options and things to look into. You are right, in essence we are dealing with a partition-cloning exercise here. The main issue will be that these hardware-RAID partitions are truly big (50 TB), so we can only use an efficient mechanism for such a job:

- dd: will be too slow, I guess.
- rsync -H: may work.
- Clonezilla and partclone: new to me, will see.

In any case, we need something 'enterprise-grade'. I am sure there are reliable command-line tools used in industry and research for mirroring XFS partitions without losing inode numbers and associated hard links. Any other suggestions?

Thanks again, bye, pisti
 
Old 07-24-2012, 05:51 PM   #5
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 4,716

Rep: Reputation: 2192
Using "rsync -H" will maintain the groups of hard-linked files, but will absolutely not preserve inode numbers. There is simply no mechanism in the kernel for requesting a particular inode number when creating a file. Clonezilla and partclone support XFS, but for that filesystem you should also look at xfs_copy, xfsdump, and xfsrestore.
 
Old 07-24-2012, 06:00 PM   #6
sharadchhetri
Member
 
Registered: Aug 2008
Location: INDIA
Distribution: Redhat,Debian,Suse,Windows
Posts: 179

Rep: Reputation: 23
Quote:
Originally Posted by rknichols
Using "rsync -H" will maintain the groups of hard-linked files, but will absolutely not preserve inode numbers. There is simply no mechanism in the kernel for requesting a particular inode number when creating a file. Clonezilla and partclone support XFS, but for that filesystem you should also look at xfs_copy, xfsdump, and xfsrestore.
It will not preserve inode numbers, because an inode number is not a property of the file's data or permissions; it is assigned by the filesystem when the file is created and depends on where the new inode lands on the target filesystem. So inode numbers will differ between the two copies.


inodes do not contain file names, only file metadata.
Unix directories are lists of association structures, each of which contains one filename and one inode number.
The file system driver must search a directory looking for a particular filename and then convert the filename to the correct corresponding inode number.


For more info, read about the debugfs command and about sectors and blocks in a filesystem:

http://en.wikipedia.org/wiki/Inode
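
For example, a quick way to look at inode numbers and hard-link counts from the shell (hypothetical path):

Code:
# first column of ls -li is the inode number; the column after the permissions is the link count
ls -li /backup/somefile
# the same with GNU stat
stat -c 'inode=%i links=%h name=%n' /backup/somefile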

Last edited by sharadchhetri; 07-24-2012 at 06:02 PM.
 
Old 07-25-2012, 09:18 AM   #7
pisti
Member
 
Registered: Jul 2006
Location: Montréal, Canada
Distribution: Slackware
Posts: 246

Original Poster
Rep: Reputation: 29
Thanks for the correction; I wrote that too sloppily before. Yes, of course, we need to preserve the hard-link structure, while the underlying inode numbers are independent. I have been using XFS for many years on servers and workstations, so it's time to get familiar with its tools, right? Like xfs_copy, xfsdump, and xfsrestore - great suggestions, thanks!

'Last' question to the crowd, promised: in view of the amount of data to back up (between 12 and 50 TB, which by itself is already an rsync backup done with the --link-dest option), what may be the most time-efficient way of doing so? We do backups here at least once a day. To give an idea of where we start from: 30-45 min backup time with plain rsync, 12 h with the --link-dest option (though that may change with the new hardware and Slackware 14). So where will we end up with this third-level backup? Longer than 24 h...? Any clue, a feel?

Thanks again, bye, pisti
 
Old 07-25-2012, 02:16 PM   #8
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 4,716

Rep: Reputation: 2192
What I think you're saying is that you have a large filesystem containing several generations of backups, each in its own directory tree and with identical files hard-linked among those trees. You now want to make a backup of that entire filesystem. Cloning the whole filesystem is going to be the most efficient way to do that, but note that there is no such thing as an "incremental cloning", so if in the future you want to update that cloned image you'll have to use a different method or else run the whole process over again.

I have no experience with XFS or its tools, but cloning will typically saturate the bandwidth of one of (a) the source drives (reading), (b) the target drives (writing), or (c) the communication link (if done over a network). If you just look at what is in the "Used" column when you run df on the source filesystem and calculate how much time would be required to transfer that much data plus perhaps 15% for overhead through the most limiting bandwidth, you'll be in the right ballpark.
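
As a rough worked example of that estimate (assumed figures for illustration only: ~11 TB of used data, a single gigabit link at ~110 MB/s effective, plus 15% overhead):

Code:
# back-of-the-envelope transfer time in hours
echo $(( 11 * 1000**4 / (110 * 1000**2) * 115 / 100 / 3600 )) hours   # about 31 hours

A 10 GbE link, or a local RAID-to-RAID copy running at several hundred MB/s, would cut that roughly tenfold.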

There is something strange about those timing numbers you've given, though. There's just no way a "30-45min backup time with plain rsync" would inflate to 12h just because you added the "--link-dest" option. Perhaps that "30-45min" time was for an incremental update to an existing backup hierarchy while the "12h" time was for creating a new backup hierarchy with unchanged files hard-linked to the previous versions. Hard linking is used as a substitute for transferring the file contents, so it should always make the process faster, rather than slower.
 
Old 07-25-2012, 04:58 PM   #9
pisti
Member
 
Registered: Jul 2006
Location: Montréal, Canada
Distribution: Slackware
Posts: 246

Original Poster
Rep: Reputation: 29
How to mirror file systems and keep inodes with filename links consistent?

I agree, something is strange with my timings, but I attributed it to the 'old' XFS + kernel versions I use with Slackware 13.37 (xfs-1.1.1, xfsdump-3.0.4, kernel 2.6.37.6, 64-bit). Based on some info I gathered on the web, I hoped this would improve with the new Slackware 14 versions (xfs-1.1.2 and kernel 3.2.13); apparently the XFS metadata overhead will profit from that upgrade.


Here are some specs of the two triple systems, 1 and 2, each with three servers A, B, C:

---------------------------
system 1 : XFS RAID5 , 3ware 9550sx-12 , 12x WD1002FBYS , Supermicro X6DHE-XG2 , RAM 4GB , slack13.37
system 2 : XFS RAID6 , 3ware 9750-24i4e , 24x WD2003FYYS , Supermicro X8DAH+-F, RAM 48GB , slack14 pending

---------------------------
backup schedule (30 days incremental for A->B, simple mirroring for B->C, all invoked by server B; a fuller sketch of these jobs follows below):

server A : 143464 directories , 9339293 files , 11TB

server B : OLD backup of A->B : rsync -axv --delete --backup --backup-dir=.. : 30-60min, depending on load, usually 45min
server B : NEW backup of A->B : rsync -axv --delete --link-dest : approx 12h (!) fairly consistently

server C : backup of B->C : rsync -axv --delete B C : some 30min
---------------------------
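
Spelled out a bit further (hypothetical paths and dates, purely for illustration; the real paths are elided in the schedule above):

Code:
# OLD A->B: mirror, with changed/deleted files moved into a per-day backup directory
rsync -axv --delete --backup --backup-dir=/backup/A/changed-2012-07-24 \
      serverA:/data/ /backup/A/current/

# NEW A->B: full dated snapshot; unchanged files are hard-linked to the previous day
rsync -axv --delete --link-dest=/backup/A/2012-07-23 \
      serverA:/data/ /backup/A/2012-07-24/

# B->C: plain mirror of the freshest snapshot only
rsync -axv --delete /backup/A/2012-07-24/ serverC:/backup/A/2012-07-24/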


So my hope is that filesystem 'cloning' B->C with one of the xfs* tools, plus newer hardware and OS, will improve on those 12 h.

On the other hand, I am afraid that hard-linking those gazillions of files (and we will have many more soon!) takes its toll in metadata overhead. A->B may bear that burden, but then B->C even more?! Not to speak of deleting entire days from the incremental backup scheme.

bye, pisti
 
Old 07-25-2012, 06:04 PM   #10
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 4,716

Rep: Reputation: 2192
For the cloning operation, the presence of hard links should make no difference at all. The only decision the program needs to make is whether a given block is part of the filesystem's free space. If it's free space, then it is skipped. Otherwise, it's just copied as a binary blob to the same location in the destination partition.
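
For completeness, a sketch of the partclone approach mentioned earlier, with hypothetical device and image names (partclone.xfs is the XFS-aware variant; like the other cloning tools it copies only allocated blocks and expects the source to be unmounted):

Code:
# clone an unmounted XFS partition to an image file
partclone.xfs -c -s /dev/sdb1 -o /backup/sdb1.img
# restore that image onto another partition of at least the same size
partclone.xfs -r -s /backup/sdb1.img -o /dev/sdc1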

The place where hard links can make a significant difference is when using rsync with the "-H" (--hard-links) option when there are a lot of different inodes with multiple links (as opposed to a few inodes, each with many links). That option forces rsync to keep an internal database of the source inode number and first-encountered pathname for every file with more than one hard link so that it can see what subsequent file(s) are linked to that same inode. Frankly, I wouldn't expect that to be a lot of overhead until the database got too big to hold in memory, but the rsync manpage describes that option as "expensive." Perhaps I'm overlooking something.
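
If you want to gauge that overhead up front, one way (a sketch, with a hypothetical backup path) is to count how many hard-linked pathnames and distinct multiply-linked inodes rsync -H would have to track:

Code:
# pathnames that have more than one hard link
find /backup -xdev -type f -links +1 | wc -l
# distinct inodes behind those pathnames (GNU find)
find /backup -xdev -type f -links +1 -printf '%i\n' | sort -u | wc -l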
 
  

