[SOLVED] how to mirror file systems and keep inodes with filename links consistent?
dear Linux Gurus,
is there a way to mirror a given filesystem from one server to another and keep inode numbers and associated hard-linked filenames identical on both systems?
specifically, i started to use rsync with the --link-dest option, which works great backing up server A to server B using XFS throughout on slackware 13.37 and kernel 2.6.37.6. now, i also have a server C with a configuration identical to A and B. i would like to copy B to C, but i don't want to lose the daily incremental snapshot information created on B about A. currently i use plain rsync for copying only the freshest snapshot from B to C, because one cannot really rsync between partitions that are hard-linked all over the place.
we are talking here about three 24-disk 3ware RAID6 file systems, each some 50TB in size.
i was wondering if there is a reliable 'server-grade' method in the GNU world to copy one partition to another while keeping all inodes/linked filenames consistent? i am not sure whether this is technically feasible at all nowadays?
I do not think you can mirror inodes; if you need to do that, you need to save/duplicate the whole filesystem. Otherwise, rsync is able to copy hard links, but it will take a really long time (to detect all of them). This is the -H option.
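a minimal sketch of that rsync invocation, assuming hypothetical paths /data on server A and /backup/A on server B:
Code:
# -a  archive mode (permissions, ownership, times, symlinks)
# -H  preserve hard-link groups (rsync tracks every inode with >1 link)
# -x  do not cross filesystem boundaries
rsync -aHx --delete /data/ serverB:/backup/A/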
You need to clone the entire filesystem. Clonezilla and partclone can do that, as can the dd command. In all cases, the source filesystem needs to be unmounted (mounted read-only is OK). Using dd is simplest, but has the disadvantage of pointlessly copying the contents of the free space in the filesystem (not really an issue if the filesystem is, say, >90% full).
Regardless of which method you use, I recommend assigning a new UUID to the destination. UUID is, after all, intended to be unique. For ext2/3/4, you can do that with tune2fs. The destination partition needs to be at least as large as the source. If larger, you can resize the filesystem to fill the partition afterward.
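a rough sketch of the dd approach plus UUID regeneration, assuming hypothetical devices /dev/sdb1 (source) and /dev/sdc1 (destination); note that for XFS the UUID is changed with xfs_admin rather than tune2fs:
Code:
# source must be unmounted (or mounted read-only)
umount /dev/sdb1
dd if=/dev/sdb1 of=/dev/sdc1 bs=64M
# give the copy its own UUID
tune2fs -U random /dev/sdc1        # ext2/3/4
xfs_admin -U generate /dev/sdc1    # XFS (use whichever matches the filesystem)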
dear rknichols and pan64,
thank you for your replies - this gives us a whole new list of options and also stuff to look into. you are right, in essence we are dealing here with a partition-cloning exercise. the main issue will be that these hardware-RAID partitions are truly big (50TB), so we can only make use of an efficient mechanism for such a job:
- dd: will be too slow, i guess.
- rsync -H: may work.
- Clonezilla and partclone: new to me, will see.
in any case, we need something 'enterprise-grade' - i am sure that there are reliable command-line tools used in industry and research - for mirroring XFS partitions without losing inode numbers and associated hard links. any other suggestions?
Using "rsync -H" will maintain the groups of hard-linked files, but will absolutely not preserve inode numbers. There is simply no mechanism in the kernel for requesting a particular inode number when creating a file. Clonezilla and partclone support XFS, but for that filesystem you should also look at xfs_copy, xfsdump, and xfsrestore.
Using "rsync -H" will maintain the groups of hard-linked files, but will absolutely not preserve inode numbers. There is simply no mechanism in the kernel for requesting a particular inode number when creating a file. Clonezilla and partclone support XFS, but for that filesystem you should also look at xfs_copy, xfsdump, and xfsrestore.
it will not preserve inode numbers, because an inode number is assigned by the filesystem when a file is created. the number varies with where the file or directory ends up in the filesystem's on-disk structures.
inodes do not contain file names, only file metadata.
Unix directories are lists of association structures, each of which contains one filename and one inode number.
The file system driver must search a directory looking for a particular filename and then convert the filename to the correct corresponding inode number.
for more info, read about the debugfs command and about sectors and blocks in filesystems.
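a quick way to see the filename-to-inode mapping and the link count for yourself (hypothetical file names):
Code:
touch original
ln original alias                    # second name for the same inode
ls -i original alias                 # both names show the same inode number
stat -c '%i %h %n' original alias    # inode, link count, name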
thanks for the correction, i wrote that too sloppily before. yes, of course, we need to preserve the hard-link structure, while the underlying inode numbers are independent. i have used XFS for many years on servers and workstations, so it's time to get familiar with its tools, right? like xfs_copy, xfsdump, and xfsrestore - great suggestions, thanks!
'last' question to the crowd, promised: in view of the amount of data to back up (between 12-50TB, which is itself already an rsync backup made with the --link-dest option), what may be the most time-efficient way of doing so? we do backups here at least once a day. to give an idea of where we start: 30-45min backup time with plain rsync, 12h with the --link-dest option (though that may change with the new hardware and slackware v14). so, where will we arrive with this third-level backup? longer than 24h...? any clue, a feel?
What I think you're saying is that you have a large filesystem containing several generations of backups, each in its own directory tree and with identical files hard-linked among those trees. You now want to make a backup of that entire filesystem. Cloning the whole filesystem is going to be the most efficient way to do that, but note that there is no such thing as an "incremental cloning", so if in the future you want to update that cloned image you'll have to use a different method or else run the whole process over again.
I have no experience with XFS or its tools, but cloning will typically saturate the bandwidth of one of (a) the source drives (reading), (b) the target drives (writing), or (c) the communication link (if done over a network). If you just look at what is in the "Used" column when you run df on the source filesystem and calculate how much time would be required to transfer that much data plus perhaps 15% for overhead through the most limiting bandwidth, you'll be in the right ballpark.
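as a hypothetical worked example: with the ~11TB used on server A and a dedicated gigabit link (roughly 110 MB/s in practice), that estimate works out to about 32 hours; a 10GbE link would cut it to roughly 3 hours:
Code:
# 11 TB / 110 MB/s ~= 100000 s, plus ~15% overhead, converted to hours
echo "11 * 10^6 / 110 * 1.15 / 3600" | bc -l    # ~31.9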
There is something strange about those timing numbers you've given, though. There's just no way a "30-45min backup time with plain rsync" would inflate to 12h just because you added the "--link-dest" option. Perhaps that "30-45min" time was for an incremental update to an existing backup hierarchy while the "12h" time was for creating a new backup hierarchy with unchanged files hard-linked to the previous versions. Hard linking is used as a substitute for transferring the file contents, so it should always make the process faster, rather than slower.
i agree, something is strange with my timings, but i attributed it to the 'old' XFS+kernel versions i use with slack13.37 (xfs-1.1.1, xfsdump-3.0.4, kernel 2.6.37.6, 64bit). following some info i gathered on the web, i hoped this would improve with the new slack14 version (xfs-1.1.2 and kernel 3.2.13); apparently the XFS metadata overhead will profit from this upgrade.
here are some specs of the two triple systems 1 and 2, each with 3 servers A, B, C:
---------------------------
backup schedule (30 days incremental for A->B, simple mirroring for B->C, all invoked by server B; see the --link-dest sketch after this block):
server A: 143464 directories, 9339293 files, 11TB
server B: OLD backup of A->B: rsync -axv --delete --backup --backup-dir=.. : 30-60min, depending on load, usually 45min
server B: NEW backup of A->B: rsync -axv --delete --link-dest : approx 12h (!) fairly consistently
server C: backup of B->C: rsync -axv --delete B C : some 30min
---------------------------
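for concreteness, a minimal sketch of what such a --link-dest snapshot step could look like, with hypothetical snapshot directories on B named by date:
Code:
# each day's tree hard-links unchanged files against yesterday's snapshot
TODAY=$(date +%F)
YESTERDAY=$(date -d yesterday +%F)
rsync -axv --delete \
      --link-dest=/backup/A/$YESTERDAY \
      serverA:/data/ /backup/A/$TODAY/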
so, my hope is that filesystem 'cloning' B->C using one of the xfs_* tools plus newer hardware and OS will improve on those 12h.
on the other hand, i am afraid that hard-linking those gazillions of files (and we will have many more soon!) takes its toll in metadata overhead! A->B may carry its burden, but then B->C even more?! not to speak of deleting entire days from the incremental backup scheme?
For the cloning operation, the presence of hard links should make no difference at all. The only decision the program needs to make is whether a given block is part of the filesystem's free space. If it's free space, then it is skipped. Otherwise, it's just copied as a binary blob to the same location in the destination partition.
The place where hard links can make a significant difference is when using rsync with the "-H" (--hard-links) option when there are a lot of different inodes with multiple links (as opposed to a few inodes, each with many links). That option forces rsync to keep an internal database of the source inode number and first-encountered pathname for every file with more than one hard link so that it can see what subsequent file(s) are linked to that same inode. Frankly, I wouldn't expect that to be a lot of overhead until the database got too big to hold in memory, but the rsync manpage describes that option as "expensive." Perhaps I'm overlooking something.