1. I am copying a huge directory with lots of files. How do I see the overall progress? I added the --progress flag, but now it shows the progress of each individual file.
2. Is rsync slower than cp? I used cp to copy over an NFS mount and that took about an hour. I am now using rsync to copy over the same NFS mount, and so far it has taken 10 minutes to copy 7G. I have over 150G of stuff.
And now you are tailing the output of what is being copied. If you want to see the overall progress as a change in the size of the destination, you can run:
Code:
~ # watch -n2 "du -sh /destination/directory/"
in another terminal to watch the size changes (every two seconds in the above example.)
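To turn that size check into a rough percentage, the destination size can be compared against the source size. This is only a sketch: it assumes GNU du (for --apparent-size), the helper name copy_percent is made up for illustration, and the two paths are placeholders.

```shell
# copy_percent SRC DST: print how much of SRC's size is already present in DST,
# as an integer percentage. Hypothetical helper; assumes GNU du.
copy_percent() {
    # -s: one total per argument, -k: 1 KiB units,
    # --apparent-size: count file sizes rather than allocated disk blocks
    src_kb=$(du -sk --apparent-size "$1" | cut -f1)
    dst_kb=$(du -sk --apparent-size "$2" | cut -f1)
    # Guard against division by zero for an empty source.
    if [ "$src_kb" -eq 0 ]; then
        echo 100
    else
        echo $(( dst_kb * 100 / src_kb ))
    fi
}

# Example call (placeholder paths):
# copy_percent /source/directory/ /destination/directory/
```

The figure is an estimate only: temporary files that rsync writes during a transfer are counted too, and anything already in the destination before the copy started inflates the result.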
And, yes, rsync is slower than cp, but has a LOT more functionality and safety built in. Generally, if you are starting with a blank destination, it's better to use cp, but if you are trying to get two filesystems in sync, it's better to use rsync. But, removing the "--progress" from your command should give you an immediate performance increase on the rsync copy times.
HTH. Let us know.
Last edited by ShadowCat8; 04-15-2015 at 08:20 PM.
Reason: Added helpful suggestion to OPs command.
rsync has some features which make it suited to repetitive backups in a way in which cp is not. One of those features is that, with the right switches, it does not overwrite an existing file in the target directory if 1)that file exists and 2) is unchanged in the target directory. In other words, in subsequent usages, it backs up only new or altered files.
Consequently, the first time rsync is used, it may take quite a while, but subsequent usages with the same source and target can be much, much quicker.
I'm by no means an rsync expert--I figured out the formulation that worked for me (rsync -rav) and have used it since. The Arch wiki, though, is an excellent reference.
1. I am copying a huge dir with lots of files[;] how do I see the overall progress? I put in the progress flag but now it [shows] the progress [of] each individual [file].
That's normal. There is no way to tell how much data, nor how much time an rsync will take. Given that, what would you like a progress bar to say?
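For what it's worth, rsync 3.1.0 and later can report progress for the transfer as a whole through --info=progress2. A minimal sketch, assuming such a version is installed; the wrapper name and both paths are placeholders, not anything from this thread.

```shell
# Hypothetical wrapper: copy SRC to DST with a single, transfer-wide progress line.
# Assumes rsync >= 3.1.0, where --info=progress2 is available.
copy_with_total_progress() {
    # -a: archive mode (recursive; preserves permissions, times, owners, symlinks)
    # --info=progress2: report progress for the whole transfer, not per file
    # --no-inc-recursive: build the full file list first so the total is known up front
    rsync -a --info=progress2 --no-inc-recursive "$1" "$2"
}

# Example call (placeholder paths):
# copy_with_total_progress /source/directory/ /destination/directory/
```

Scanning the full file list first costs extra time and memory on very large trees, so this trades a slower start for a percentage that is based on the full file list.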
Quote:
Originally Posted by jzoudavy
2. Is rsync slower than cp? I used CP to copy over [an] NFS mount, and that took about an hour. I am using rsync to copy over the same mount over the same NFS and so far it has taken 10 minutes to copy 7G. I [have] over 150G of stuff.
So, we should expect a little over 200 minutes (150G / 7G x 10 minutes is about 214), by your observations so far. That's if it's linear.
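The estimate is plain proportional scaling: total size times elapsed time, divided by the amount copied so far. With the figures from the post it comes to a little over 200 minutes. A quick sketch of the arithmetic, using a hypothetical helper name:

```shell
# eta_minutes TOTAL_GB DONE_GB ELAPSED_MIN: projected total run time in whole minutes,
# assuming the transfer rate stays constant (it often does not).
eta_minutes() {
    echo $(( $1 * $3 / $2 ))
}

eta_minutes 150 7 10   # prints 214
```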
Yes, rsync is slower at being cp than cp. cp, though, is not very good at being a network-efficient, restartable synchronizer for data, so use the best tool for the job.
You will find rsync needs a lot of time at the start of a synchronization run as it calculates the work. cp doesn't care what's there already, and will appear faster because it does less work. Copy files over an unreliable link, though, and the benefit of rsync will become obvious quickly.
Do you know what switch allows for not overwriting? Is it -u?
Quote:
Originally Posted by frankbell
rsync has some features which make it suited to repetitive backups in a way in which cp is not. One of those features is that, with the right switches, it does not overwrite an existing file in the target directory if 1)that file exists and 2) is unchanged in the target directory. In other words, in subsequent usages, it backs up only new or altered files.
Consequently, the first time rsync is used, it may take quite a while, but subsequent usages with the same source and target can be much, much quicker.
I'm by no means an rsync expert--I figured out the formulation that worked for me (rsync -rav) and have used it since. The Arch wiki, though, is an excellent reference.
It's been a while since I did my research, but, if I recall correctly, it's "-a" for "archive." "-r" means "recursive" (include subdirectories) and "-v" means "verbose."
From the man page:
Code:
-a, --archive archive mode; equals -rlptgoD (no -H,-A,-X)
If I recall incorrectly, I'm sure someone will correct me.
As an aside, rsync is a very versatile command, which means it and its permutations can also be quite complex. A web search for "rsync examples" can turn up many useful links that will be much more helpful than the man page.
Please note that -r and -a are redundant because -a includes -r. You can see that in the documentation frankbell referenced where -a basically means -rlptgoD. It doesn't hurt to include it. I'm only mentioning it's not necessary when using -a.
Code:
rsync -av src dst
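Is my go-to command as well, like frankbell mentioned. That mode of operation already copies over only changes: a file is skipped if it already exists at the destination with the same size and modification time (rsync's default quick check; -c/--checksum compares checksums instead).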
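One way to see this for yourself is to run the same sync twice and have rsync itemize what it actually transfers. A minimal sketch, assuming throwaway source and destination directories; the helper name changed_files is hypothetical.

```shell
# changed_files SRC DST: sync SRC into DST and print one line per file rsync transferred.
changed_files() {
    # -a: archive mode, -i: itemize changes.
    # Itemized lines beginning with ">f" are regular files sent to the destination.
    rsync -ai "$1" "$2" | sed -n '/^>f/p'
}

# Example calls (placeholder paths): the first run lists every file,
# an immediate second run should list nothing.
# changed_files src/ dst/
# changed_files src/ dst/
```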
When I started looking into rsync, I found a collection of the most confusing tech webpages I have ever seen. I finally found something that worked and stuck with it. I'll try refining it in the light of your advice next time I back up.
The man page wasn't much help, because it lacks examples; lack of examples is the Achilles heel of the man format.
By default (without -p or -a), cp also has the nasty problem of messing with timestamps, messing with ownerships, etc.
It also has the nasty habit of not handling symbolic links for files cleanly, and will turn a symbolic link into a full file copy.
Rsync handles all these things beautifully.
About the only thing that I can think of that it doesn't handle well is device files; the only thing that I know of that handles these correctly is cpio.
Unknown to me is how well rsync handles hard links. You could end up with multiple separate files instead of one file with one inode and multiple file names pointing to that inode, as per normal. I've not tested or observed this closely.
In what way does it not handle device files well for you? Perhaps you're missing an option you could add to rsync. Regarding hard links:
Quote:
Originally Posted by rsync man page
Code:
Note that -a does not preserve hardlinks, because finding multiply-linked files is expensive. You must separately specify -H.
...
-H, --hard-links
This tells rsync to look for hard-linked files in the transfer and link together the corresponding files on the receiving side. Without this option, hard-linked files in the transfer are treated as though they were separate files.
When you are updating a non-empty destination, this option only ensures that files that are hard-linked together on the source are hard-linked together on the destination. It does NOT currently endeavor to break already existing hard links on the destination that do not exist between the source files. Note, however, that if one or more extra-linked files have content changes, they will become unlinked when updated (assuming you are not using the --inplace option).
Note that rsync can only detect hard links between files that are inside the transfer set. If rsync updates a file that has extra hard-link connections to files outside the transfer, that linkage will be broken. If you are tempted to use the --inplace option to avoid this breakage, be very careful that you know how your files are being updated so that you are certain that no unintended changes happen due to lingering hard links (and see the --inplace option for more caveats).
If incremental recursion is active (see --recursive), rsync may transfer a missing hard-linked file before it finds that another link for that contents exists elsewhere in the hierarchy. This does not affect the accuracy of the transfer, just its efficiency. One way to avoid this is to disable incremental recursion using the --no-inc-recursive option.
...
Also, regarding the -a option (aka -rlptgoD): you can read up on each option individually to learn what -a does, e.g. -r, -l, -p, -t, -g, -o, -D. There are options for handling special devices and the like (-D is shorthand for --devices --specials).
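The hard-link behaviour described in the man page excerpt is easy to check on a scratch directory. A minimal sketch; both helper names are hypothetical, and the -ef test is a common shell extension (bash, dash, BusyBox) rather than strict POSIX.

```shell
# sync_keep_hardlinks SRC DST: archive-mode copy that also preserves hard links (-H).
sync_keep_hardlinks() {
    rsync -aH "$1" "$2"
}

# same_inode A B: succeeds when both names refer to the same file (same device and inode).
same_inode() {
    [ "$1" -ef "$2" ]
}

# Example (placeholder paths): after "ln src/a src/b",
#   sync_keep_hardlinks src/ dst/ && same_inode dst/a dst/b   # succeeds
# whereas a plain "rsync -a src/ dst/" leaves dst/a and dst/b as two separate files.
```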
Do you know what switch allows for not overwriting? Is it -u?
Greetings,
Yes, what you need is to tell rsync what to do with the files it is about to replace, using --backup and --suffix=<SUFFIX>.
Quote:
Originally Posted by rsync man page
...
-b, --backup - make backups (see --suffix & --backup-dir)
--backup-dir=DIR - make backups into hierarchy based in DIR
--suffix=SUFFIX - backup suffix (default ~ w/o --backup-dir)
...
So, if you want to get 'dest' in sync with 'src' and back up all the files that will be replaced, with today's date as a suffix, you would do something like this:
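A sketch under assumptions: src/ and dest/ are placeholder directories, the YYYY-MM-DD suffix format is my own choice, and the wrapper name is hypothetical.

```shell
# sync_with_dated_backups SRC DST: bring DST in sync with SRC; any file that would be
# overwritten is first renamed with today's date appended (e.g. notes.txt.2015-04-16).
sync_with_dated_backups() {
    suffix=".$(date +%Y-%m-%d)"
    # -a: archive mode, -v: verbose,
    # -b/--backup: keep the old copy, --suffix: what to append to its name
    rsync -av --backup --suffix="$suffix" "$1" "$2"
}

# Example call (placeholder paths):
# sync_with_dated_backups src/ dest/
```

If the goal is instead to never touch files that already exist at the destination, --ignore-existing is the relevant option, while -u/--update only skips files that are newer on the destination.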
2. Is rsync slower than cp? I used CP to copy over a NFS mount and that took about an hour. I am using rsync to copy over the same mount over the same NFS and so far it has taken 10 minutes to copy 7G. I got over 150G of stuff.
I just used rsync -avh --progress ...
Thanks
Davy
Hi Davy,
When you say copying the "same amount over," do you mean the *exact* same files? If it's not the exact same files, then the time can't be used as a good benchmark between the two utilities (cp and rsync), because transfer times vary depending on file size and number of files.
Also, according to your post, rsync is copying at about 12MB/s (7GB in 10 minutes) versus about 42MB/s for cp (150GB in 60 minutes). It's likely the difference is due to the number of files and the average file size in the dataset you're comparing.
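Those rates follow from dividing size by elapsed time. A quick sketch of the arithmetic, treating 1 GB as 1024 MB and using a hypothetical helper name:

```shell
# mb_per_sec GIGABYTES SECONDS: average throughput in MB/s, treating 1 GB as 1024 MB.
mb_per_sec() {
    awk -v gb="$1" -v s="$2" 'BEGIN { printf "%.1f\n", gb * 1024 / s }'
}

mb_per_sec 7 600      # rsync so far (7 GB in 10 minutes): prints 11.9
mb_per_sec 150 3600   # cp run (150 GB in 60 minutes):     prints 42.7
```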
In general, I prefer rsync because it verifies each transferred file against a whole-file checksum.