[SOLVED] Problem with tar, ssh, and a directory with 1.5 million entries
I have been attempting to use this command to copy files from one machine to another over ssh:
ssh root@192.168.1.69 "cd /data1/docimg && tar -c -N'10-22-2010' *" | tar -xv
When I run it against this directory, I get an error after about 90 seconds:
tar: This does not look like a tar archive
tar: Error exit delayed from previous errors
I then tried the command without the "| tar -xv" so I could see what was coming across. After a few seconds it simply returned without displaying any data at all.
I then ran the original command again but changed the directory from "docimg" to "reordered", which has only about 50 entries, and this successfully copied all of the files that matched the newer-than restriction.
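One way to see what is actually arriving is to save the stream to a file and ask tar to list it, rather than piping it straight into "tar -xv". The following is only a local sketch: the sender function is a hypothetical stand-in for the ssh side, and it deliberately emits a text message instead of archive data to show how the receiving tar reacts.

```shell
#!/bin/sh
# Local sketch: inspect a saved stream instead of piping it into tar -x.
work=$(mktemp -d)
cd "$work" || exit 1

# Hypothetical stand-in for the ssh side: it writes a text message to stdout
# where an archive was expected.
sender() { echo "this is not archive data"; }

# Keep stdout (the would-be archive) and stderr (any error text) apart.
sender > stream.bin 2> sender.err
echo "bytes received: $(wc -c < stream.bin | tr -d ' ')"

# Listing the saved stream fails with a nonzero status if it is not an archive.
if tar -tf stream.bin > listing.txt 2> tar.err; then
    status=0
else
    status=$?
fi
echo "tar -t exit status: $status"
cat tar.err
```

With the real command, the same idea would be to redirect the ssh output to a file and its stderr to a second file, then check both before extracting anything.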
The "docimg" directory has about 1.5 million entries, so I am now wondering whether there is a maximum number of files in a directory that tar can work with. Searching for +"tar" +"maximum files" seems to return every page on the internet that mentions a tar file along with some reference to "maximum files", and I didn't find anything useful. I know that "cp" and other commands have a limit on how many files they can process, but they usually return some sort of error, whereas tar is not reporting anything.
If there is a file limit to using tar, is there some way to get around it or some other way to copy files that have been modified on or after a certain date from one server to another?
I have found the ssh and tar combination to be a lot faster than mounting Samba or NFS and then using a find and copy combination.
I would like to test this on my server, but I have nothing to test it with. As for your problem, I suggested the for loop so you could do it file by file. That would be your best bet: either a for loop, or find as you said, or even ls and grep.
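As a rough sketch of the find approach: find can select the files by modification date and hand the names to tar on stdin, so no long argument list is ever built. This swaps tar's own -N test for find's -newermt, and it assumes a find that supports -newermt and a tar that supports --null and -T, which may not hold on an older system. The temporary directories and file names are made up for the demonstration.

```shell
#!/bin/sh
# Local sketch: let find pick files by date and feed the list to tar via stdin.
src=$(mktemp -d)
dst=$(mktemp -d)

# Two files dated before the cutoff and one after; touch -t takes CCYYMMDDhhmm.
touch -t 201001010000 "$src/old1.tif" "$src/old2.tif"
touch -t 201011010000 "$src/new1.tif"

# find prints NUL-separated names of regular files modified after the cutoff;
# tar reads that list from stdin (-T -) and writes the archive to stdout.
cd "$src" || exit 1
find . -type f -newermt '2010-10-22' -print0 \
    | tar -cf - --null -T - \
    | tar -C "$dst" -xf -

copied=$(find "$dst" -type f | wc -l | tr -d ' ')
echo "files copied: $copied"
```

Over ssh, the find and the first tar would run inside the quoted remote command and the second tar would run locally, in the same way as the original pipeline.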
Well, the length of the command line is limited. This applies to other commands too whenever you use *, because the shell expands it into one argument per entry before the command runs. In your case I would suggest using a dot instead of an asterisk.
Code:
ssh root@192.168.1.69 tar -C /data1/docimg -c -N10-22-2010 . | tar -xv
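To illustrate the difference, here is a small local sketch (temporary directories and 200 dummy files, not the real docimg data). It counts how many arguments the shell produces for *, prints the system's ARG_MAX limit, and then copies the same directory using a single "." argument. The explicit "-f -" is added here so the example does not depend on tar's default archive device.

```shell
#!/bin/sh
# Local sketch: "*" becomes one argument per entry, "." is a single argument.
src=$(mktemp -d)
dst=$(mktemp -d)

# Create 200 small files as a stand-in for a very large directory.
i=1
while [ "$i" -le 200 ]; do
    echo "data $i" > "$src/file$i.txt"
    i=$((i + 1))
done

# The shell expands "*" before the command is started.
cd "$src" || exit 1
set -- *
glob_args=$#
echo "arguments produced by *: $glob_args"

# Limit on the combined size of arguments and environment passed to a command.
echo "ARG_MAX on this system: $(getconf ARG_MAX)"

# With ".", tar gets one argument and walks the directory itself.
tar -C "$src" -cf - . | tar -C "$dst" -xf -
copied=$(find "$dst" -type f | wc -l | tr -d ' ')
echo "files copied via '.': $copied"
```

With 1.5 million names, the expanded list can easily exceed ARG_MAX, while the "." form stays at one argument no matter how large the directory is.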
The * was in the command I was originally entering, which is why it did not work.
I tried with the "." about an hour ago. So far it hasn't copied anything, but it also hasn't returned an error, so it may just be looking through all of the files to find ones that match the criteria. I am not certain how long that should take on a list like that.
tar is showing up in top, though it is only showing 9.44 seconds of processor time since being started about 90 minutes ago. Its priority is 18 so it has a lot of smbd activity ahead of it.
You can save cpio's output locally with a simple cat and get a similar result, but a) cpio's format is different from tar's, and b) your tar would archive blah.tar.gz to stdout while ignoring stdin.