Split a large file to download on a Windows machine
Hi,
I am removing some old graphics from my server, and one of the gallery programs has created two enormous directories that I cannot even open with FTP. I tried to tar each directory; the first came out to about 37 GB and the second keeps failing (it's bigger, one would assume). How can I archive and split these into smaller files? I would sincerely appreciate any help you can give. Thanks, -Jason |
Make a list of all files:
Code:
ls > allfiles
Code:
split -l1000 allfiles CHUNK
Code:
for i in CHUNK* ; do tar cf $i.tar -T $i ; done |
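For anyone following along, here is a slightly more defensive sketch of the same list/split/tar steps. It is not the poster's exact method: it assumes GNU find and tar, uses find instead of ls so that only regular files are listed (tar would otherwise recurse into any subdirectory named in the list), and keeps the list files outside the directory being archived. The directory and file names are made up for the demo, and names containing newlines would still break the list.

```shell
# Throwaway demo directory with 7 dummy files (hypothetical names).
workdir=$(mktemp -d)
mkdir "$workdir/gallery" "$workdir/lists"
for n in 1 2 3 4 5 6 7; do
    echo "dummy $n" > "$workdir/gallery/xyz$n.jpg"
done

# List regular files only; the list lives outside the gallery so it cannot list itself.
( cd "$workdir/gallery" && find . -maxdepth 1 -type f ) > "$workdir/lists/allfiles"

# 3 names per chunk for the demo; the post above uses 1000.
split -l 3 "$workdir/lists/allfiles" "$workdir/lists/CHUNK"

# One tar per chunk. The glob is expanded once, before any .tar file exists.
for i in "$workdir"/lists/CHUNK*; do
    ( cd "$workdir/gallery" && tar -cf "$i.tar" -T "$i" )
done

# Count the archives and the total number of archived files.
tars=$(ls "$workdir"/lists/CHUNK*.tar | wc -l)
members=$(for t in "$workdir"/lists/CHUNK*.tar; do tar -tf "$t"; done | wc -l)
echo "$tars archives, $members files"
```

Quoting "$i" is safe here because the list comes from find, not from an aliased ls. |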
Awesome,
Tried it, but for some reason near the end I got a bunch of these errors. I did an ls and was able to verify that these files are there:
tar: xyz55823.jpg*: Cannot stat: No such file or directory
tar: xyz55824.jpg*: Cannot stat: No such file or directory
tar: xyz55825.jpg*: Cannot stat: No such file or directory
tar: Error exit delayed from previous errors
trackpad@trackpads.com [~/www/path/to/2658]# |
Also, for some reason the first chunk file is 23gb and all the subsequent ones are only 10k.
|
Quote:
Your other problems may be related to this. |
Firstly, from your title, I'm not certain which machine is Windows: the client or the server. Since it seems you ran tar on the server, it sounds like the client is Windows.
Was it the tar command that failed before it got to the split command? You may simply have some bad files, or files that were altered after you started tar. Tar does have a volume size option, which would allow you to create separate files. Another option is to pipe the output of tar. For example:
ssh user@server tar -C <directory> -czf - . | cat >gallery.tar.gz
You could instead use tar to replicate the files themselves:
ssh user@server tar -C <directory> -czf - . | tar -C <restore_directory> -xzvf - >logfile
If the client is a Windows machine, you could run a live Linux distro or Cygwin to run the ssh and tar commands.
The problem you are having is the number of files in the directory rather than its size. For example, trying "ls *.jpg" may cause an out of memory or "Argument list too long" error. This is because the wildcard is expanded and sorted by the shell before the command is executed. Using ftp, a list of all of the files in the directory may be produced so it can be sorted as well. Sorting is expensive time-wise. One thing that is often done when there are tens of thousands of files or more in a directory is to use the find command instead of ls, and to limit the number of arguments passed to a command by using xargs to handle the list produced by find. If you use tar without wildcard arguments, you probably won't have the problem I mentioned. Files are added as they are found. |
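To make the find/xargs suggestion above concrete, here is a minimal sketch. It assumes GNU find, xargs and tar (for -maxdepth, -print0, -0 and --null), and the gallery directory and file names are invented for the demo. find prints names as it reads the directory, so the shell never has to expand or sort a huge wildcard.

```shell
# Throwaway demo directory with 5 dummy images (hypothetical names).
workdir=$(mktemp -d)
mkdir "$workdir/gallery"
for n in 1 2 3 4 5; do
    echo "dummy $n" > "$workdir/gallery/xyz$n.jpg"
done

# Stream the names straight into tar: no shell wildcard, no sorting.
# -print0 with --null keeps names containing spaces or '*' intact.
( cd "$workdir/gallery" &&
  find . -maxdepth 1 -type f -name '*.jpg' -print0 |
  tar --null -T - -cf "$workdir/gallery.tar" )

# xargs caps the number of arguments per command; here 2 names per echo.
batches=$( cd "$workdir/gallery" &&
  find . -maxdepth 1 -type f -name '*.jpg' -print0 |
  xargs -0 -n 2 echo | wc -l )

# Count what ended up in the archive.
archived=$(tar -tf "$workdir/gallery.tar" | wc -l)
echo "archived $archived files; xargs ran $batches batches"
```

In real use you would replace echo with the command you actually want to run and pick a much larger -n. |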
Hi,
I have not tried smallpond's solution, but it looks OK to me. One thing I would change, though, would be
ls > allfiles
to
ls -1 > allfiles
This way you can ensure every tar command will archive 1000 files. As for the size issue of the chunks, maybe you have some very long filenames? Or it ran out of output filenames and put the rest in the last chunk? But this should have given an error message. Try this to mimic the split command:
Code:
c=0; d=1; while read -r line; do echo "$line" >> "chunk$c"; if (( ++d > 1000 )); then ((++c)); d=1; fi; done < allfiles
Finally, the file not found issue. Here is a very far-fetched idea. My guess would be that
- you put "double-quotes" around "$i" in smallpond's example and
- your 'ls' command is really aliased to 'ls -F' and
- the files in question somehow got execute permissions assigned by mistake.
The output of
Code:
alias ls |
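A quick way to see what the 'ls -F' guess above would do to the file list, as a small sketch with invented file names: -F appends '*' to anything with execute permission, so a list built that way contains names that do not exist on disk. Prefixing the command with 'command' (or writing \ls) bypasses any alias.

```shell
# Throwaway demo directory (hypothetical file names).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch plain.jpg exec.jpg
chmod +x exec.jpg

# With -F, the executable file is listed as "exec.jpg*".
listed=$(command ls -F)
echo "$listed"

# 'command ls' skips any alias; -1 forces one plain name per line.
clean=$(command ls -1)
echo "$clean"
```

If the first listing shows names ending in '*', removing the execute bit (chmod -x) or building the list with an unaliased ls should make the names match again. |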
From your original errors:
tar: xyz55823.jpg*: Cannot stat: No such file or directory
Is there an existing file named xyz55823.jpg*? |
Guys, thanks again,
I got those downloaded, but now I have an even bigger single-file archive. I am moving from a Linux host to a Windows host. The file I need to download is my cpanel backup, which is about 200 GB. It is a tar.gz file. Can I split that into pieces of, say, 500 MB and then reassemble them on my Windows box? Thanks again, -Jason |
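One possible way to do the split asked about above is the standard split command on the Linux side, sketched here on a small random file standing in for the real backup. The file names are invented for the demo, and the piece size is scaled down; for the real file something like -b 500m would give roughly 400 pieces, which still fits the default two-letter suffixes.

```shell
# Throwaway demo directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the real backup: about 2.5 MB of random bytes (hypothetical name).
head -c 2560000 /dev/urandom > backup.tar.gz

# Cut into fixed-size pieces: backup.tar.gz.part_aa, _ab, _ac, ...
split -b 1000k backup.tar.gz backup.tar.gz.part_

# Reassemble on Linux; the glob sorts the suffixes in the right order.
cat backup.tar.gz.part_* > rejoined.tar.gz

# On Windows (cmd.exe) the pieces can be joined with a binary copy, e.g.:
#   copy /b backup.tar.gz.part_aa + backup.tar.gz.part_ab + backup.tar.gz.part_ac backup.tar.gz

# Check the piece count and that the rejoined file is byte-identical.
parts=$(ls backup.tar.gz.part_* | wc -l)
if cmp -s backup.tar.gz rejoined.tar.gz; then same=yes; else same=no; fi
echo "$parts parts, identical: $same"
```

It is probably worth recording a checksum (for example with md5sum) before splitting, so the reassembled file can be verified after the download. |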
Quote:
How did you solve the previous problems you encountered, i.e. the '.jpg*: Cannot stat: No such file or directory' issue? Some feedback would be nice for others who might stumble upon the same problem. As for your new problem, after a quick web search I found this tool: http://stahlworks.com/dev/index.php?tool=split The description sounds promising, but I have never used this tool before. So if you want to try it, make a backup first and do some test runs with smaller dummy files. Hope this helps. |
I ended up using some partial work as well as just making one enormous archive of my cpanel.
How do you install the sfk thing on Linux? I only have shell access; it looks nifty if I can get it to work. Thanks, -Jason |
ok, here is another try, it seems to be working,
Use this script to install rar: http://nixcraft.com/shell-scripting/...rar-linux.html Then rar with a command such as 'rar a -r -v200m forum.rar forum' |