LinuxQuestions.org
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Newbie
Old 10-18-2019, 04:02 AM   #1
grumpyskeptic
Member
 
Registered: Apr 2016
Posts: 467

Rep: Reputation: Disabled
How to split very large files to copy to an external hard drive?


Using Linux Mint 17.3 Rosa Cinnamon.

I have some very large files I want to move to an external hard drive that has a USB plug, but I get an error when I try to copy files much over 4GB.

I realise now that I should have reformatted it as NTFS when it was new, but it is too late now as it has too much stuff on it.

The files are for example:

myfile.zip about 9GB

myotherfile.iso about 4.6GB

I have found out about the "split" command, but it seems to be aimed at files with lines in them, unlike mine. Also, I have not been able to get a clear idea of what actual prefixes and suffixes to use.

Please note that I want to be able to put the file back together again in ten or more years time, so I need commands that are likely to be around for a long time.

Questions please:

1. What actual commands should I use to split the files?

2. What actual commands should I use to re-assemble the file parts?

3. Is it possible to assemble the file parts into a whole file on the external hard-drive, even though they were moved there in parts?

4. Is it possible to safely re-format the external hard-drive without losing the files on it? I have seen something on the internet for Windows that says it can do this.

Thanks.
 
Old 10-18-2019, 04:27 AM   #2
michaelk
Moderator
 
Registered: Aug 2002
Posts: 25,592

Rep: Reputation: 5880
The split command has a -b option which splits the file based on a number of bytes. Lots of examples can be found by searching the internet.

The cat command can easily reassemble the pieces. The suffix is not really important.

No. If the drive is formatted as FAT32 the maximum file size is 4GB, so you can't reassemble the file on the USB drive itself.

If this is a cheap USB flash drive I would not want to use it as long-term storage for 10 years. I would want to be able to verify the data, which means not keeping split files.

Windows has a convert command which will convert FAT32 to NTFS without data loss. I've never had to use it... However, always have a verified backup of important data just in case.
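A minimal sketch of the split-then-verify workflow described above, run on a small throwaway file so it is quick to test (all file names here are illustrative stand-ins for the real 9GB zip):

```shell
# Make a ~100 KiB stand-in for the big file, and record a checksum
# of the original so the reassembled copy can be verified years later
dd if=/dev/urandom of=myfile.zip bs=1024 count=100 2>/dev/null
sha256sum myfile.zip > myfile.zip.sha256

# Split into fixed-size pieces (-b takes a byte count; K/M/G suffixes work)
split -b 40K myfile.zip myfile.zip.part_

# Reassemble with cat; the glob expands in suffix order (aa, ab, ac, ...)
cat myfile.zip.part_* > rejoined.zip

# Verify the round trip before deleting anything
cmp myfile.zip rejoined.zip && echo "files match"
```

Keeping the .sha256 file alongside the parts gives you an independent check when you rejoin them in ten years.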

Last edited by michaelk; 10-18-2019 at 04:41 AM.
 
Old 10-18-2019, 04:37 AM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,103

Rep: Reputation: 4117
Excellent response - as usual. Note that you can simply leave the segments there for assembly later, but FAT is a pox, subject to regular corruption even at rest. Your data are exposed - use something better.
 
Old 10-18-2019, 10:03 AM   #4
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
man tar

Code:
<snip>
       -L, --tape-length=N
              Change  tape after writing Nx1024 bytes.  If N is followed by a size suffix (see the subsection Size suffixes below), the suffix
              specifies the multiplicative factor to be used instead of 1024.

              This option implies -M.

       -M, --multi-volume
              Create/list/extract multi-volume archive.
<snip>
Your limit is 4GB minus 1 byte, so just to keep life easy, set your tape length to 3G.

But play around with much smaller tape sizes / data sets first, to test archiving and extraction of a multi-volume archive.

tar -L3G might be overkill; you can probably just use split and reconstitute the files with cat.
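Here is a small-scale sketch of that multi-volume approach, per the man page excerpt above (file names invented; GNU tar lets you pass extra -f options to be used as the successive volumes, which avoids the interactive "insert next volume" prompt):

```shell
# A 1.5 MiB stand-in for the multi-gigabyte originals
dd if=/dev/urandom of=big.bin bs=1024 count=1536 2>/dev/null

# Create a multi-volume archive with ~1 MB (1000x1024 byte) volumes;
# the second -f is used automatically as volume 2
tar -c -M -L 1000 -f vol1.tar -f vol2.tar big.bin

# Extract into a scratch directory and verify the round trip
mkdir -p restore
tar -x -M -f vol1.tar -f vol2.tar -C restore
cmp big.bin restore/big.bin && echo "archive round-trip OK"
```

At full size you would use -L with a value under the FAT32 limit (e.g. 3G, as suggested above) and as many -f volumes as needed.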
 
Old 10-18-2019, 10:39 AM   #5
Shadow_7
Senior Member
 
Registered: Feb 2003
Distribution: debian
Posts: 4,137
Blog Entries: 1

Rep: Reputation: 874
If it was formatted ext4 with 4k blocks (default) you could have file sizes larger than a terabyte.

You can use split; it has options to split based on line count, on bytes, on a number of desired chunks, and so on. You can then just cat the pieces together to rebuild the file.

$ split --bytes=4000000000 --suffix-length=2 --numeric-suffixes=01 file.dat file_dat_
$ cat file_dat_?? > file_new.dat

Of course none of this retains the permissions or date/time stamps of the original file(s). Some archive formats, like rar, will generate chunks for you. Compression is not that useful on already-compressed (media) files, but those formats will create chunks and preserve permissions and timestamps.
 
Old 10-19-2019, 08:38 AM   #6
grumpyskeptic
Member
 
Registered: Apr 2016
Posts: 467

Original Poster
Rep: Reputation: Disabled
Thank you Shadow_7 and others.

I find descriptions on the internet of what split does very confusing.

"$ split --bytes=4000000000 --suffix-length=2 --numeric-suffixes=01 file.dat file_dat_"

Does this mean that it will split a file called file.dat into chunks of 4GB, and name the chunks as file_dat_.01, file_dat_.02, etc? Would this work with .zip and .iso files?

What exactly would I need to do to turn myfile.zip of about 9GB into parts of 1GB: myfile.zipsplit01, myfile.zipsplit02, ...?

And what would I need to do to turn myotherfile.iso of about 4.6GB into parts of 1GB: myotherfile.isosplit01, myotherfile.isosplit02, ...?

Thanks.
 
Old 10-19-2019, 09:09 AM   #7
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
split does not care what the file actually contains


--bytes=4294967296 is 1 byte too many
--bytes=4294967295 is fine
--bytes=4000000000 is fine and easy

not much I can add

Sometimes it is just quicker to experiment yourself than to seek permission or full instructions from someone else.

You really shouldn't have much difficulty figuring this out for yourself.
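In that spirit of experimenting, a scaled-down dry run (1000-byte parts standing in for 1GB parts; at full size you would use --bytes=1000000000; the output names follow the pattern asked about above):

```shell
# A 2500-byte stand-in for the 9GB zip
dd if=/dev/urandom of=myfile.zip bs=500 count=5 2>/dev/null

# Split into 1000-byte parts named myfile.zipsplit01, myfile.zipsplit02, ...
split --bytes=1000 --numeric-suffixes=1 --suffix-length=2 myfile.zip myfile.zipsplit

# Rebuild: numeric suffixes sort correctly, so a glob is safe
cat myfile.zipsplit?? > myfile_rebuilt.zip
cmp myfile.zip myfile_rebuilt.zip && echo "identical"
```

split does not care whether the input is a .zip, an .iso, or anything else; it just copies bytes.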
 
Old 10-19-2019, 09:13 AM   #8
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,692

Rep: Reputation: 7274
You can easily try how it works on a small[er] file too.
https://www.linuxtechi.com/split-com...or-linux-unix/
 
Old 10-19-2019, 02:07 PM   #9
Shadow_7
Senior Member
 
Registered: Feb 2003
Distribution: debian
Posts: 4,137
Blog Entries: 1

Rep: Reputation: 874
Sometimes it's easier to try it, or read the source, than to ask questions and hope that someone who knows answers, and answers correctly. Man pages can be confusing since there are many options that conflict, like split's divide-by-lines / chunks / bytes. They're mutually exclusive, and trying more than one of them on the same command might turn up interesting bugs or unpredictable behavior.

--bytes=4000000000

If that generates 4GB chunks, then:

--bytes=1000000000

should generate 1GB chunks - at least by the marketing (base-10) definition of 1GB. An actual binary 4GB chunk would have to be (2^32)-1 bytes (it has to be less than 2^32 to fit on FAT32):

2^32 == 4,294,967,296
-1 == 4,294,967,295

In the 1024-based definition of size, versus marketing:

(2^30)-1 == 1,073,741,823

For 1GB chunks via the 2^ definitions - just under a binary 1GB.
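The arithmetic itself is easy to double-check in the shell:

```shell
# FAT32's maximum file size: 2^32 - 1 bytes
echo $((2**32 - 1))    # 4294967295

# One binary gigabyte (GiB), less one byte
echo $((2**30 - 1))    # 1073741823
```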
 
Old 10-19-2019, 02:49 PM   #10
CoreyCananza
LQ Newbie
 
Registered: Oct 2019
Posts: 1

Rep: Reputation: Disabled
I believe zip folders will work for that.
 
Old 10-19-2019, 02:58 PM   #11
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
I hate GB and GiB - I never remember which is which.

GB (gigabyte) is 1000³ bytes.
GiB (gibibyte) is 1024³ bytes.

I have probably tried to remember it by telling myself that GiB is "larger" than GB, so that is the bigger number. But I always have to double-check.
 
Old 10-19-2019, 05:30 PM   #12
Shadow_7
Senior Member
 
Registered: Feb 2003
Distribution: debian
Posts: 4,137
Blog Entries: 1

Rep: Reputation: 874
It's all GB to me: 1024-based, not base 10, AKA powers of 2. We existed before marketing ignorance. And then there's bits versus bytes for networking and media types. It's probably a good thing I don't fill out "standardized" multiple-choice tests, where all the answers are technically correct depending on which department / field of study you specialize in, and yet no option says "all of the above". Like most things in high school, you have to give the answer you think the teacher/tester wants, not the answer you believe is correct.
 
  

