Linux - Newbie
This Linux forum is for members that are new to Linux. Just starting out and have a question? If it is not in the man pages or the how-tos, this is the place!
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
What's the best way in Linux to back up files and folders?
Currently my personal files total 48.5 GB: 38,466 files in 2,971 sub-folders. I have tried copying them with PCManFM and Dolphin, but they fail halfway through around 50% of the time.
I have also right-clicked in PCManFM to compress, but after it compresses everything, when I go into the tar.gz with Archive Manager, not all the folders are there!
Is entering a command in the terminal the only reliable way?
Last edited by andrew.comly; 12-26-2012 at 10:38 AM.
Quote:
Currently my personal files total 48.5 GB: 38,466 files in 2,971 sub-folders. I have tried copying them with PCManFM and Dolphin, but they fail halfway through around 50% of the time.
I assume you have a separate space to back up to.
Quote:
Is entering a command in the terminal the only reliable way?
There are of course good commercial backup utilities for GNU/Linux, but the same objective can be attained with simple terminal commands.
Where are you putting the backup files? Are they going to a NAS, to another computer, or to a different directory on the same computer? This makes a slight difference in how best to back up the data.
If going to a NAS, you might want to mount the CIFS share and copy the data over, or you could even enable FTP on the NAS and use lftp to transfer a tarball (see above on how to create one) to the NAS.
If going to another Linux computer, then rsync with SSH keys would be the best way to go.
rsync is also a great way to move the data around locally.
I typically will use:
rsync -aviS /source/files /destination/
for my backups. With this much data you may hit a network hiccup, so you might want to run this at night via cron, and you might want to consider adding -z for compression. See man rsync for more details.
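The nightly-cron idea above can be sketched end to end. The paths, file names, and script name here are hypothetical stand-ins, not anything from this thread; the demo builds throw-away source and destination directories with mktemp so nothing on the real system is touched:

```shell
#!/bin/sh
# Demonstrate an unattended rsync backup run, as cron would invoke it.
# Throw-away directories stand in for the real source and destination.
SRC=$(mktemp -d)
DEST=$(mktemp -d)
echo "important data" > "$SRC/notes.txt"

# -a archive mode (permissions, times, symlinks, recursion)
# -v/-i verbose + itemized changes, -S handle sparse files
# -z compress data in transit (mainly useful over a network)
rsync -aviSz "$SRC/" "$DEST/" > backup.log 2>&1

# A crontab entry to run a script like this nightly at 02:00 might be:
#   0 2 * * * /usr/local/bin/nightly-backup.sh
```

The trailing slash on "$SRC/" copies the directory's contents rather than the directory itself; whether you want the slash depends on how you lay out the destination.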
You can create a .tar of multiple files, and then compress the tar file.
To create a .tar:
Code:
tar -cvf sample.tar /path/to/file(s)
To check its contents:
Code:
tar -tvf sample.tar
To compress:
Code:
gzip sample.tar
To check the contents of the compressed tar:
Code:
tar -ztvf sample.tar.gz
After creating the tar/gzip archive, back it up to an external drive.
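For what it's worth, the two steps above (tar, then gzip) can also be combined: tar compresses as it writes when given -z. A minimal sketch with made-up directory and file names:

```shell
#!/bin/sh
# Create a gzip-compressed tar in one step, then verify its contents.
mkdir -p demo/sub
echo "a" > demo/file1.txt
echo "b" > demo/sub/file2.txt

tar -czvf sample.tar.gz demo/           # create + compress in one pass
tar -tzvf sample.tar.gz > listing.txt   # list contents without extracting
```

The listing step is worth keeping as a habit: it catches a truncated or unreadable archive before you rely on it.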
The above "sudo tar -cvf AC20121229.tar /mnt/D/AC" yields the following:
tar: AC20121229.tar: Wrote only 4095 of 10240 bytes
tar: Error is not recoverable: exiting now
To those recommending gzipped tar files: that is a bad idea. Personally I would copy the files (perhaps with rsync) to an external disk using a Linux filesystem. If you need to copy to a disk or network drive where UNIX file permissions and attributes cannot be maintained (e.g. a disk with a Windows filesystem) and want to use an archive format to retain permissions/attributes, consider the implications of gzipping the archive itself.
To expand I'll link to a post I made previously on this topic:
Quote:
Originally Posted by ruario
you might want to reconsider gzip compressed tars because a single corrupt bit near the beginning of the archive means the rest of the file is a write-off. This is less of an issue when using an actual disk for backup as opposed to media like DVDs, Blu-ray, etc., but still something to consider. Personally I would either skip compression or use xar, dar or afio instead, all of which can compress files individually as they are added (afio gives you the most compression options, since you can specify any compressor you like). This is safer, as any corruption will mean only losing some of your files. Alternatively (or better yet, in addition) look at making parity archive volume sets. Check out the par2cmdline utils, an implementation of the PAR v2.0 specification.
EDIT 1: And if you won't take my word for it here is what "UNIX Power tools, 3rd Edition (O'Reilly)" has to say:
Quote:
Originally Posted by Section 38.5.4. To gzip, or Not to gzip
Although compression using gzip can greatly reduce the amount of backup media required to store an archive, compressing entire tar files as they are written to floppy or tape makes the backup prone to complete loss if one block of the archive is corrupted, say, through a media error (not uncommon in the case of floppies and tapes). Most compression algorithms, gzip included, depend on the coherency of data across many bytes to achieve compression. If any data within a compressed archive is corrupt, gunzip may not be able to uncompress the file at all, making it completely unreadable to tar. The same applies to bzip2. It may compress things better than gzip, but it has the same lack of fault-tolerance.
EDIT 2: If you do ever need to attempt recovery of an important gzipped file, you should read this to see exactly what is involved.
Last edited by ruario; 12-28-2012 at 05:17 PM.
Reason: Added a further quote; added a link to gzip recovery info
Older versions of gzip had problems decompressing files larger than 4 GB (e.g. gzip 1.2.4 had such an issue). Your member info says that you use Lubuntu 12.10, so in theory this should not be a problem (it ships with gzip 1.5), but perhaps you are describing an issue on another machine with an older distro (and hence an old gzip)? If so, that could be your problem.
There are multiple tar implementations, which can use different default formats (additionally some distros compile GNU tar with different defaults). Modern GNU tar should have no problem with such a large archive as long as you are using GNU or PAX formats. To be 100% sure one of these is being used I would specify either --format=gnu or --format=pax.
That all said, I would once again suggest either doing a straight copy (perhaps with rsync), using a tar without compression or using another archive format that can do internal compression.
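A sketch of that last suggestion: an uncompressed tar in an explicitly chosen format (PAX here, via GNU tar's --format option, which handles very large files and long paths). The directory names are made up for the demo:

```shell
#!/bin/sh
# Create an uncompressed tar in PAX format and verify it lists cleanly.
mkdir -p data/docs
echo "report" > data/docs/report.txt

tar --format=pax -cvf backup.tar data/   # explicit format, no compression
tar -tvf backup.tar > contents.txt       # verify the archive is readable
```

Skipping compression costs disk space but means a localized corruption damages only the files stored at that spot in the archive, not everything after it.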
Your "~$ rsync --archive /personal/files /mounted/volume/" suggestion was most helpful. In addition, I added a few extra options after reading "rsync --help" in detail, ending up with the following:
rsync --archive -vE --delete --stats /media/a/AC/Recent/AC/ /AC
I found the "--stats" option quite useful; it gives the essential number-of-files information, as the attached jpg indicates. Unfortunately, no "number of folders" figure is given, and worst of all, because my system crashed after I tried a "tar --format=gnu" command, I am reluctant to install Dolphin (it's built for Kubuntu, not Lubuntu).
How can the terminal be used to look up the number of folders for the above job, and how can I obtain both the number of files and the number of folders in the source folder? Backing up should definitely include verifying that the copied information exactly matches the source; this detail is quite essential.
Last edited by andrew.comly; 12-31-2012 at 08:45 PM.
The tar command just didn't work; I tried the "--format=gnu" option and it only copied ~20% of my information.
Strangely enough, a few minutes after this I encountered a strange crash error message that took me to the official KDE bug reporting website, but because some 'gpd' (or similarly named) program wasn't installed, I wasn't able to report the vital crash information. Even opening a terminal and running "sudo apt-get xxx" didn't work, giving me the erroneous message that my disk space wasn't enough. This is quite absurd, because on my Samsung at that time the drive I used for my personal files was on a different partition than the system files; more specifically: system drive 80 GB, personal file drive 100 GB, with no more than 15 GB used on the system drive. The terminal reported there was insufficient space to install the mere 5 KB the 'gpd' program would require.
I wonder if the crash was due to running Dolphin and PCManFM concurrently; Dolphin is originally made for KDE/Kubuntu, not Lubuntu. This might explain the KDE ladybug crash-report website.
Anyway, I then rebooted to find the dreaded black screen with only a blinking cursor. I rebooted from the Lubuntu 12.10 flash drive, trying first the "reinstall system only" option (re-installing Lubuntu without deleting installed programs), which didn't work, so I ended up having to completely reinstall everything. I certainly don't blame your advice for this, but I thought I should let you know what happened after I tried the "tar --format=gnu SOURCE TARGET" command.
Certainly the rsync command should do the job, except I don't know how to verify that the numbers of folders/files on the target and source match now without Dolphin. I dare not install Dolphin again; is there some way to use the terminal to obtain this information?
Last edited by andrew.comly; 12-31-2012 at 09:20 PM.
Reason: grammar
Quote:
Certainly the rsync command should do the job, except I don't know how to verify that the numbers of folders/files on the target and source match now without Dolphin.
You can log everything rsync does, including any errors happening behind the scenes, to a record file, which you can examine by grepping or searching it for whatever you want once the job is done: in other words, you can put everything on record.
When the job is done you can ask anything of the record: errors, file names, folder names, etc.
Code:
grep -li filename myrecord.txt
grep error myrecord.txt
Or you can open the myrecord.txt with any text editor then "Ctl+F" search for whatever-foo <press Enter> --you know that already .
BTW, just a reminder: if you deal with a huge quantity of files on large volumes, it is most advisable to use the terminal (rsync) rather than file managers like Dolphin, which handle memory differently.
Hope that helps.
Good luck.
Last edited by malekmustaq; 01-01-2013 at 12:04 AM.
Thanks for your idea. Unfortunately, when trying this I encounter the following error message (screen snapshot attached):
a@SAM:/$ sudo rsync --archive --delete --vEu --stats /AC /a/home/AC >myrecord.txt 2>&1
bash: myrecord.txt: Permission denied
The above error message is still present with "sudo" in front of the command, and even when I create a "myrecord" file beforehand (no .txt, since it is a Leafpad document) in the "/" directory before executing the command, the same "bash: myrecord.txt: Permission denied" is still encountered.
rsync: Source & Destination still don't exactly match.
malekmustaq / All others :-),
So I ran the "rsync --archive --delete -vE -u --stats /AC /home/a/AC" command, and it seemed to work. Next, in order to check that the source and destination files match, I looked at a Mac OS X hints webpage, discovered the idea of combining the "ls" and "wc" commands, and got the following result:
a@SAM:/$ cd /AC
a@SAM:/AC$ ls -R | wc -l
46789
a@SAM:/AC$ cd /home/a/AC
a@SAM:~/AC$ ls -R | wc -l
46781
I didn't take any Computer Science classes, but isn't the subject of computers supposed to be a “hard” science? If so, why am I still short 8 files? I guess computer science just isn't as much of a "hard science" as mathematics or physics! {hard science = science in which facts and theories can be firmly and exactly measured, tested or proved, as opposed to soft science, e.g. sociology or economics}
Sincerely,
Andrew
Last edited by andrew.comly; 01-05-2013 at 10:58 AM.
Reason: grmr