LinuxQuestions.org


beachboy2 07-17-2016 09:15 AM

Make sure that you BACKUP your DATA!!
 
For the umpteenth time on LQ forums, do BACKUP your personal data!

If it is not backed up, it is not a matter of IF you are going to lose it, it is simply a matter of WHEN you are going to lose it.

See post #9 here:

http://www.linuxquestions.org/questi...33#post5577333

Emerson 07-17-2016 09:51 AM

Computer users fall into two groups:
0. those that do backups
1. those that have never had a hard drive fail.

jpollard 07-17-2016 09:53 AM

Quote:

Originally Posted by Emerson (Post 5577349)
Computer users fall into two groups:
0. those that do backups
1. those that have never had a hard drive fail.

There is actually a third...

2. those that have never accidentally deleted their data...

:)

sundialsvcs 07-17-2016 10:21 AM

Ouch ... this sounds like yet-another "voice of recent and painful experience."

If you do a DuckDuckGo search on "linux automatic backup rsync", you will find hundreds of hits that describe various ways to use the rsync command to make constant backups of your data. (Apple's now-famous Time Machine basically works in just that way.) rsync is an extremely sophisticated command, brimming with useful options.

The most critical consideration is that the backups must occur continuously, sending the data to an external drive (or, better yet, several of them). They should also protect the backed-up data from theft, eavesdropping, or (malicious) interference ... a task that can be accomplished by placing all of the backups into a private folder that only some _backup user owns. (A user that, via a nologin shell such as /sbin/nologin or /usr/sbin/nologin, can never "log in" to the system.) And, there should be some way to separate multiple versions of the backed-up material.

But there's another consideration: "a static backup is never good-enough for a database." A database needs replication, and MySQL is very good at it. "Even an hour" is much too long for data from a production database. You need several read-only copies of the data that are always known to be in-sync with one another "right now." (This is also useful for reporting purposes, since reports can be taken from replicas.)

Emerson 07-17-2016 10:25 AM

rsnapshot is my favorite, it can use hard links, saves lots of space.

DavidMcCann 07-17-2016 10:26 AM

Another tip is to have two sets of backup data and use them alternately. I once found I'd backed-up a corrupted file, but the other backup still had a good copy. I back-up daily: with rsync only the changed files are altered, so it's a quick job.
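
A minimal sketch of the alternating-sets idea, assuming two backup roots and a source directory whose names here (SRC, SET_A, SET_B, pick_backup_set, run_backup) are all hypothetical. It picks the set by the parity of the day number, so a corrupted file backed up today still has yesterday's copy in the other set:

```shell
#!/bin/sh
# Hypothetical locations: adjust to your own layout.
SRC="${SRC:-$HOME/data}"
SET_A="${SET_A:-/mnt/backup_a}"
SET_B="${SET_B:-/mnt/backup_b}"

# pick_backup_set DAY_NUMBER
# Even day numbers go to set A, odd day numbers go to set B.
pick_backup_set() {
    if [ $(( $1 % 2 )) -eq 0 ]; then
        printf '%s\n' "$SET_A"
    else
        printf '%s\n' "$SET_B"
    fi
}

# run_backup: sync SRC into whichever set is due today.
run_backup() {
    today=$(( $(date +%s) / 86400 ))      # whole days since the Unix epoch
    target=$(pick_backup_set "$today")
    # -a preserves permissions and timestamps; --delete mirrors removals.
    rsync -a --delete "$SRC"/ "$target"/
}
```

Calling run_backup once a day (by hand or from cron) then writes to each set every other day.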

jpollard 07-17-2016 10:30 AM

Quote:

Originally Posted by DavidMcCann (Post 5577370)
Another tip is to have two sets of backup data and use them alternately. I once found I'd backed-up a corrupted file, but the other backup still had a good copy. I back-up daily: with rsync only the changed files are altered, so it's a quick job.

Which is also a recommendation for incremental backups.

I have also seen where some people use a CMS to keep backups - and only archive entire files if they are new, everything else is just incremental deltas to the files.

Emerson 07-17-2016 10:31 AM

That's what is great about rsnapshot, it creates daily backups - as many as you want to keep - and won't waste space because it knows how to use hard links.
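
For anyone curious how the hard-link trick saves space, here is a rough sketch using rsync's --link-dest option. This is not rsnapshot's own code; the function name snapshot and the "latest" symlink are assumptions made for illustration. Files unchanged since the previous snapshot become hard links to it, so only changed files take up new space:

```shell
#!/bin/sh
# snapshot SRC DEST_ROOT NAME
# Creates DEST_ROOT/NAME as a full-looking copy of SRC. Files unchanged
# since the snapshot that DEST_ROOT/latest points to are hard-linked
# rather than copied.
snapshot() {
    src=$1 name=$3
    mkdir -p "$2" || return 1
    root=$(cd "$2" && pwd) || return 1    # --link-dest needs an absolute path here
    if [ -d "$root/latest" ]; then
        rsync -a --delete --link-dest="$root/latest" "$src"/ "$root/$name"/ || return 1
    else
        rsync -a "$src"/ "$root/$name"/ || return 1
    fi
    # Repoint "latest" at the snapshot just taken.
    ln -sfn "$name" "$root/latest"
}
```

Something like `snapshot "$HOME/data" /mnt/backup/snaps "$(date +%F)"` would then give one directory per day, and deleting an old directory frees only the files no other snapshot still links to.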

beachboy2 07-17-2016 12:21 PM

For those who may prefer rsync with a GUI (Graphical User Interface), there is Grsync:

http://www.opbyte.it/grsync/

Review (old but still valid):

http://www.dedoimedo.com/computers/grsync.html

For Debian, Ubuntu, Linux Mint etc, install as follows:

Code:

sudo apt-get install grsync

Habitual 07-17-2016 04:38 PM

Backups are Useless if they aren't Tested?
 
"Users". Seriously, it's a bummer and I feel, but
Everyone has to learn the lesson (partitioning), IMO.
It is inevitable. No backups however, is inexcusable.

Have we not all done this exact same "oops" (selected the wrong 'one')?
I did this exact same thing once. I printed http://www.tldp.org/HOWTO/Partition/index.html
and worked it out.

I most certainly wouldn't be responsible for someone's data loss if they have no backup.
I rsync several times a day whether I need it or not. ;)

Doug G 07-17-2016 05:48 PM

+1 for testing your backups.

Back when, a large corporate customer had a DEC system with tape backup, and faithfully did daily backups to tape. When their SMD hard drive failed after a couple of years, it turned out there was a hardware bug in the tape controller (a DEC compatible) and, guess what, they did not have a single good backup tape! This failure led to a couple of managers being fired and the loss of a few man-years of circuit design work and, if I were to guess, millions of dollars.

astrogeek 07-17-2016 06:20 PM

If it hasn't been verified, it isn't a backup. How is that not obvious?

sgosnell 07-17-2016 06:32 PM

My personal files aren't worth millions of dollars, and I don't think I would fire myself, but I do keep multiple backups. I have 3 drives, one of them SSD, and I keep them synced with Dropbox, Box, and Google Drive. I use rclone via cronjobs to keep everything in sync. Rsync works, and rclone is modeled on it (it is a separate program, not an rsync front end), but rsync doesn't do cloud syncing, AFAIK. All the files that I have any worries about having anyone else access are encrypted with gpg before being put in any folder or drive that is synced to the cloud. I have a local veracrypt volume for storing the unencrypted versions. My KeePass database is also synced so I can access it anywhere, but it's also locally encrypted. That's probably the file that would hurt the most to lose, but the odds of losing every copy of it are very long.

syg00 07-17-2016 07:20 PM

So now we come to the elephant in the room. How are backups "verified"?

rsync, while excellent, doesn't check files at the target for validity - not by default anyway.
So those backups are just sitting there waiting for bitrot to occur undetected. Some filesystems can detect this (btrfs and zfs for example), but you need to be using them at the target as well. I also take intermittent CRC protected full filesystem backups.
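
One hedged way to catch that kind of silent corruption on any filesystem is a checksum manifest: hash everything once, then re-hash the backup copy later and compare. A minimal sketch using GNU coreutils follows; the function names make_manifest and verify_manifest are made up for this example, and the manifest file should live outside the directory being hashed:

```shell
#!/bin/sh
# make_manifest DIR MANIFEST
# Record a SHA-256 checksum for every regular file under DIR.
make_manifest() {
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$2"
}

# verify_manifest DIR MANIFEST
# Re-hash the files under DIR and compare with MANIFEST.
# Returns non-zero if any file is missing or its checksum differs.
verify_manifest() {
    ( cd "$1" && sha256sum --check --quiet - ) < "$2"
}
```

For example, `make_manifest "$HOME/data" /tmp/data.sha256` at the source followed by `verify_manifest /mnt/backup/data /tmp/data.sha256` at the target would list any file whose contents no longer match. It only proves the bytes are unchanged, not that the data is usable.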

FWIW, it seems the issue mentioned in the initial post is still unresolved - I answered another post last night.

astrogeek 07-17-2016 08:44 PM

Quote:

Originally Posted by syg00 (Post 5577568)
So now we come to the elephant in the room. How are backups "verified"?

Not so much an elephant as a very good question! But one that has an answer!

Of course it varies with the nature of the thing being backed up, but needs to be done to a level of confidence proportional to the value of the data!

Checksums at source and archive are a good first measure.

As you note, the best (and only) way to know for sure is to use the backup: take it live.

This is the method I use for high value, critical data.

My own strategy is to keep at least one live backup of critical sub-systems, and one or more copies of archived offline backups. When I take a backup of some part of my data, I create the archive and then "restore" from it to my live backup platform. In important cases I maintain and run validation test cases against the restored systems.

This provides the confidence that the checksummed offline backup is valid, and also gives me a fully functional, always available online fallback system.
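
A bare-bones sketch of that archive-then-restore check, assuming plain tar archives and a scratch directory for the restore. The function names make_archive and restore_matches are hypothetical, and a real validation suite would go well beyond a file comparison:

```shell
#!/bin/sh
# make_archive SRC ARCHIVE
# Pack the contents of SRC into ARCHIVE (keep ARCHIVE outside SRC).
make_archive() {
    tar -cf "$2" -C "$1" .
}

# restore_matches SRC ARCHIVE RESTORE_DIR
# Unpack ARCHIVE into a brand-new RESTORE_DIR and compare it with SRC.
# Returns non-zero if extraction fails or any file content differs.
restore_matches() {
    mkdir "$3" || return 1                # refuse to reuse an existing directory
    tar -xf "$2" -C "$3" || return 1
    diff -r "$1" "$3"                     # compares contents, not permissions
}
```

Running `make_archive "$HOME/project" /mnt/backup/project.tar` and then `restore_matches "$HOME/project" /mnt/backup/project.tar /srv/live/project` leaves a restored copy behind that can serve as the live fallback.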

The live backups are never system-wide backups - I do not think that is the best strategy. I prefer incremental backups of smaller parts, separately managed and separately verified, according to their separate requirements. As is often said of security, backups are a process, not a program.

Importantly, you should differentiate between data and system backups - data is the all important thing! Systems can be rebuilt, data must be recovered!

I cringe when I see discussion of hard drive images being used as "backups"! That is like taking the express train to oblivion in my experience! (I know! I did that long ago - two backup hard drives mirroring my critical system - both ultimately failing in the same way, at the same time as the main system, for the same reasons...)


All times are GMT -5.