LinuxQuestions.org


lazardo 10-07-2012 11:48 PM

simple bullet-proof backup
 
I was thinking about security/backups/rsync/RAID et al and focused on the more catastrophic scenarios (flood, house fire, explosion and theft). That pointed to the question "What would be most painful to lose?", which in turn pointed at the home directory (projects, passwords, account numbers, email, other emotional trivia), digital images and music, and transcoded video.

Remote systems are (unreliable|impractical|inadequate|slow) as you approach 100s of MB and beyond, so I ended up with a simpler model: a USB3 SATA enclosure and two 1TB WD Green disks (about 140 USD total).

The init process:
a. Rsync images, videos and music; tar+compress+encrypt /home -> Disk_A
b. Copy Disk_A -> Disk_B (this involved opening the case and using a spare SATA slot)
c. Drop off Disk_A at an [assumed reliable] relative's or friend's house.

Post-init:
1. Frequently update Disk_B (sketched below).
2. Periodically swap disks.
3. Repeat this cycle.
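
A minimal sketch of the "update Disk_B" step, assuming Disk_B mounts at /mnt/backup and the media lives under /data (label and paths are made up; PWd must already be exported, as in the variant further down):
Code:

# Mount the local backup disk (label is hypothetical).
mount LABEL=Disk_B /mnt/backup

# Mirror the bulky, already-compressed media with rsync.
rsync -a --delete /data/images /data/music /data/video /mnt/backup/

# Refresh the encrypted /home archive: same tar|pigz|openssl pipeline
# as in the tests below, just run locally.
tar cpf - -C / home | pigz -p 3 -nT -1 |
  openssl enc -blowfish -pass env:PWd > /mnt/backup/home.tgz.enc

umount /mnt/backup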

I decided against LUKS (Occam's razor) and went for spot encryption instead. Here's a LAN test with a 5GB /home across GigE, using the two fastest machines. /etc and /var should probably be added to the encryption soup as well.

Source: 2.6.37.6c #8 SMP x86_64 i3 CPU M 330
Target: 3.2.13 #1 SMP x86_64 AMD Phenom(tm) II X4 B55

Test 1, source does compression - the assumption being a faster backup because pre-compressed data crosses the network.
Code:

# SOU=hp_lap; DIR=home; date
# (
> ssh $SOU "cd /; sudo tar cpf - $DIR | /usr/bin/time pigz -p 3 -nT -1" |
> tee >( md5sum > $SOU.$DIR.md5 ) >( /usr/bin/time openssl enc -blowfish -pass env:PWd |
> tee >( md5sum > $SOU.$DIR.enc.md5 ) > $SOU.$DIR.enc )
> ) > /dev/null
# date
Sun Oct  7 19:10:05 PDT 2012
425.76user 15.06system 3:34.73elapsed 205%CPU (0avgtext+0avgdata 14352maxresident)k
32inputs+8outputs (1major+961minor)pagefaults 0swaps
46.21user 8.03system 3:34.92elapsed 25%CPU (0avgtext+0avgdata 8112maxresident)k
0inputs+0outputs (0major+563minor)pagefaults 0swaps
Sun Oct  7 19:13:40 PDT 2012

Test 2, target does compression - the assumption being a faster backup thanks to the target's extra horsepower.
Code:

# date
# (
> ssh $SOU "cd /; sudo tar cpf - $DIR" |
> /usr/bin/time pigz -p 3 -nT -1 |
> tee >( md5sum > $SOU.$DIR.md5 ) >( /usr/bin/time openssl enc -blowfish -pass env:PWd |
> tee >( md5sum > $SOU.$DIR.enc.md5 ) > $SOU.$DIR.enc )
> ) > /dev/null
# date
Sun Oct  7 19:15:33 PDT 2012
180.50user 10.50system 3:08.85elapsed 101%CPU (0avgtext+0avgdata 21360maxresident)k
0inputs+0outputs (0major+969minor)pagefaults 0swaps
36.91user 4.93system 3:08.85elapsed 22%CPU (0avgtext+0avgdata 8112maxresident)k
0inputs+0outputs (0major+562minor)pagefaults 0swaps
Sun Oct  7 19:18:42 PDT 2012

In this case horsepower wins (3:08 vs 3:34), which is not what I would have guessed. Test, then implement, as they say.

$ md5sum hp_lap.home.enc; cat hp_lap.home.enc.md5
4c054de8bfc522d2c77c91e704584ac6 hp_lap.home.enc
4c054de8bfc522d2c77c91e704584ac6 -

That takes care of even the California tendency to slide into the sea and harbor sloppy public natural gas utilities.

Cheers,

AndyGrove ("only the paranoid survive") variant - same pipeline, but covering /etc, /var and /home and using a better cipher:
Code:

# Source host, passphrase (passed to openssl via env) and cipher are placeholders.
SOU=my_puter
export PWd="my long and hard to crack passphrase"
CIP=my-better-cipher

# For each directory: tar on the remote host, checksum the raw tar,
# compress, checksum the compressed stream, encrypt, checksum and keep
# the encrypted archive.
for DIR in etc var home; do
 echo "$DIR - `date`"
 ( ssh $SOU "cd /; sudo tar cpf - $DIR" |
 tee >( md5sum > $SOU.$DIR.tar.md5 ) >( pigz -p 3 -nT -1 |
 tee >( md5sum > $SOU.$DIR.tgz.md5 ) >( openssl enc -$CIP -pass env:PWd |
 tee >( md5sum > $SOU.$DIR.enc.md5 ) > $SOU.$DIR.enc ))
 ) > /dev/null
done
echo "done - `date`"

# Verify: recompute the checksum at each stage (encrypted, decrypted,
# decompressed) and compare against the stored .md5 files.
ls -l $SOU.*
for DIR in etc var home; do
 echo $SOU $DIR ======
 cat $SOU.$DIR.enc.md5
 cat $SOU.$DIR.enc | md5sum
 cat $SOU.$DIR.tgz.md5
 cat $SOU.$DIR.enc | openssl enc -d -$CIP -pass env:PWd | md5sum
 cat $SOU.$DIR.tar.md5
 cat $SOU.$DIR.enc | openssl enc -d -$CIP -pass env:PWd | pigz -d | md5sum
done


Mark Pettit 10-08-2012 01:56 AM

I think it's worth noting a few extra points here.

1) Having a backup (external) disk within a few meters of your source data won't solve data loss under fire, earthquake, theft. I prefer having 2 disks, one of which is taken to another location (a friend in another suburb, work-office etc).
2) Backup frequency is determined by answering this question : How much am I prepared to lose ?
3) Back up only what needs to be backed up - this simply means don't back up stuff that can easily be recovered by other means. So don't back up all the Slackware /usr/bin, /usr/share etc., but do back up your /etc dir (settings), /home and maybe some other stuff: photos, email. Don't bother with movies, as these can invariably be recovered elsewhere (especially if you were generous with them :-)
4) Use rsync. People get tired of waiting for backups to complete, and then become lazy. But if the backup process only takes 2 minutes, then we'll happily do it. Rsync only backs up what has changed (see the sketch below this list).
5) Sh1t happens !
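
A minimal sketch of points 3 and 4 together, assuming the backup disk is mounted at /mnt/backup (hypothetical path):
Code:

# Only the hard-to-recreate stuff; the distro (/usr/bin, /usr/share, ...) is skipped.
rsync -a --delete /etc /home /mnt/backup/

# Same thing as a dry run, to see how little actually changed since last time:
rsync -an --delete --itemize-changes /etc /home /mnt/backup/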

BlackRider 10-08-2012 05:18 AM

Quote:

4) Use rsync. People get tired of waiting for backups to complete, and then become lazy. But if the backup process only takes 2 minutes, then we'll happily do it. Rsync only backs up what has changed.
Yes, rsync is very handy, but many other solutions offer selective backup based on changes. The algorithm rsync uses can, by the way, skip some changed files, because it uses a limited set of parameters for detecting changes, and those are not bullet-proof. You can use the "checksum" option, but then the backup will take ages to complete...

The risk is minimal anyway. I found the particular problems described in an article somewhere, but I can't find it right now.
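
To make the trade-off concrete, a quick sketch (paths are hypothetical):
Code:

# Default quick check: size and mtime only. Fast, but a file whose content
# changed while size and mtime stayed the same would be skipped.
rsync -a /home/me/ /mnt/backup/home/

# Full content comparison: catches everything, but reads every byte on
# both sides, so it takes much longer.
rsync -a --checksum /home/me/ /mnt/backup/home/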

jtsn 10-08-2012 12:05 PM

I use ext4, LVM snapshots and the Unix tool dump for backups. It's way faster than file-based backup, can do incremental/differential backups, saves all file attributes, handles sparse files, supports hard links and doesn't change anything on the source partition (esp. atime and ctime). Cloning a filesystem using dump/restore is a cakewalk.
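
A rough sketch of that workflow, with a made-up volume group vg0 holding a logical volume named home, and a made-up backup path:
Code:

# Take a snapshot so the backup sees a frozen, consistent filesystem.
lvcreate --snapshot --size 2G --name home_snap /dev/vg0/home

# Level-0 (full) dump; -u records it in /etc/dumpdates so a later
# level-1 run only picks up what changed since this one.
dump -0u -f /mnt/backup/home.0.dump /dev/vg0/home_snap

# Later, an incremental:
#   dump -1u -f /mnt/backup/home.1.dump /dev/vg0/home_snap

# Drop the snapshot when done.
lvremove -f /dev/vg0/home_snap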

Unfortunately dump is not part of Slackware, but it's available on slackbuilds.org (although it's not the most recent version).

ChrisAbela 10-08-2012 03:14 PM

Quote:

Unfortunately dump is not part of Slackware, but it's available on slackbuilds.org (although it's not the most recent version).
I am aware that dump on SBo is not the most recent version. I had spent quite some effort trying to update dump, as I am the maintainer, but I had problems compiling it on Slackware. I do not remember the reason, but if I remember correctly, the same problem was reported on Debian. I wrote upstream for guidance, but none was forthcoming.

Alternative solutions to dump might be bacula and burp. SlackBuilds for these may be found in the SBo repo.

Chris

jtsn 10-10-2012 03:04 AM

Quote:

Originally Posted by ChrisAbela (Post 4800612)
I am aware that dump on SBo is not the most recent version. I had spent quite some effort trying to update dump, as I am the maintainer, but I had problems compiling it on Slackware. I do not remember the reason, but if I remember correctly, the same problem was reported on Debian. I wrote upstream for guidance, but none was forthcoming.

Seems like an autohell issue. Configure screws it up. It misses -lpthread -lcom_err from GLIBS in dump/Makefile and adds -DTRANSELINUX to ALL_CFLAGS and -lselinux to LIBS in restore/Makefile.

Both issues break the build on Slackware, but I don't know enough about autoconf to fix the respective .in files.


Edit:

Looks like gcc tries to link -lext2fs statically, because it picks up /lib/libext2fs.a. This should not happen; it should use libext2fs.so, but that lives in /usr/lib/libext2fs.so. The same thing happens with com_err. This is caused by
Code:

# Fix up package:
mkdir -p $PKG/usr/lib${LIBDIRSUFFIX}
mv $PKG/lib${LIBDIRSUFFIX}/pkgconfig $PKG/lib${LIBDIRSUFFIX}/*.so \
  $PKG/usr/lib${LIBDIRSUFFIX}
( cd $PKG/usr/lib${LIBDIRSUFFIX}
  for i in *.so ; do
    ln -sf /lib${LIBDIRSUFFIX}/$(readlink $i) $i ;
  done
)

in e2fsprogs.SlackBuild

This breaks the dynamic linking of libext2fs if you use the output of
Code:

pkg-config --libs ext2fs

which on Slackware 14.0 is
Code:

-L/lib -lext2fs

So I think it's a bug in the e2fsprogs package (in either the pkgconfig files or the SlackBuild script).

Restore can be fixed by ./configure --enable-transselinux=NO (the SlackBuild already does this; I missed it).

jtsn 10-10-2012 04:22 AM

This is a workaround for the problem described above:

Code:

--- dump.SlackBuild        2012-10-02 19:59:14.000000000 +0200
+++ dump.SlackBuild        2012-10-10 11:12:53.436330689 +0200
@@ -6,7 +6,7 @@
 # 30.07.2010
 
 PRGNAM=dump
-VERSION=${VERSION:-0.4b43}
+VERSION=${VERSION:-0.4b44}
 BUILD=${BUILD:-1}
 TAG=${TAG:-_SBo}
 
@@ -54,6 +54,7 @@
 
 # rmt is available on Slack's tar package, so I am disabling it.
 # The fully qualified mandir is necessary.
+EXT2FS_LIBS="-lext2fs -lcom_err" \
 CFLAGS="$SLKCFLAGS" \
 ./configure \
  --prefix=/usr \

It prevents configure from using pkg-config.

BTW: Low-level file system utilities like dump/restore belong in /sbin. This would also resolve the conflict with /usr/sbin/restore from the tar package. However, restore uses libreadline, which resides in /usr/lib, so this library should be linked statically into restore to make it work without /usr mounted. This patch does that:
Code:

diff -ru dump-0.4b44.orig/configure.in dump-0.4b44/configure.in
--- dump-0.4b44.orig/configure.in        2011-05-23 10:32:23.000000000 +0200
+++ dump-0.4b44/configure.in        2012-10-10 11:42:25.473478895 +0200
@@ -508,7 +508,7 @@
        fi
 fi
 if test "$READLINE" = yes; then
-        READLINE="-lreadline $rdllib"
+        READLINE="-Wl,-Bstatic -lreadline $rdllib -Wl,-Bdynamic"
 fi
 AC_SUBST(READLINE)

You have to add a call to autoreconf to the SlackBuild script to make this work, and of course change the prefix to /.
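
Roughly (untested), the configure part of the SlackBuild would then end up looking something like this; the mandir value is just an assumption:
Code:

# Regenerate the build system after patching configure.in ...
autoreconf -fi

# ... then configure with prefix / so dump/restore land in /sbin.
EXT2FS_LIBS="-lext2fs -lcom_err" \
CFLAGS="$SLKCFLAGS" \
./configure \
  --prefix=/ \
  --mandir=/usr/man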

ruario 10-10-2012 05:57 AM

I wouldn't gzip compress the tar archives. I gave my reasoning in another thread so I'll just quote myself. ;)

Quote:

Originally Posted by ruario (Post 4790081)
you might want to reconsider gzip compressed tars, because a single corrupt bit near the beginning of the archive means the rest of the file is a write-off. This is less of an issue when using an actual disk for backup as opposed to media like DVDs, Blu-ray, etc., but still something to consider. Personally I would either skip compression or use xar, dar or afio instead, all of which can compress files individually as they are added (afio gives you the most compression options, since you can specify any compressor you like). This is safer, as any corruption will mean only losing some of your files. Alternatively (or better yet in addition) look at making parity archive volume sets. Check out the par2cmdline utils, an implementation of the PAR v2.0 specification.
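
For illustration, a per-file-compressed archive with afio looks something like this (archive path is made up):
Code:

# Each file is gzipped individually as it is added, so a corrupt region
# only takes out the files stored in that region, not the whole archive.
cd / && find home -print | afio -o -Z /mnt/backup/home.afio

# Extract later, relative to the current directory:
afio -i -Z /mnt/backup/home.afio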


lazardo 12-10-2012 05:02 PM

Quote:

Originally Posted by ruario (Post 4802007)
I wouldn't gzip compress the tar archives. I gave my reasoning in another thread so I'll just quote myself. ;)

Thanks to your comment I am now adding PAR2 recovery to the compressed archives.
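
For reference, roughly what that looks like with the par2cmdline tools (redundancy level is just an example, filename taken from the earlier test):
Code:

# Create recovery volumes with ~10% redundancy next to the archive.
par2 create -r10 hp_lap.home.enc

# Later: check the archive and, if needed and possible, repair it.
par2 verify hp_lap.home.enc.par2
par2 repair hp_lap.home.enc.par2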

Cheers,

ruario 12-11-2012 01:51 AM

Quote:

Originally Posted by lazardo (Post 4846581)
Thanks to your comment I am now adding PAR2 recovery to the compressed archives.

Cheers,

Cool, that will help if you have a problem, though I would still advise using internal compression with an Afio, Dar or Xar container over externally compressed tarballs (or dropping compression altogether if you have the space). PAR2 recovery will allow you to perfectly fix a certain percentage of data corruption (depending on what you set for "Level of Redundancy"). If you have a higher amount of corruption, your par files will be of no use, but with internally compressed archives you could still recover all but the damaged files (that is why I said "or better yet in addition" above).

Perhaps you would consider this overkill, but then you did say this would be a bulletproof backup! ;)

EDIT: Ah, you are encrypting anyway. In that case ignore the above.

konsolebox 12-11-2012 03:48 AM

You could actually use a filesystem with transparent compression. It's better that way, since files remain easily accessible. If you need encryption, you could encrypt the filesystem; I haven't tried it yet, but there are ways to do that, for example: http://wiki.centos.org/HowTos/EncryptedFilesystem. For filesystems with transparent compression, there's a comparison here: http://en.wikipedia.org/wiki/Comparison_of_file_systems. LUKS is also portable to other platforms, btw: http://www.freeotfe.org/
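
A rough sketch of that combination, using LUKS plus btrfs purely as an example (device and mount point are made up):
Code:

# One-time setup: LUKS container with a compressing filesystem inside.
cryptsetup luksFormat /dev/sdX1
cryptsetup luksOpen /dev/sdX1 backup
mkfs.btrfs /dev/mapper/backup

# Per backup run: mount with transparent compression, sync, close again.
mount -o compress=zlib /dev/mapper/backup /mnt/backup
rsync -a --delete /etc /home /mnt/backup/
umount /mnt/backup
cryptsetup luksClose backup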

neymac 12-11-2012 08:45 AM

For the most important stuff I use Dropbox as a backup, but I only have 3.25 GB free of charge, and have used only 2 GB of it to store my most precious data (files and folders). It is automatic (the files are updated as they change, and the changes are recorded as well).

NeoMetal 12-11-2012 02:40 PM

If you do incremental or differential backups, remote backups become more feasible, and they might be more likely to save data in the catastrophes mentioned than a disk in the same building.

duplicity is a cool utility for encrypted, remote differentials
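
For example (the target URL and paths are placeholders):
Code:

# First run is a full backup, later runs are incremental; everything is
# GnuPG-encrypted before it leaves the machine.
duplicity /home/me sftp://user@remotehost//srv/backups/home

# Restore into a scratch directory:
duplicity restore sftp://user@remotehost//srv/backups/home /tmp/home-restore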

qweasd 12-11-2012 03:45 PM

#op: How many restore points do you have, then? How old of a restore point can you guarantee at any time?

