LinuxQuestions.org: Backup strategy & solutions
(https://www.linuxquestions.org/questions/linux-general-1/backup-strategy-and-solutions-4175422337/)

workent 08-16-2012 12:38 AM

Backup strategy & solutions
 
Hi all,

I've been thinking recently that I need to develop & deploy a more comprehensive backup strategy than the ad-hoc "solution" I use at present. I've been putting feelers out for opinions on IRC, but thought I should do this in a more structured (& shared) manner.

Some of my current setup:
* FreeNAS with RAIDZ2 (soon RAIDZ3) - double parity, roughly RAID 6
* Ubuntu (currently 12.04.x) server running KVM on LVM2 partitions
* A mix of clients, including various POSIX desktops (Ubuntu, CentOS/Fedora, Mac, Android, etc.), and a few Windows instances for testing
* Gigabit ethernet

I have a few developments in the works, and a *lot* of "dogfooding":
* In the process of building a redundant FreeNAS with RAIDZ3 for testing
* Moving my VM data onto my NAS & connecting via iSCSI
* Spinning up some SQL VMs for replication
* PXE to simplify triage & cloning
* Scaling down the use of heavy clients in favour of devices like the Hackberry/Raspberry Pi acting as thin clients for VDI or terminal services (remote X, VNC, RDP, NX, SPICE, etc.), so that I can nuke any client & be back to my previous state over PXE within a day

What I want to implement at the end of the day:
* Incremental client backups of hosts/data to the server
* SQL replication to a dedicated (sandboxed) VM
* Host pauses & makes weekly snapshots of VMs
* Redundant NAS does a daily rsync of the data on the primary NAS (see the sketch after this list)
* NAS makes weekly/monthly archives of the backed-up data & pushes them to the secondary NAS, to removable HDDs/media, and/or to an off-site facility such as TarSnap (Amazon, peer site, colo, whatever)
* Periodically test & nuke any archives older than a month
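
For the NAS-to-NAS sync, what I have in mind is roughly the following - just a sketch, with the hostname, rsync module, and paths made up for illustration:

Code:

#!/bin/sh
# daily pull of the primary NAS onto this (secondary) NAS
# "primary-nas", the "tank" module, and the paths are placeholders
SRC="rsync://primary-nas/tank/"
DST="/mnt/tank-mirror/"

# -a preserves permissions/owners/times/links, --delete keeps the mirror
# exact, --numeric-ids avoids UID/GID remapping between the two boxes
rsync -a --delete --numeric-ids "$SRC" "$DST"

# crontab entry to run it at 03:00 every day:
# 0 3 * * * /root/bin/nas-mirror.sh >> /var/log/nas-mirror.log 2>&1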

Now I know that rsync (+ tar) is the de facto standard for POSIX, and short of getting jiggy with some bash scripting & cron (& probably making some really catastrophic typos), I'm looking for established systems (& methodologies) I can use for the purpose.
There are a few good references available on the subject:
* http://wiki.linuxquestions.org/wiki/Backup
* https://help.ubuntu.com/community/BackupYourSystem
Both are good resources, but neither delves into much detail on the systems listed.

There are a few criteria that I'm trying to keep in mind (this is not only for myself, but stems from interactions & observations with non-techs):
* FLOSS (& actively maintained) - I try to encourage people to make use of Open Source software whenever possible, irrespective of their current OS.
* I've been asked for FLOSS backup solutions on a number of occasions, not always by Linux users, so the system, or at least an agent, needs to have support for multiple OSes (.deb, .rpm, Windows, Mac, etc.)
** A GUI or WebGUI is a big selling point (I probably don't need it myself, but it sweetens the deal for newcomers)
* Network support - a client-side agent is OK; not needing one is even better.
* The actual backups/archives need to make sense. I know that a few systems archive (& encrypt) data into monolithic volumes, but if a layman were to dig into the data to try & locate a file, that could be off-putting.
* Support for (encrypted) online/off-site backup, as well as removable disks (e.g. plugging in a 1TB USB HDD every Friday/weekend)

I've been looking into a few systems, but have not come to any conclusion on any of them (just sharing my list):
* rsync & tar
** Grsync
** DeltaCopy
** Synametrics
** cwRsync
** rsnapshot
* Déjà Dup
* Simple Backup & Restore
* Amanda & Zmanda
* Bacula
* Duplicity & Duplicati
* BackupPC
* Synkron
* TimeVault
* FlyBack
* Back-in-Time
* rdiff-backup
* Simple Backup


What I'd like to know:
* Which of these systems have community/forum members been using, & what have your observations been?
* Do you know of systems that meet the aforementioned criteria (including any on the list provided)?
* Are there any excellent systems that meet the criteria but are missing from my list?
* Any wisdom you'd like to share?

Any help & insights would be greatly appreciated.

- J

GATTACA 08-17-2012 08:30 AM

Are you 100% committed to FLOSS?

It sounds from your post that you are looking for a backup solution for a company (correct me if I'm wrong).
If you are dealing with company data, you can't risk its loss due to a software problem.

I use CrashPlan as a service (http://www.crashplan.com/).
They have enterprise-level support if you're willing to pony up the $$$.

The nice thing about them is that you can back up all 3 major OS types: Winblows, Linux, Mac OS X.
Also, you can access the data from anywhere, and they take daily snapshots.

Just a commercial solution that works really well.
(just my 0.02)

workent 08-17-2012 04:51 PM

Hi GATTACA,

I try to make use of FLOSS wherever & whenever I can (with a few exceptions - mostly dogfooding for support purposes).
In principle, I'm 100% committed to FLOSS; in practice, maybe not entirely (closer to 99%).

My SOHO setup is mostly FLOSS, as I like to use first-hand the systems I recommend to others.

CrashPlan is a compelling solution, and is one of the hosted services that I can put forward to interested parties who need off-site backup.
Other services/solutions in that space I've encountered include (AlternativeTo has a good list):
* a co-lo server, or a partner/peer site
* Amazon
* Dropbox
* ZumoDrive
* ADrive
* Ubuntu One
* SpiderOak
* Wuala
* Jungle Disk
* TarSnap
* RetroShare

The remote/online/"cloud" stuff is only the last link in the chain (still an important one, granted) - it's the internal backup regimen & system that I'm focusing on at present.

linux999 08-21-2012 03:35 PM

I use rsync for backing up miscellaneous stuff, and an encrypted USB stick for the more sensitive stuff.

KenJackson 09-04-2012 09:45 PM

I haven't used any of those. I use a cron script which tars, encrypts, and copies to my gaggle of USB sticks.
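
The core of it is only a few lines. A stripped-down sketch (the directory list, GPG recipient, and mount point are placeholders, not my real setup):

Code:

#!/bin/sh
# tar the important directories, encrypt, and drop onto a mounted USB stick
STAMP=$(date +%Y%m%d)
USB=/media/usbstick                 # placeholder mount point
DIRS="/etc /home/ken/docs"          # placeholder directory list

# tar+gzip to stdout, encrypt to a GPG public key, write to the stick
tar czf - $DIRS | gpg --encrypt --recipient backup@example.org \
    > "$USB/backup-$STAMP.tar.gz.gpg"

# restore with: gpg --decrypt backup-YYYYMMDD.tar.gz.gpg | tar xzf -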

One reason I don't want to use an existing solution is that I expect there's a higher chance it could be attacked. The more knowledge the enemy has, the more he can hurt you. Since I do everything my own way, attackers would have to shoot in the dark.

Another reason is that I enjoy writing shell scripts.

suicidaleggroll 09-04-2012 09:57 PM

I wrote a script to do incremental backups a while back. The first time you run it, it just copies everything over to your other drive. On every run after that, it compares each file you want to back up to the version in the previous backup. If they're the same, it hard-links the file from the previous backup into the new backup. If they're different, it copies the file into the new backup.

The end result is a set of backups, one for each run. Each backup contains the active version of every file at the time of the backup, yet the disk space used is only that of the files that changed since the last backup. Since they're all hard links, you can remove any backup without affecting the others. This easily lets you do, say, daily backups, then once a backup is more than a month old thin them out to weekly, then monthly, and so on - all without using more disk space than a single backup plus extra copies of whatever changed between runs.
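
For anyone who wants the same behaviour without writing the comparison logic by hand, rsync's --link-dest option does the compare-and-hard-link step itself. A minimal sketch of the idea (the paths are made up):

Code:

#!/bin/sh
# snapshot-style incremental backup: unchanged files are hard-linked to
# the previous snapshot, changed files are copied, so every snapshot
# looks complete but only costs the space of what changed
SRC="/home/"
DEST="/backup/snapshots"            # placeholder destination
NEW="$DEST/$(date +%Y-%m-%d_%H%M%S)"
LAST=$(ls -1d "$DEST"/*/ 2>/dev/null | tail -n 1)

if [ -n "$LAST" ]; then
    rsync -a --delete --link-dest="$LAST" "$SRC" "$NEW"
else
    rsync -a "$SRC" "$NEW"          # first run: plain full copy
fi

Deleting any snapshot directory only removes its hard links; the other snapshots are untouched.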

workent 09-04-2012 10:06 PM

Quote:

Originally Posted by suicidaleggroll (Post 4772851)
I wrote a script to do incremental backups a while back. ... Since they're all hard links, you can remove any backup without affecting the others.

This is almost identical to what I have in mind for my own (proposed) setup.

This seems to be the most logical & robust way to keep incremental backups relatively accessible, without too much resource overhead.

Basically, what I'm looking for is such a system implemented in a nice package/GUI that I can then suggest to non-techs (& techs) to keep their houses in order.

tigger908 09-05-2012 03:08 AM

I'm using luckyBackup (http://luckybackup.sourceforge.net/) to put stuff on a remote server via ssh. It uses delta compression and should allow automation through cron (I can't seem to get that to work, however!).
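
For reference, cron jobs run with a minimal environment and no X display, which often trips up GUI-oriented tools - that may well be my problem. The crontab entry itself is simple enough (the script path here is just a placeholder):

Code:

# m h dom mon dow  command
# run the backup every night at 02:00, logging output so failures show up
0 2 * * * /home/me/bin/nightly-backup.sh > /tmp/backup-cron.log 2>&1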

JeremyBoden 09-18-2012 06:40 AM

I use rsync to an external disk periodically (using NFS), and "Simple Backup" on a daily basis to an internal disk.

Habitual 09-18-2012 07:54 AM

Quote:

Originally Posted by KenJackson (Post 4772835)
...I use a cron script which tars, encrypts and copies to my gaggle of USB sticks....

I personally find mounting our (internal) AWS s3:// buckets the easiest to "maintain". I use mysqldump for the entirety of our Zabbix database (historical data in every dump), combined with a date-stamped .tar.gz of the DocumentRoot of every Apache virtualhost (and the applicable MySQL db if one is installed for that same virtualhost). It's probably not ideal, but it works for me. :)
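
In sketch form, the nightly job boils down to something like this (names and paths scrubbed/made up):

Code:

#!/bin/sh
# dump the whole Zabbix DB (historical data and all), then archive each
# vhost's DocumentRoot; everything date-stamped into the mounted bucket
STAMP=$(date +%Y-%m-%d)
DEST=/mnt/s3-backups                # placeholder for the mounted s3:// bucket

# --single-transaction gives a consistent dump without locking InnoDB tables
mysqldump --single-transaction -u backup -p'secret' zabbix \
    | gzip > "$DEST/zabbix-$STAMP.sql.gz"

# one archive per virtualhost DocumentRoot
tar czf "$DEST/example.org-$STAMP.tar.gz" /var/www/example.org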

Quote:

Originally Posted by KenJackson (Post 4772835)
Another reason is that I enjoy writing shell scripts.

Ditto.

I have a Bacula server installed, but I quickly moved away from it when I was disappointed by the "proprietary format" of the archive.

I can 'tar zxf' a .tar.gz faster than I can re-install and re-configure Bacula, so it mostly just sits there. I also had issues using it to grab some SQLExpress DBs on Windows hosts, where I didn't want to enable any Shadow Copy services (it's not my Windows server), and the server sits in a closed, proprietary grid environment.

I don't know if that helps, but there are going to be many replies to your inquiry, and probably just as many creative solutions.

You sound prepared, so you should land on your feet. Good luck!

chrism01 09-19-2012 07:41 PM

Quote:

I have a Bacula server installed, but I quickly moved away from it when I was disappointed by the "proprietary format" of the archive.
For anyone else who's worried about that issue, Amanda/Zmanda uses native tools in the background :)

jefro 09-19-2012 07:47 PM

I keep wanting to try FOG.

I still use g4u a lot. I guess I could just use dd over FTP or nc.
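
dd over nc is only two commands, though netcat flags differ between implementations - a sketch:

Code:

# on the receiving box (traditional netcat; BSD nc drops the -p)
nc -l -p 9000 > sda.img

# on the box being cloned - boot a live CD first so the disk is quiescent
dd if=/dev/sda bs=1M | nc receiver-host 9000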

A real-time or live-state backup option in open source might be an idea.

