I've gotten on a bit of a backup kick recently (started just before I found out about World Backup Day, ironically), and I've been obsessing about methods and hardware to the point I'm driving myself a bit nuts. These backups only need to be read by Slackware, so choice of filesystem is a bit moot: I've been using EXT4 flash media and external drives. Also, to make this clear, this is for off-line backups, not something like rsync to a remote location or similar.
First, a non-Slackware hardware question: what media do you guys prefer for long-term archival storage? The backup obsession in question started when I discovered my USB flash drive (Patriot XT 32GB--I have no problem identifying the guilty!), which was SUPPOSED to have a 10 year retention, had 8 EXT4 4kB blocks corrupted! Luckily I had another copy of the files elsewhere. Since flash for long-term storage is out, that leaves optical media and hard drives (I'm not rich enough, nor do I have enough data, to think about tape). And if you say hard drives, then bare (plugged into a USB to S/PATA adapter) or enclosed (with its own case and adapter)?
Next, Slackware-specific, "dar" seems like it was custom-made for this kind of thing, even including hashes to make sure things don't get corrupted, yet it isn't included in stock Slackware--any particular reason? In the past I've always just used tar and split with some compressor for non-media files, but I realize now that, without some kind of ED/C (error detection and/or correction) things could get corrupted between the internal drive and the external one or vice-versa (there seems to be enough on the media itself).
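The error-detection half of that concern can be covered independently of the archiver. Below is a minimal sketch, assuming a plain directory tree is copied to the external drive: it records a SHA-256 digest per file before the copy and re-checks the copy afterwards. The function names (`sha256_of`, `build_manifest`, `verify_copy`) are made up for illustration and are not part of dar or tar. This only detects corruption; it cannot repair it.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_copy(manifest, copy_root):
    """Return the relative paths that are missing or differ under copy_root."""
    copy_root = Path(copy_root)
    bad = []
    for rel, expected in manifest.items():
        candidate = copy_root / rel
        if not candidate.is_file() or sha256_of(candidate) != expected:
            bad.append(rel)
    return bad
```

Build the manifest from the internal drive, copy the files, then call `verify_copy(manifest, "/path/to/external")`; an empty list means every file read back with the digest it had before the copy. Storing the manifest next to the backup allows the same check years later.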
Oh, one final thing: I didn't know it before, but it looks like BTRFS has checksum/hash abilities similar to ZFS--would you consider it safe and stable enough to use instead of EXT4 now for this purpose?
Thanks for any and all help!
Last edited by storkus; 04-14-2012 at 09:31 PM.
If you are burning to optical media and the data is important, then burn at least two copies. I have had problems with disks that were burned 3-5 years ago now having read errors in some sectors. You have a much better chance of reassembling the original data using a tool like ddrescue if you have multiple copies.
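To illustrate why multiple copies help, here is a minimal sketch of block-wise merging. It assumes something ddrescue does not need: that per-block SHA-256 digests were recorded when the data was written, so a good block can be recognised. It is not how ddrescue works internally, and the names `block_digests` and `merge_copies` are hypothetical.

```python
import hashlib


def block_digests(data, block_size):
    """Return the SHA-256 digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def merge_copies(copies, digests, block_size):
    """Rebuild data block by block from whichever copy matches each digest.

    Assumes at least one copy and that all copies have the same length.
    Returns (data, unrecovered_indexes). Unrecovered blocks are filled
    with zero bytes so that later offsets stay aligned.
    """
    out = bytearray()
    unrecovered = []
    for index, expected in enumerate(digests):
        start = index * block_size
        for copy in copies:
            block = copy[start:start + block_size]
            if hashlib.sha256(block).hexdigest() == expected:
                out += block
                break
        else:
            # No copy had a good version of this block.
            unrecovered.append(index)
            out += bytes(len(copies[0][start:start + block_size]))
    return bytes(out), unrecovered
```

With two discs that have failed in different sectors, every block is usually good on at least one of them, so the merged result is complete even though neither disc reads cleanly on its own.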
Oh, yes, I intend to make multiple copies, probably to multiple media types (both optical and hard drive). My questions basically revolve around:
1a. If "dar" is so awesome, why is it still only a slackbuild and not in stock Slackware? Is something wrong with it?
1b. Are the advantages enough over tar to go through the effort to install and use dar?
2. Is handling bare drives dangerous (from static, dropping on the table, etc) enough not to use them for backups with a dock? (That is, insert into dock, copy files with dar, tar, etc, remove from dock, put in anti-static bag, put them into permanent storage location.) Or should I stick with portable drives in cases with their own interfaces?
3. Has optical media gotten any better, especially blu-ray? Or should I just not bother and only do hard drives?
4. My observation that flash media is NOT for long-term backup (> a year or two), regardless of what the manufacturer says!
what media do you guys prefer for long-term archival storage?
I use Freecom USB hard drives with an external power supply. They are slow as hell at writing, but they survive for a long time with no trouble. I have had Freecom USB drives working for more than four years, donated them to an animal protection organization and, as far as I know, they still work flawlessly for them to this date. My recommendation is to discard drives periodically in order to avoid data loss from drive aging. Old units can easily be sold; just ensure the buyer is not able to retrieve data from your drives.
For small backups, a bunch of DVDs is nice, cheap and disposable, but I much prefer hard drives. If you have to handle a large amount of data, DVDs are not convenient any more, at least in my opinion. It's kind of a holy war, so don't take it seriously.
I have no experience with Blu-ray, but I think it is slower than other media, and the cost does not help either.
"dar" seems like it was custom-made for this kind of thing, even including hashes to make sure things don't get corrupted, yet it isn't included in stock Slackware--any particular reason?
I have used Dar in the past and it's Ok. I guess the reason why it is not included in Slackware is that there are many other good apps already included (rsync, Tar...) and you can install Dar whenever you want anyway.
In the past I've always just used tar and split with some compressor for non-media files
No need to. Tar has self-splitting capabilities. I have managed to record big backups (300+ GB) to DVDs automatically with scripts and some pipes. Dar may be better for DVD backups, but Tar can do it on the fly if you use your head.
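For reference, GNU tar does its own splitting in multi-volume mode (`--multi-volume` together with `--tape-length`). The sketch below is not tar's implementation; it only shows the underlying idea of cutting one archive into media-sized volumes and joining them again. The names `split_file` and `join_volumes` are made up for illustration.

```python
from pathlib import Path


def split_file(source, volume_bytes, out_dir):
    """Write source as numbered volumes of at most volume_bytes each.

    Simplification: each volume is read into memory in one piece, so a
    real script would stream smaller chunks instead.
    """
    source = Path(source)
    out_dir = Path(out_dir)
    volumes = []
    with open(source, "rb") as handle:
        index = 0
        while True:
            chunk = handle.read(volume_bytes)
            if not chunk:
                break
            volume = out_dir / f"{source.name}.{index:03d}"
            volume.write_bytes(chunk)
            volumes.append(volume)
            index += 1
    return volumes


def join_volumes(volumes, target):
    """Concatenate the volumes, in the order given, back into one file."""
    with open(target, "wb") as out:
        for volume in volumes:
            out.write(Path(volume).read_bytes())
```

The zero-padded suffix keeps the volumes in the right order when sorted by name, which matters because a single misplaced volume makes the joined archive unreadable.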
Just for the information of the readers, I use Tar for my offline backups. I perform periodic Aide checks on the system and critical /home files in order to discover corruption before backing up, then boot my secondary OS (my heavily modified Knoppix, frugally installed) and Tar the whole system onto an encrypted drive. Tar supports differential backups, and I use many drives to have redundancy (if a drive fails, I have another one with another recent backup).
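A differential backup stores only what changed since the last full backup. GNU tar tracks this with a snapshot file (`--listed-incremental`); the sketch below is a much cruder stand-in that selects files purely by modification time, using Python's standard `tarfile` module. The function name `differential_backup` is hypothetical, and unlike tar's snapshot approach this does not notice deleted or renamed files.

```python
import tarfile
from pathlib import Path


def differential_backup(root, archive_path, since_epoch):
    """Archive files under root modified after since_epoch.

    since_epoch is the time of the last full backup, in seconds since
    the Unix epoch. Keep archive_path outside root, otherwise the
    archive could end up inside itself. Returns the archived names.
    """
    root = Path(root)
    added = []
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file() and path.stat().st_mtime > since_epoch:
                rel = path.relative_to(root)
                archive.add(path, arcname=str(rel))
                added.append(str(rel))
    return added
```

Restoring then means unpacking the last full backup first and the newest differential archive on top of it.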
Whatever backup system you choose, take care. Many don't handle metadata accurately and won't make for good full-system backups. Tar and Dar should suffice if used carefully. Which one you should use depends on your preferences. Dar seems easier to use in some scenarios, while Tar is better supported (you could pick up ANY Live distro to restore a Tar backup).
Last edited by BlackRider; 04-15-2012 at 04:17 AM.
par2 can be used to generate files of error detecting and correcting data. When used with backup media they can be used to check that the backup files have not been corrupted and, if they have, to fix them (up to a limit, of course).
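par2 itself uses Reed-Solomon codes and can repair several damaged blocks at once. The sketch below swaps in the simplest possible scheme, a single XOR parity block, purely to show how extra recovery data lets a lost block be rebuilt. It can rebuild exactly one missing block and assumes the damaged block has already been identified (for example by a checksum). All names are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    if len({len(block) for block in blocks}) > 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Return one parity block covering all data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing(blocks, parity):
    """Rebuild the single block marked None from the survivors and parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    survivors = [block for block in blocks if block is not None]
    repaired = list(blocks)
    # XOR of all surviving blocks and the parity yields the lost block.
    repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired
```

The "up to a limit" caveat is visible here: one parity block buys back one lost block, and losing two is unrecoverable. Reed-Solomon schemes such as par2 generalise this so that more recovery data repairs more damage.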
par2 is pretty much essential when using optical media for backup (which is a painful solution -- slow and troublesome) and prudent for other media.
par2cmdline is available as a SlackBuild.
AFAIK there is no media which is suitable for long term archival without periodic refreshing (error detection and fix). Duplication helps. Keeping the archival media in a freezer might help too?
Error correction records can be a pain to keep if the amount of data you handle is big, but are good to consider. However, I consider true duplication to be better, if you have a good backup policy. Remember that error correction records have to be stored somewhere, and can get corrupted too.
AFAIK there is no media which is suitable for long term archival without periodic refreshing
So think I.
For DVD error correction, there is dvdisaster, which can help too.
re: tar and split, I think I was thinking of using split on other things in the past like backing a single large file onto a CD. This being many years ago, I don't think I knew about par and friends at the time, though I did know about md5 and just didn't use it.
FWIW, I didn't even know about par until I started looking around--I stumbled onto it last. Pretty pathetic of me having been using UNIX/internet since 1989 and not knowing about it!
re dar vs tar vs par vs dvdisaster: dar seems to do par's capabilities of EDC, with far better knowledge of disk (random block) based backup than tar, which assumes sequential media. And dvdisaster only works on optical media, unfortunately. An alternative is ZFS or (maybe?) BTRFS instead of dar/par/dvdisaster, but I think that's only a checksum and not true ECC--and besides, ZFS isn't ready on Linux yet and BTRFS is perhaps still iffy (still no fsck yet, but they're really close).
One thing none of you commented on is the bare vs enclosed hard drive question. I don't have a lot of data right now, and there's a place that I can buy 6 80GB IDE drives for $49 close by: burn an identical copy on each one and distribute them to different locations, and it's a lot cheaper than even DVD (33 cents/disk) or BD (95 cents a disk) backup. The big worry, though, is handling and static.
Oh, and Catkin, I know from first-hand experience that hard drives don't like cold, at least not Western Digital ones: I have a friend who, a few years ago, put a brand new WD drive into a computer in the warehouse and discovered it wouldn't boot the next morning. I came over and quickly discovered it wasn't spinning up (though trying). I noticed how cold it was (it was winter and there's no insulation in there) and tried warming it up with my hands, finally getting it to spin up. He got his data off and returned the drive (though there may have been nothing wrong with it). My guess? Either fluid bearings freezing up, moisture inside jamming stuff up, or metal contraction causing a problem.
Of course, all of those to this day are only guesses... As for optical media, I don't think anyone really knows.
There's a place that I can buy 6 80GB IDE drives for $49 close by
If they are second hand, they have had some of their useful life consumed. Remember that every drive dies of old age (even if you only store it in a climatic chamber and never use it, data retention of the disk itself won't last more than 20 years or so).
There was a study somewhere that showed that hard disks are likely to die after 4-5 years of intensive use, or during the first months (the latter due to undetected factory defects). Because of this, I have read suggestions to use only drives between 1 and 4 years old for critical uses. That said, I have owned many drives and most of them survived much longer than that.
As for optical media, I don't think anyone really knows.
Manufacturers provide working specifications for the media, and temperature limits for working and storage. Even the optimal temperature for tape conservation is supposed to be known by the manufacturer.
Last edited by BlackRider; 04-16-2012 at 11:29 AM.
Distribution: Linux From Scratch, Slackware64, Partedmagic
With TB USB drives now so cheap (about £80), I use two, which I alternate and keep unplugged except when dumping or restoring. I use the dump/restore programs, which, although old, are available on just about every distro and can dump and restore a complete working system. I need that because I am a bit of a 'fiddler' and have been known to hose my system periodically. I also have a 1 GB boot partition with pmagic installed for regular backups and file fixing (fstab, rc.S, etc.).
Distribution: Debian Wheezy/Jessie/Sid, Linux Mint DE
Re cost: A common hard drive always costs around $100,-. This has been so for the past 6 or 8 years. Just the capacity has increased. No need to buy a cheap drive. You might save $20,- but you have no idea where the drive comes from. (Obsolete surplus stock??)
Re media: Wouldn't trust my /tmp folder to optical media. Apart from capacity, I have had I don't know how many fail, of all brands, writers, readers and ages. I use external hard drives. Swap them regularly and store the unused one(s) off-site.
Re backup method: after tarring, starring, zipping and whatnot I use rsync. Granted, it doesn't work too well for father/son/grandson schemes. But the cost of 7 drives in a daily schedule and 1 drive/month for long-term storage is negligible if you take data storage seriously.
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
This is a problem that I've been dealing with for quite a few years -- and I've not found a solution that I can trust.
For long-term storage; i.e., greater than 10 years, nothing I have or have tried has proved reliable. I have commercially produced DVDs (movies) and CDs (audio) that are unreadable after five years, CDs and DVDs that I've burned that are unreadable from as little as two years ago, USB sticks that are gone to the great byte bucket in the sky after as little as a year. I've tried so-called archival optical media (that cost an arm, leg and two toes from the other foot!) that have given up the ghost at five years. I do have some 9-track tapes that are over 20 years old that I can still read, however (and a couple that are gone, too). Got a couple of old IDE hard drives sitting on the shelf that are still readable but who knows how long there will be a disk controller that will read the things (remember serial jacks on the back panel of your computer -- seen any lately? How about parallel plugs?). I have a few tape cartridges that I can still read but who knows when the SCSI interface to those will be gone.
I've tried stashing stuff in a safe deposit box at the bank -- the vaults are usually temperature- and humidity-controlled. After ten years, most, not all, but most, are gone. Back in the day we stored 9-tracks in a salt mine; most of those are still all right from what I hear from an old pal still maintaining that stuff.
Remember LaserDisc? Got a player, got a bunch of movies, all of them have laser-rot (which makes them unwatchable). Recorded a bunch of test data on those things, too. All gone.
Tis a puzzlement.
I would note that I've stood in the Bodleian Library at Oxford holding a 600 year old book in my (gloved) hands and could read every word. That tells me that if I really, really care about something, I should put it on acid-free paper and store it somewhere cool and dry.
Optical media has gotten better and will, most likely, continue to improve. But, big but here, will the hardware be available to read the stuff? When's the last time you saw a cassette tape deck in a store (or in the instrument panel of an automobile)? Lots of folks were recording data they cared about on cassettes, what, 10 years ago? How about 8-inch double-side, double-density floppies (I've got a few boxes of those)? How about 5-inch floppies or even the 3-inchers? Got any of those, can you read them?
Of course there's always the "cloud." But a computer's memory is only as long as its power cord and who knows what evil lurks out there in the cloud.
Your best bet would be archival CDs, DVDs and, maybe, Blu-ray: multiple copies stored separately and carefully. But will your great grandchildren be able to view them... who knows. What utility you use to record them ain't really all that relevant (tar may still be around in 20 years, it's been around for at least 30 years that I know of) as long as it's simple and not some Next Great Thing (Blu-ray versus... uh, what was that other thing?).
Distribution: Debian Wheezy/Jessie/Sid, Linux Mint DE
Originally Posted by tronayne
Got a couple of old IDE hard drives sitting on the shelf that are still readable but who knows how long there will be a disk controller that will read the things (remember serial jacks on the back panel of your computer -- seen any lately? How about parallel plugs?).
Absolutely true. As I stated it, it might not have been fully clear that back-up disks should be replaced/renewed/upgraded every few years. I do not store my data on a hard disk and put it on a shelf in case I need it. Hard disks wear out, even on a shelf. The idea is that data is always kept live on the internal storage, and on the backup storage. As the internal storage gets upgraded, so you need to upgrade the external storage. Like keeping a virus alive in the lab. (Bad analogy!)
Having said this, in my home server the file structure (of the user files) is still descending from the very first server I ever built in 1996, and that file structure was copied from my very first 80286 home computer with a 20 MB hard disk. This structure was not recreated, but copied from one storage system to the next during the past 25 years. Backups have evolved from 1.2 MB floppy disks, thru various tape systems (ZIP, QIC from 250 MB to 10GB, DDS-1, DDS-2 and DDS-3 and LTO) to external USB disks at present.
It should also be noted that all my tape systems were eventually decommissioned due to reliability problems.
Thanks for all the extra comments, guys! Most everything you've said echoes what I've seen googling around, and that's what worries me. For short-term/incremental backups, rsync and friends are obviously the ticket; what I don't know is long-term--that is, what software to use and what "keeps", to which the answer seems to be "nothing". Sigh. It almost seems it's actually better to run one or more NAS boxes with a lot of drives in RAID-6 or mirrored JBOD and replace them as needed. Again, I don't have a lot of data to save (well under 100GB at the moment) but I do want to save it long-term.
Again, thanks for the advice. It seems the consensus is not only to back up often but to check your backups often as well!