Old 07-24-2007, 04:43 AM   #1
Registered: Feb 2006
Location: Belgium
Distribution: Debian
Posts: 84

Rep: Reputation: 15
bash backup script

Hi all,

I need to back up 2 TB of data onto 4 external 500 GB hard disks connected via USB 2.0.
Everything that needs to be backed up is under /data.
My first thought was to create one big tar file of everything in /data and split it into 499 MB pieces using split,
but the problem with that approach is that I can't afford to have one external disk die, because then the whole tar file would be useless.
Also, the backup script should use rsync, and rsync + tar don't go well together, I suppose?

So now I'm thinking of writing a script that goes through the files under /data until it has 499 MB worth of them, writes those to the first external disk, then continues with the next files until it again has 499 MB, writes those to the second external disk, and so on.
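The grouping loop described above could be sketched roughly like this (the 499 MB cap, the `group_N.list` file names, and GNU `stat -c %s` are assumptions; on BSD you'd use `stat -f %z`):

```shell
#!/bin/bash
# Walk a directory tree and assign files to numbered list files of at
# most LIMIT bytes each -- one list per target disk. Each group_N.list
# could later be fed to "tar --files-from".
make_groups() {                 # usage: make_groups <dir> <limit-bytes>
    local src=$1 limit=$2 group=1 total=0 f size
    : > "group_$group.list"
    while IFS= read -r -d '' f; do
        size=$(stat -c %s "$f")     # GNU stat; BSD: stat -f %z
        # start a new group when this file would overflow the current one
        if [ $((total + size)) -gt "$limit" ] && [ "$total" -gt 0 ]; then
            group=$((group + 1))
            total=0
            : > "group_$group.list"
        fi
        printf '%s\n' "$f" >> "group_$group.list"
        total=$((total + size))
    done < <(find "$src" -type f -print0)
    echo "$group"               # number of groups written
}
```

For the real job the limit would be something like `$((499 * 1024 * 1024))`; note this simple first-fit loop ignores tar's per-file overhead, so leave some headroom.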

But I don't know how to start with this.
Can anybody help me get going with this?

Old 07-24-2007, 06:32 AM   #2
Senior Member
Registered: Dec 2005
Location: Campinas/SP - Brazil
Distribution: SuSE, RHEL, Fedora, Ubuntu
Posts: 1,397
Blog Entries: 1

Rep: Reputation: 64
Can you put the 4 disks under LVM? That way you get one big 2 TB disk. After that, it is (sort of) simple to transfer the 2 TB from the primary disk to the 2 TB of backup disks in one pass.

The main problem is that if any of the 4 disks fails, you lose all the data in the LVM.

The other problem is the bandwidth, time, and CPU power you need for this synchronization. I'm afraid that as soon as the data has been transferred to the backup disks, it is already outdated.
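The LVM setup could look something like this (a sketch only: the device names /dev/sdb../dev/sde, the volume group name, and the mount point are all placeholders; check `fdisk -l` for the real devices, and remember the caveat that one dead disk kills the whole volume):

```shell
# Join four USB disks into one ~2 TB logical volume, then rsync onto it.
pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde    # mark each disk as a PV
vgcreate backupvg /dev/sdb /dev/sdc /dev/sdd /dev/sde
lvcreate -l 100%FREE -n backuplv backupvg       # one LV spanning all four
mkfs.ext3 /dev/backupvg/backuplv
mount /dev/backupvg/backuplv /mnt/backup
rsync -a --delete /data/ /mnt/backup/           # then a plain rsync suffices
```

These commands need root and destroy whatever is on the four disks, so double-check the device names first.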
Old 07-24-2007, 06:36 AM   #3
Registered: Jun 2006
Location: Mariposa
Distribution: Slackware 9.1
Posts: 938

Rep: Reputation: 30
My approach would be this:

Make a list of all files that will be backed up, along with each file's size. For each file, the size should include not just the size of the actual file, but any overhead that tar includes per file. To determine that overhead (you don't need to be exact, but the closer the better), run a few tar experiments, and examine hex dumps of the output files.
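One quick way to run the overhead experiment suggested above (a sketch; `stat -c %s` is GNU stat, and the exact numbers depend on tar's 512-byte block padding and blocking factor):

```shell
# Tar a single tiny file and compare archive size to file size to get a
# feel for tar's per-file overhead.
tmp=$(mktemp -d)
printf 'hello' > "$tmp/f"                  # a 5-byte file
tar -cf "$tmp/f.tar" -C "$tmp" f
fsize=$(stat -c %s "$tmp/f")
tsize=$(stat -c %s "$tmp/f.tar")
echo "file=$fsize bytes, archive=$tsize bytes, overhead=$((tsize - fsize)) bytes"
```

The archive is always a multiple of 512 bytes (header plus padded data plus end-of-archive blocks), which is why budgeting a fixed per-file overhead works well enough here.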

On each pass, pick 499 MB worth of files, tar them, and send them along to the appropriate disk.

I'm thinking that's easier than keeping track of the size of the tar file as you go and figuring out where to pick up on the next pass. It's easier to split the files into passes before you start each pass.

Your approach is to build the 499 MB of tar data on standard output and send it on the fly, right? In that case, you're right: rsync doesn't look too promising. Have you considered nc6? If it's not on your system, it's available at

The "nc" stands for "netcat"; it's useful for piping stuff through standard output on the sending system and receiving it through standard input on the other end.
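A sketch of that pipeline, with placeholder host and port values (exact flags differ between nc variants such as nc6, GNU netcat, and BSD nc, so check your man page; the list file name is a placeholder too):

```shell
# On the machine with the external disk (start the receiver first):
nc -l -p 9000 > /mnt/backup/group1.tar

# On the source machine: stream a tar of one group's files over the wire.
tar -cf - --files-from group_1.list | nc backuphost 9000
```

The same shape works locally without the network, of course: `tar -cf - --files-from group_1.list > /mnt/disk1/group1.tar`.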

If you're not familiar with bash scripting, google this:

bash script tutorial
and have yourself a read-fest.

Also, do this at the command line:

man bash        # of course!
man tar         # of course!
man od          # for dumping a tar file to determine overhead per file
man less        # you're not likely to fit the dump of a tar file on one screen
man nc6         # if that's the way you want to go
I'm sure someone will come along with a complete solution, but you'll have more fun if you do it yourself!

Hope this helps.
Old 07-24-2007, 06:47 AM   #4
LQ Guru
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
Making very large tar files doesn't seem like a good idea: have you considered the time needed to restore one or more files from the backup? I would consider rsync or dump/restore (be careful) instead. Regarding the size limit, dump could be the right choice (see man dump):
dump can detect end-of-media. When the specified size is reached, dump waits for you to change the volume.
Anyway, be careful: to dump a filesystem or part of it, it is highly recommended to mount it read-only, so that no part of the block device gets updated during the backup process.
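A sketch of what that could look like, assuming /data is its own filesystem and the target path is a placeholder (dump's -B option takes the per-volume size in 1 KB blocks, so ~499 MB is about 510976 blocks; dump pauses for a volume change when the cap is reached):

```shell
# Level-0 dump of /data, capped at ~499 MB per volume, with the
# filesystem remounted read-only for the duration.
mount -o remount,ro /data                       # per the caution above
dump -0u -B 510976 -f /mnt/backup/data0.dump /data
mount -o remount,rw /data
```

The -u flag records the dump in /etc/dumpdates so later incremental dumps (levels 1-9) only pick up what changed.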
Old 07-24-2007, 07:33 PM   #5
LQ Guru
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.9, Centos 7.3
Posts: 17,357

Rep: Reputation: 2367
Can you make the backup disks a RAID set with parity?
Also, consider making the chunks ~450 MB and piping them through gzip, either file by file or by tarring first and then gzipping.
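A small-scale demo of the tar-then-gzip idea, cut into sized pieces with split; the sizes and paths here are throwaway demo values (for the real job you'd use something like `split -b 450m` pointed at /data). Note the earlier caveat still applies: every piece is needed to restore.

```shell
# Demo of: tar | gzip | split, then reassembly with cat | gunzip | tar.
# Sizes are tiny on purpose; scale -b up (e.g. 450m) for real use.
tmp=$(mktemp -d)
mkdir "$tmp/src"
dd if=/dev/urandom of="$tmp/src/file" bs=1024 count=64 2>/dev/null
tar -cf - -C "$tmp" src | gzip | split -b 16k - "$tmp/part_"
ls "$tmp"/part_*                                # the sized pieces
cat "$tmp"/part_* | gunzip | tar -tf -          # verify: list archive contents
```

Because split names the pieces in lexical order (part_aa, part_ab, ...), a plain `cat` of the glob reassembles the stream in the right order.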

