LinuxQuestions.org
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Hardware
Old 05-21-2012, 10:58 PM   #1
timmcart
LQ Newbie
 
Registered: May 2012
Posts: 8

Rep: Reputation: Disabled
Create TAR archives to two different LTO-5 drives at once


I need to be able to write a tar archive to two SAS LTO-5 tape drives simultaneously. I work for a post-production house where we have to create redundant tape backups from large image sequences. We are typically backing up 1-3TB of data a day, which is nearly impossible to do using only one LTO drive and swapping tapes.

My first thought was to send the tar archive to STDOUT and then pipe that data into the tee command to write multiple files at once. This works great when writing to a hard drive, but I haven't been able to figure out the best way to write to the tape drives via tee.

Here is what I've tried:
tar -cvf - /sequenceDirectory | tee >(dd of=/dev/nst0) >(dd of=/dev/nst1) > /dev/null

I was hoping not to have to create a temporary .tar file on the hard drive before writing to the tape. Please let me know if anyone has a solution. Also, is there a way to verify that all the data in the directory was successfully written to the tape?
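On the verification question, which the thread never answers directly: one common approach is to checksum the data stream as it goes to tape, then read the tape back and compare. A minimal sketch, with a regular file standing in for the drive (all paths here are made up for illustration):

```shell
# Sketch: verify a tape write by checksumming the stream on the way out
# and again when reading it back. /tmp/fake_tape stands in for /dev/nst0,
# and /tmp/sample.tar plays the role of the tar stream.
head -c 1048576 /dev/urandom > /tmp/sample.tar

# Checksum while writing (tee copies the stream onto the "tape")...
SUM_WRITTEN=$(tee /tmp/fake_tape < /tmp/sample.tar | md5sum | cut -d' ' -f1)
# ...then rewind (mt -f /dev/nst0 rewind on real hardware) and re-read.
SUM_READ=$(md5sum < /tmp/fake_tape | cut -d' ' -f1)

[ "$SUM_WRITTEN" = "$SUM_READ" ] && echo "verify OK"
```

On a real drive the read-back is a full second pass over the tape, so it roughly doubles the job time.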

Any help is greatly appreciated!
 
Old 05-22-2012, 07:50 AM   #2
MensaWater
LQ Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, CoreOS, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 7,831
Blog Entries: 15

Rep: Reputation: 1669
Why not just write to one tape drive then do the tar from that drive to the other one? No need to create an intermediate file on disk.
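The tape-to-tape approach suggested here could look roughly like the following sketch. Regular files stand in for the two tape devices, and the directory name and block size are illustrative assumptions, not from the thread:

```shell
# Sketch of the tape-to-tape idea: write the archive to the first
# "tape", then duplicate it block for block onto the second.
# /tmp/tapeA0 and /tmp/tapeA1 are stand-ins for /dev/nst0 and /dev/nst1.
mkdir -p /tmp/seq_demo
head -c 65536 /dev/urandom > /tmp/seq_demo/frame_0001

# Pass 1: tar the sequence directory straight onto the first tape.
tar -cf /tmp/tapeA0 -C /tmp seq_demo

# Pass 2: duplicate tape 0 onto tape 1.
# On real drives this would be roughly: dd if=/dev/nst0 of=/dev/nst1 bs=256k
dd if=/tmp/tapeA0 of=/tmp/tapeA1 bs=256k 2>/dev/null

cmp -s /tmp/tapeA0 /tmp/tapeA1 && echo "tapes identical"
```

The trade-off, as discussed below, is that the second pass ties up both drives for the duration of the copy.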

Have you looked at backup solutions like Bacula (OpenSource) or NetBackup (Commercial) to manage this for you?
 
Old 05-22-2012, 10:23 AM   #3
timmcart
LQ Newbie
 
Registered: May 2012
Posts: 8

Original Poster
Rep: Reputation: Disabled
Hi MensaWater,

Thanks for your reply. The biggest issue with writing to one drive first and then copying to the other is the time it takes to get that data over to the 2nd drive. Both tape drives would be tied up twice as long; basically, I could achieve the same thing with one tape drive. I do, however, see the benefit of going tape to tape: I could deliver the source drive back to the production company sooner.

Another reason I was thinking about creating the temp tar first is write speed. I'm lucky to see 20-25MB/sec going from the source eSATA drive directly to tape. This is because we are writing thousands of small files (around 7MB each) instead of fewer larger files, so the drive never reaches its maximum streaming speed before it has to stop and reposition. Copying larger files I see speeds up to 100MB/sec or more. I'm thinking that creating one large temporary .tar archive on our RAID and then copying to both tape drives from that temp file at once will be the fastest way to get the job done.
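The staging idea described above can be sketched as a two-step job. This is a sketch only; the paths and block size are made up for illustration, with small stand-in files in place of the real RAID, image sequence, and tape device:

```shell
# Sketch of the staging idea: bundle many small files into one large
# archive on fast disk, then stream that single file to tape.
# /tmp/raid_demo stands in for the RAID, /tmp/stage_tape for /dev/nst0.
mkdir -p /tmp/raid_demo /tmp/frames_demo
for i in 1 2 3; do
    head -c 8192 /dev/urandom > /tmp/frames_demo/img_$i
done

# Step 1: one sequential write to fast disk (thousands of files in reality).
tar -cf /tmp/raid_demo/job.tar -C /tmp frames_demo

# Step 2: one large sequential read feeding the tape.
# Real hardware: dd if=/tmp/raid_demo/job.tar of=/dev/nst0 bs=1M 2>write.log
dd if=/tmp/raid_demo/job.tar of=/tmp/stage_tape bs=1M 2>/dev/null
```

The point of the design is that step 2 is a single large sequential read, which is what a streaming tape drive wants to see.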

I briefly looked into the Bacula software you mentioned, but at first glance it appeared to be designed for routine, scheduled backups. The data we are backing up comes in from multiple sources, and my concern is having to reconfigure the backup software every time a new batch of footage arrives. Any insight you can provide on how these applications work is greatly appreciated.
 
Old 05-22-2012, 10:39 AM   #4
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,930

Rep: Reputation: 7321
Is it possible to access both tapes simultaneously?
 
Old 05-22-2012, 11:02 AM   #5
mesiol
Member
 
Registered: Nov 2008
Location: Lower Saxony, Germany
Distribution: CentOS, RHEL, Solaris 10, AIX, HP-UX
Posts: 731

Rep: Reputation: 137
When writing tens of thousands of small files to tape, think about a temporary staging area. As you already mentioned, larger files will speed up performance.
 
Old 05-22-2012, 11:30 AM   #6
lithos
Senior Member
 
Registered: Jan 2010
Location: SI : 45.9531, 15.4894
Distribution: CentOS, OpenNA/Trustix, testing desktop openSuse 12.1 /Cinnamon/KDE4.8
Posts: 1,144

Rep: Reputation: 217
Hi

Create a TAR on your hard drive and then copy it to both tape drives.
 
Old 05-22-2012, 04:45 PM   #7
timmcart
LQ Newbie
 
Registered: May 2012
Posts: 8

Original Poster
Rep: Reputation: Disabled
Thank you all for the input.

I am currently doing speed tests creating tar files first and then going to both tape drives at once. I will post up my findings once the tests are completed.
 
Old 05-29-2012, 05:17 PM   #8
timmcart
LQ Newbie
 
Registered: May 2012
Posts: 8

Original Poster
Rep: Reputation: Disabled
Okay, so I created a tar archive of one sequence of images. This archive is 39.2GB in size. I am still not seeing speeds anywhere near 100MB/s. Currently I can only get write speeds around 17MB/s (roughly 1GB/min) using the following command:

pv -B 4m -tbr /archive.tar | tee >/dev/nst0

I have tried several different buffer sizes ranging from 512 bytes to 4GB in the pv parameters. This has not affected the write speeds at all. Running the same command, but writing to the same hard drive that I'm reading from, I get an average of 120MB/s.

Does anyone have any suggestions as to what the bottleneck might be?
 
Old 05-29-2012, 08:13 PM   #9
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,783

Rep: Reputation: 2214
The buffer size that pv uses does not affect the buffering in the tee command, which is probably using the default 4K block size. You would need to run the output through something like dd to reconstruct the larger blocks:
Code:
pv -B 4m -tbr /archive.tar | tee | dd bs=4M iflag=fullblock of=/dev/nst0
or, for two drives:
Code:
pv -B 4m -tbr /archive.tar | tee >(dd bs=4M iflag=fullblock of=/dev/nst1) | dd bs=4M iflag=fullblock of=/dev/nst0
Keeping a fast tape drive streaming at full speed can be quite a challenge. Back in the days of slower disks and processors, even with a slower tape drive (DDS2) I found it necessary to write a circular buffering program in order to keep the pipeline flowing and the tape drive streaming. The script to set up that pipeline and handle all the places that errors could occur was rather horribly complex.
 
1 member found this post helpful.
Old 05-30-2012, 03:28 PM   #10
timmcart
LQ Newbie
 
Registered: May 2012
Posts: 8

Original Poster
Rep: Reputation: Disabled
Is there any alternative to using the dd command? It spits a bunch of gibberish out in my terminal window, and my throughput drops to 182Kb/s when using it.
 
Old 05-30-2012, 07:10 PM   #11
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,783

Rep: Reputation: 2214
Getting rid of the junk from dd is simple: just redirect its stderr to /dev/null or, better, to some file you can examine later should there be some error. But that isn't going to help your throughput problem. The only thing I can guess at this point is that there might be some issue in tee w.r.t. blocking I/O and block sizes larger than the size of a FIFO buffer. I fear I've about reached the limit of what I can suggest. If I were fighting a problem like this on my own system, I'd no doubt be writing code by now.

I did just think of one thing, though. Since you've now got the archive in a file, there's no reason you couldn't start up two independent processes, each reading from the file and writing to its own tape drive. That way there would be no need for tee and pipelines.
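The two-independent-readers idea above can be sketched like this. Stand-in files are used in place of the tape devices; on real hardware the two `of=` targets would be /dev/nst0 and /dev/nst1, and the paths here are made up:

```shell
# Sketch: two independent dd processes each read the same archive file
# and write to their own "tape"; no tee, no shared pipeline.
# /tmp/tapeB0 and /tmp/tapeB1 stand in for /dev/nst0 and /dev/nst1.
head -c 1048576 /dev/urandom > /tmp/archive_demo.tar

dd if=/tmp/archive_demo.tar of=/tmp/tapeB0 bs=1M 2>/tmp/dd0.log &
dd if=/tmp/archive_demo.tar of=/tmp/tapeB1 bs=1M 2>/tmp/dd1.log &
wait    # block until both writers are done

cmp -s /tmp/archive_demo.tar /tmp/tapeB0 \
  && cmp -s /tmp/archive_demo.tar /tmp/tapeB1 \
  && echo "both copies complete"
```

A side benefit of this design: each reader runs at its own pace, so a slow drive no longer stalls the fast one, and errors can be checked per process via each dd's exit status and log file.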
 
Old 05-30-2012, 07:55 PM   #12
timmcart
LQ Newbie
 
Registered: May 2012
Posts: 8

Original Poster
Rep: Reputation: Disabled
Thanks for the input. I had a suspicion that tee might be the culprit. I will try bypassing it tomorrow when I get to the office. I should be able to whip out a python script that reads in blocks of the file and sends them out to be written to the tape drive.

You mentioned that there were quite a few points of error when scripting this sort of thing in an earlier post. If you know of any gotchas off the top of your head, I'm all ears. Thanks again for all your help.
 
Old 05-30-2012, 11:35 PM   #13
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,783

Rep: Reputation: 2214
Nothing specific w.r.t. errors, just that when you have a complex pipeline set up you need to be sure that no error in any stage will slip by unnoticed. The script that performed my backups wrote its output to the tape as well as keeping an online copy, included the aforementioned buffering to keep the tape streaming, calculated an MD5 sum of the data stream as it was being generated, had options for compression, maintained an index of where things were on the tape, kept track of tape usage, ... . Making sure the script didn't try to continue in the face of errors in any of that got pretty complex.

Makes me glad I'm not using tape for backup any more. 'Course the scripts I've got to work around issues with the backup tool I do use now are even worse, but that's progress for you.
 
Old 06-01-2012, 02:34 PM   #14
timmcart
LQ Newbie
 
Registered: May 2012
Posts: 8

Original Poster
Rep: Reputation: Disabled
Glad to report that bypassing tee did the trick! I am able to keep a consistent speed of around 90MB/s now, which is an amazing increase. I did a test tarring directly to the tape, and that went at about 65MB/s. I will not be the only person running these backups, so I might create a GUI to make creating the .tar archives and getting them onto and off of the tape easier. If so, I will make the code available to help others with the same issues.

Just as a recap:
- Create your tar archive first so that maximum speeds can be achieved.
- Use pv and dd together to get your fastest write speeds and to monitor throughput. The LTO-5 drive seemed to like a block size between 384k and 1024k. Redirect dd's stderr to /dev/null (or a log file) to keep its statistics out of the terminal window:
pv -B 1024k -tbr your_archive.tar | dd bs=1024K iflag=fullblock of=/dev/nst0 2> /dev/null
- Pull the data off of the tape by using the mt command to position the head at the appropriate file mark (see the mt manpage for the right command), then use dd to read the data back:
mt -f /dev/nst0 bsf 1
dd bs=1024K if=/dev/nst0 of=/your_dir/your_archive.tar 2> /dev/null

Thanks everyone for your help!
 
1 member found this post helpful.
Old 05-15-2013, 10:30 AM   #15
Moouton
LQ Newbie
 
Registered: May 2013
Posts: 1

Rep: Reputation: Disabled
Hi,

I also work in a post-production environment and I have to create LTO-5 tapes for one of my teams.
Like you, I would like to create two LTO tapes with the same contents simultaneously (one for each of our archive sites).
I read the previous posts carefully, and they were very interesting.

In the end, it seems that you found the correct settings, but I didn't understand whether you managed to write both LTO tapes simultaneously, or one at a time.

I tried different things with pv and dd, and I did manage to write files to one LTO at 186MB/s with your command line, but I didn't understand how to run it on both LTO drives.

Could you help me?
 
  

