LinuxQuestions.org
Old 04-04-2009, 02:40 PM   #1
WojtekO
Member
 
Registered: May 2006
Distribution: CentOS 5
Posts: 47

Rep: Reputation: 15
Using tapes for long term data storage. What software to use? Not Amanda.


So the company I work for recently purchased a Dell TL2000 tape box.
They then decided to use Amanda to interface with that machine.

The system admin at the time installed it and quit soon afterwards. Then I got hired (not for his position, I'm IT support) and, besides my regular duties, was put in charge of making Amanda work for us.

After a few weeks of trial and error, chatting on irc and mailing lists, I came to the conclusion that Amanda is not for us as we are not planning to do 'regular backups' as per amanda's definition.
We only need to archive files for long term storage. Tapes will never be overwritten, just accumulated and new ones purchased.

We plan on dumping about 200GB / week.
With Amanda, the problem was that it cannot append 'sessions' to the same tape; it writes to a new tape on every run (a waste of tapes for our purpose). We could tell it to keep the files on its 'holding disk' until 800GB accumulates and only then write them out. The problem with that scenario is that we'd need to buy a few extra hard drives to build a reliable RAID array for that holding disk. 3 projects at a time = 2.4TB of RAID'ed space required.
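
The arithmetic behind those figures, as a small sketch. It assumes decimal units (1 TB = 1000 GB) and that the whole weekly 200GB lands on a single tape; the function names are made up for illustration.

```python
# Numbers taken from the post; decimal units are an assumption (1 TB = 1000 GB).
WEEKLY_DUMP_GB = 200     # "about 200GB / week"
TAPE_BUFFER_GB = 800     # amount held on the holding disk before a tape is written
PROJECTS = 3             # "3 projects at a time"


def weeks_to_fill_tape(weekly_gb, tape_gb):
    """Weeks of dumps needed before one tape's worth of data has accumulated."""
    return tape_gb / weekly_gb


def holding_disk_tb(projects, tape_gb):
    """Holding-disk space needed if every project buffers a full tape's worth."""
    return projects * tape_gb / 1000


print(weeks_to_fill_tape(WEEKLY_DUMP_GB, TAPE_BUFFER_GB))  # 4.0
print(holding_disk_tb(PROJECTS, TAPE_BUFFER_GB))           # 2.4
```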

All this just because amanda cannot write multiple sessions to the same tape.

So my question is: what would be good software to accomplish what we need?
- Be able to write multiple times to the same tape until it's full (!)
- Change tapes when the current one becomes full
- Keep a database (plain-text or mysql) of which files were written to which tape
- Skip file if it's already in the database (optional)

That's it, basically. No nonsense.
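
To make the catalog requirement concrete, here is a minimal sketch of the bookkeeping side only. It is not any existing tool's behaviour. SQLite (from the Python standard library) stands in for the plain-text or MySQL database, the table and function names are hypothetical, and tape usage is estimated from file sizes alone, ignoring archive overhead and hardware compression. Actually writing to the tape is left to the caller.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) the catalog; the schema is an assumption of this sketch."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS archived ("
        "path TEXT PRIMARY KEY, size INTEGER NOT NULL, tape TEXT NOT NULL)"
    )
    return db


def plan_session(db, files, tape, tape_capacity):
    """Pick the (path, size) pairs to append to `tape` in this run.

    Files already in the catalog are skipped. Planning stops at the first
    file that no longer fits, which is the caller's cue to change tapes.
    """
    used = db.execute(
        "SELECT COALESCE(SUM(size), 0) FROM archived WHERE tape = ?", (tape,)
    ).fetchone()[0]
    todo = []
    for path, size in files:
        if db.execute("SELECT 1 FROM archived WHERE path = ?", (path,)).fetchone():
            continue  # already written to some tape
        if used + size > tape_capacity:
            break  # tape is full
        todo.append((path, size))
        used += size
    return todo


def record_session(db, files, tape):
    """Record files as written; call only after the tape write succeeded."""
    db.executemany(
        "INSERT INTO archived (path, size, tape) VALUES (?, ?, ?)",
        [(path, size, tape) for path, size in files],
    )
    db.commit()


def find_tape(db, path):
    """Return the label of the tape holding `path`, or None if unknown."""
    row = db.execute("SELECT tape FROM archived WHERE path = ?", (path,)).fetchone()
    return row[0] if row else None
```

A weekly run would call `plan_session` for the current tape, append the returned files to it, then call `record_session`. When the plan comes back empty while files are still pending, it is time to load the next tape and plan again under the new label.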

Any input greatly appreciated.

Last edited by WojtekO; 04-04-2009 at 03:17 PM.
 
Old 04-05-2009, 11:50 AM   #2
choogendyk
Senior Member
 
Registered: Aug 2007
Location: Massachusetts, USA
Distribution: Solaris 9 & 10, Mac OS X, Ubuntu Server
Posts: 1,189

Rep: Reputation: 105
On the Amanda Users list, you indicated

Quote:
Every week, we'll be dumping about 100Gb of files categorized in
folders, Amanda will then backup those files to tape, and then the
files will be deleted from that folder. Process repeats every week.
Could you explain a bit more about the process that does that initial dump? It seems that's a significant part of the larger picture.
 
Old 04-05-2009, 11:53 AM   #3
reptiler
Member
 
Registered: Mar 2009
Location: Hong Kong
Distribution: Fedora
Posts: 184

Rep: Reputation: 41
I never played with Amanda, but an alternative worth considering might be Bacula.
 
Old 04-06-2009, 01:11 PM   #4
WojtekO
Member
 
Registered: May 2006
Distribution: CentOS 5
Posts: 47

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by choogendyk
On the Amanda Users list, you indicated



Could you explain a bit more about the process that does that initial dump? It seems that's a significant part of the larger picture.
The source will have the following folder structure:
Project1, Project2 -> 2007,2008,2009 -> Jan,Feb,[...],Dec -> 1,2,[..],31

A folder with years containing all months containing all days

Files would be dumped into the appropriate folder where they belong.

amdump Project1 would back up *every new file* under Project1.
Once amdump completed, we would delete all files in the source to save disk space (but keep the folder structure).
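
The cleanup step (delete the files, keep the folders) could look roughly like this. It is a sketch with a hypothetical function name, and it should only run after the tape write has been verified, since it deletes unconditionally.

```python
import os


def purge_files_keep_dirs(root):
    """Delete every non-directory entry below root, leaving all folders in place.

    Returns the number of entries removed.
    """
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            os.remove(os.path.join(dirpath, name))
            removed += 1
    return removed
```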

During the course of the week we'd fill up the source with files and run amdump on the weekends. And repeat every week.

That's how I planned to do the long-term backups.
 
Old 04-06-2009, 06:07 PM   #5
choogendyk
Senior Member
 
Registered: Aug 2007
Location: Massachusetts, USA
Distribution: Solaris 9 & 10, Mac OS X, Ubuntu Server
Posts: 1,189

Rep: Reputation: 105
Where do the files come from? And how is it that they are put in a folder and land in a particular folder? Not trying to be dense, just trying to understand the process behind the scenes that is being backed up.
 
Old 04-07-2009, 02:10 AM   #6
WojtekO
Member
 
Registered: May 2006
Distribution: CentOS 5
Posts: 47

Original Poster
Rep: Reputation: 15
Let's call them daily reports, which are generated by some scripts. Say 10-20GB per day.
Every day they're put in that day's folder /files/YYYY/MM/DD/ on the main server.

Now, this process has been running every day for the last 2 years, and there never was a tape archiving solution.
The result was that every time the main server started to get full, a chunk of those reports was 'temporarily' moved to some secondary servers, only to be forgotten there, creating a huge mess I now have to clean up.

Those files on the secondary servers would have to be manually moved back to the main box in the appropriate date folder where they belong so the tape software could write them to tape.
(I didn't want to archive them from their current location, as that would mean multiple Amanda DLEs and searching through multiple DLEs afterwards if one of those files was needed. That's why I wanted to centralize everything back on the main server, to have 1 DLE with a logical folder structure underneath.)

The goal is basically to a) clean all the secondary servers of those reports and b) automate the archiving of future files to prevent this from happening again.

The way I saw it, after moving the files from a secondary server to main, I'd launch the amdump process, which would write what's in main's DLE, then wait until it finished and delete the files from main.
Repeat this process until all secondary servers are clean, and then set up some kind of script to automate a weekly archiving of future reports older than 3 months.
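
For the weekly script, selecting which day folders are due could be sketched as below. It assumes the /files/YYYY/MM/DD/ layout with numeric folder names and treats "3 months" as 90 days; the function name and the `max_age_days` parameter are illustrative only.

```python
from datetime import date, timedelta
from pathlib import Path


def day_folders_older_than(root, today, max_age_days=90):
    """Return the YYYY/MM/DD folders under root dated before today - max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    old = []
    for day_dir in Path(root).glob("*/*/*"):
        if not day_dir.is_dir():
            continue
        try:
            year, month, day = (int(part) for part in day_dir.relative_to(root).parts)
            folder_date = date(year, month, day)
        except ValueError:
            continue  # not a numeric YYYY/MM/DD folder, leave it alone
        if folder_date < cutoff:
            old.append(day_dir)
    return sorted(old)
```

A cron job on the weekend could call this with `date.today()` and hand the resulting folders to whatever writes the tape.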


3am here, does the above make sense?

Last edited by WojtekO; 04-07-2009 at 02:25 AM.
 
  


All times are GMT -5. The time now is 01:23 PM.