LinuxQuestions.org
Old 10-20-2009, 07:42 PM   #1
BrianK
Senior Member
 
Registered: Mar 2002
Location: Los Angeles, CA
Distribution: Debian, Ubuntu
Posts: 1,334

Rep: Reputation: 51
need to rsync only selected files (--files-from) also need to delete files on dest. ?


I'm working on an offsite backup.

To minimize the amount of data transferred, I've written a script that scours ~100TB of data & builds a long list of the files it needs to back up.
If I then rsync using --files-from, this method works well.

The problem is, that list will change daily & many files that were backed up yesterday will be deleted today & therefore should be deleted in the offsite backup as well. How do I accomplish this? I've tried --delete-excluded in combination with --delete, but I'm not seeing any files being deleted on the destination side.

ideas?
 
Old 10-20-2009, 11:22 PM   #2
Jerre Cope
Member
 
Registered: Oct 2003
Location: Texas (central)
Distribution: ubuntu,Slackware,knoppix
Posts: 323

Rep: Reputation: 37
Why are you doing rsync's work? Rsync will find the files that need to be backed up and, where possible, transfer only the portion of each file that changed. Note the -z option to compress the network traffic.

The --delete option should work. Your homemade lists may be causing a conflict.
 
Old 10-21-2009, 12:42 AM   #3
BrianK
Senior Member
 
Registered: Mar 2002
Location: Los Angeles, CA
Distribution: Debian, Ubuntu
Posts: 1,334

Original Poster
Rep: Reputation: 51
Oh, I'm well aware of how rsync works (well, aside from this particular issue). At the end of the day: because I can't rsync 100TB of data.

Furthermore, most of the data on disk are result sets - I don't need to back up the processed data, I need to back up the files that create the processed data... In the event of a fire or an earthquake (this is southern California, after all), I can pull a minimal backup back online, set several hundred processors to work, and have everything back to the way it was in a matter of a day or three.

Why not just rsync the whole 100TB and let rsync figure out the differences? Two [main] reasons:
1. I would need another 100TB at a co-lo facility. Cost of hardware + cost of rack space + maintenance on that many spindles is prohibitive.
2. My company generates several hundred gigs to possibly 1TB or more of data per day. Transferring that much data would take entirely too much time & cost entirely too much.

It makes little sense to back up data that can easily be regenerated. It makes a lot of sense to back up files that generate other data (which = money). So I have all these fancy scripts that find the generating data files... now I need to back them up.

That was probably a much longer explanation than you were interested in. heheh
 
Old 10-21-2009, 05:04 PM   #4
Jerre Cope
Member
 
Registered: Oct 2003
Location: Texas (central)
Distribution: ubuntu,Slackware,knoppix
Posts: 323

Rep: Reputation: 37
OK

I'm thinking since you've committed this much effort into optimizing, rsync your list of files to the coloc, add a date stamp field to the records, then at the coloc, delete the files from the list after N days.
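The date-stamp idea could look roughly like the sketch below, run from cron at the co-lo. Everything specific here is an assumption, not something stated in the thread: the manifest format (one record per line, an epoch timestamp of when the file was last listed, a tab, then the path relative to the backup root) and the function names expired_paths and prune_backup are made up for illustration. It also relies on "date +%s", which GNU and BSD date support but POSIX does not require.

```shell
#!/bin/sh
# Sketch only: prune backup files whose manifest stamp is older than N days.
# Assumed manifest record: EPOCH<TAB>relative/path

# expired_paths MANIFEST MAX_DAYS [NOW]
# Prints the paths whose stamp is more than MAX_DAYS days before NOW
# (NOW defaults to the current time in epoch seconds).
expired_paths() {
    manifest=$1 max_days=$2 now=${3:-$(date +%s)}
    awk -F '\t' -v now="$now" -v days="$max_days" '
        NF >= 2 && (now - $1) > days * 86400 {
            # print everything after the first tab, so paths may contain tabs
            print substr($0, index($0, "\t") + 1)
        }
    ' "$manifest"
}

# prune_backup MANIFEST BACKUP_ROOT MAX_DAYS
# Removes the expired files under BACKUP_ROOT.
prune_backup() {
    manifest=$1 root=$2 max_days=$3
    expired_paths "$manifest" "$max_days" | while IFS= read -r rel; do
        case $rel in
            # refuse absolute paths and anything containing ".."
            /*|*..*) echo "skipping suspicious path: $rel" >&2; continue ;;
        esac
        rm -f -- "$root/$rel"
    done
}
```

Running expired_paths on its own first gives a list to eyeball before anything is actually removed.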

I've managed to delete files off of 3 machines within minutes using the --delete option, with a little stupidity and bad timing.

Swim swim...
 
Old 10-21-2009, 07:01 PM   #5
BrianK
Senior Member
 
Registered: Mar 2002
Location: Los Angeles, CA
Distribution: Debian, Ubuntu
Posts: 1,334

Original Poster
Rep: Reputation: 51
Interesting idea.

I'd like to think rsync can do what I'm after & I'd rather not rely on crons on both ends. However, that is certainly a route to a solution. If all else fails, I'll go that route. Thanks for the suggestion.

Does anyone know if rsync actually does what I'm after - delete files not listed in a --files-from file?

I see there's an --include-from that looks at patterns & works with --exclude-from... maybe there's something there. hmmm....
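One way the --include-from idea might be made to work is to turn the file list into include rules, exclude everything else, and let --delete-excluded prune whatever is no longer listed. This is a sketch under assumptions, not something confirmed in this thread: the list is assumed to hold one path per line relative to the source root, filenames are assumed to contain no rsync wildcard characters (*, ?, [), and the names build_filter and sync_with_prune are made up for illustration.

```shell
#!/bin/sh
# Sketch only: convert a --files-from style list into rsync include rules.

# build_filter LISTFILE
# Prints an anchored include rule for every listed file and for each of its
# parent directories (rsync must be allowed into a directory to reach a file).
build_filter() {
    awk '
        {
            sub(/^\.?\/+/, "")              # drop a leading "./" or "/"
            if ($0 == "") next
            n = split($0, part, "/")
            path = ""
            for (i = 1; i <= n; i++) {
                path = path "/" part[i]
                # trailing slash on parents so the rule matches directories only
                rule = (i < n) ? path "/" : path
                if (!(rule in seen)) { seen[rule] = 1; print rule }
            }
        }
    ' "$1"
}

# sync_with_prune LISTFILE SRC DEST
# Copies the listed files and deletes everything else under DEST.
sync_with_prune() {
    list=$1 src=$2 dest=$3
    filter=$(mktemp) || return 1
    build_filter "$list" > "$filter"
    # an empty filter would exclude everything and wipe DEST, so refuse
    if [ ! -s "$filter" ]; then
        echo "empty file list, refusing to sync" >&2
        rm -f "$filter"
        return 1
    fi
    rsync -a --delete --delete-excluded \
        --include-from="$filter" --exclude='*' \
        "$src"/ "$dest"/
    status=$?
    rm -f "$filter"
    return "$status"
}
```

Adding -n (dry run) and -v to the rsync line while testing shows what would be deleted without touching the destination.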
 
Old 10-22-2009, 09:52 PM   #6
Jerre Cope
Member
 
Registered: Oct 2003
Location: Texas (central)
Distribution: ubuntu,Slackware,knoppix
Posts: 323

Rep: Reputation: 37
Pardon the tangent. This is a working example:

rsync -azv --delete dutyman:/usr/lib/basic/ usr/lib/basic >>"$LOGMSGFILE" 2>&1 || RSYNCOK="N"

This deletes the files in the target directory that no longer exist in the source.

Perhaps this will help you create a micro model, before you unleash on your macro del mundo system.
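A micro model along those lines could be a throwaway sandbox like the one below. It is only a sketch: the directory layout, the file names, and the helper names make_sandbox and mirror are invented for illustration, and it exercises plain directory mirroring with --delete, not the --files-from case.

```shell
#!/bin/sh
# Sketch only: a tiny sandbox for watching what --delete removes.

# make_sandbox DIR
# Builds DIR/src and DIR/dst, leaving one stale file on the destination side.
make_sandbox() {
    mkdir -p "$1/src/proj" "$1/dst/proj" || return 1
    echo keep  > "$1/src/proj/keep.cfg"
    echo keep  > "$1/dst/proj/keep.cfg"
    echo stale > "$1/dst/proj/stale.cfg"
}

# mirror SRC DST [extra rsync options...]
# Mirrors SRC into DST, deleting anything in DST that SRC no longer has.
mirror() {
    src=$1 dst=$2
    shift 2
    rsync -a --delete "$@" "$src"/ "$dst"/
}

# Example session (not run automatically):
#   box=$(mktemp -d) && make_sandbox "$box"
#   mirror "$box/src" "$box/dst" -n -v    # dry run: lists what would change
#   mirror "$box/src" "$box/dst"          # real run
#   ls "$box/dst/proj"
```

Starting with the dry run (-n) keeps the experiment harmless until the output looks right.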
 
  


All times are GMT -5.