LinuxQuestions.org
Old 10-01-2012, 09:30 AM   #1
otaviolb
LQ Newbie
 
Registered: Oct 2012
Location: Minas Gerais, Brazil
Distribution: Opensuse and Debian
Posts: 13

Rep: Reputation: Disabled
Double use of find


Hi!
I'm trying to figure out how to use "find" to solve my problem.

I had a directory tree on my hard disk that was a mess. I backed it up to a DVD and then fully rearranged it under $HOME/newdirstruct, which is now much better organized. I moved the files using Dolphin (openSUSE 12.1), so the old tree no longer exists on the hard disk.

After using newdirstruct for some time, I realized that several files were corrupted. I simply copied them from the backup DVD and they were fine.

I want no more surprises, so I wish to copy all files from the DVD over their counterparts in the new tree. I would like to do something like this:
Code:
$ cd newdirstruct
$ find . -type f -exec 'find /media/backup_old_dir/ -name {} -exec cp --remove-destination {} {} \;' \;
I did not even try this, because I do not know how to express that, in the cp command, the first bracket pair {} belongs to the second (inner) find, while the second bracket pair {} belongs to the first (outer) find.

Files that exist only in the new tree will be preserved. Changes made to existing files since the backup will be lost, so I am trying to keep track of them in order to redo them afterwards.
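
One way to keep the two placeholders apart is to avoid nesting the -exec clauses at all: the outer find only lists the destination files, a shell loop reads them, and {} is then seen by the inner find alone. What follows is a minimal sketch under stated assumptions, not a tested recipe for this particular DVD. It assumes bash, GNU find and GNU cp; the function name restore_by_name and its two arguments are hypothetical. It matches on the basename only, takes the first match on the backup (-quit), and -name treats the basename as a glob pattern, so names containing characters such as [ or * may not match themselves.

```shell
# restore_by_name BACKUP_ROOT NEW_ROOT   (hypothetical helper, not from the thread)
# For every regular file under NEW_ROOT, overwrite it with the first file
# under BACKUP_ROOT that has the same basename.
restore_by_name() {
    local backup_root=$1 new_root=$2 dest
    # Outer find: list destination files NUL-separated, so any filename is safe.
    find "$new_root" -type f -print0 |
    while IFS= read -r -d '' dest; do
        # Inner find: here {} is the file found on the backup,
        # while "$dest" is the file found by the outer find.
        find "$backup_root" -type f -name "${dest##*/}" \
            -exec cp --remove-destination -- {} "$dest" \; -quit
    done
}
```

It could be called as restore_by_name /media/backup_old_dir "$HOME/newdirstruct", preferably on a scratch copy of the tree first.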

I'm glad to read your opinions.
 
Old 10-01-2012, 10:19 AM   #2
MensaWater
LQ Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 6,674
Blog Entries: 14

Rep: Reputation: 1021Reputation: 1021Reputation: 1021Reputation: 1021Reputation: 1021Reputation: 1021Reputation: 1021Reputation: 1021
While you might modify find for this purpose a better idea might be to use rsync, which is actually designed to replicate directories from one server to another but can also be used within a single server. It has many flags, including those that tell it to overwrite only older files or smaller files, and a whole host of other options.

Type "man rsync" for more details.
 
Old 10-01-2012, 10:47 AM   #3
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,332
Blog Entries: 55

Rep: Reputation: 3533Reputation: 3533Reputation: 3533Reputation: 3533Reputation: 3533Reputation: 3533Reputation: 3533Reputation: 3533Reputation: 3533Reputation: 3533Reputation: 3533
Quote:
Originally Posted by MensaWater View Post
While you might modify find for this purpose a better idea might be to use rsync
The OP said he modified the directory structure ("fully rearranged under $HOME/newdirstruct, now much more organized"), and unless my rsync-fu is that bad, rsync won't handle that.

Now if you write the hashes of all files on the DVD to a file:
Code:
find /media/backup_old_dir -type f -printf "%s %T@ %p\n" | while read -r FSIZE FMODTIME FFULL; do
 FHASH=($(md5sum "${FFULL}")); echo "${FHASH[0]} $FSIZE $FMODTIME $FFULL"; done > /dev/shm/backup_old_dir.ndx
then you can compare those hashes, modification time, size in bytes and file basename (if unique) with those of the files on disk:
Code:
find "$HOME/newdirstruct" -type f -printf "%s %T@ %p\n" | while read -r FSIZE FMODTIME FFULL; do
 # md5sum needs the full path; the basename is only used for the lookup
 FNAME="${FFULL##*/}"; FHASH=($(md5sum "${FFULL}"))
 grep -m1 -F -- "/${FNAME}" /dev/shm/backup_old_dir.ndx | while read -r OLDHASH OLDSIZE OLDMTIME OLDNAME; do
  # print files whose hash differs from the DVD copy; size and mtime can be compared the same way
  [ "${OLDHASH}" = "${FHASH[0]}" ] || echo "${FFULL}"
 done
done
leaving you with, depending on how much was corrupted before or during transfer or edited in the meantime, a list of files that don't match a known hash, which could narrow down the set of files to copy (again, it depends).
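
The basename lookup above is only reliable when basenames are unique within a tree. A quick way to check that before trusting the comparison could look like the sketch below. It assumes GNU find; the function name list_duplicate_names is hypothetical, and filenames containing newlines are not handled.

```shell
# list_duplicate_names ROOT   (hypothetical helper, not from the thread)
# Print every basename that occurs more than once under ROOT.
list_duplicate_names() {
    # %f prints the basename only; uniq -d keeps names that repeat.
    find "$1" -type f -printf '%f\n' | LC_ALL=C sort | uniq -d
}
```

Running it once on /media/backup_old_dir and once on $HOME/newdirstruct shows which names would need to be matched by hand.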

That's quite cumbersome, and IMHO an easier way to narrow down the set, depending on how many files and how much corruption you're talking about, would be to use md5deep:
Code:
md5deep -r /media/backup_old_dir > /dev/shm/backup_old_dir.md5
md5deep -r -X /dev/shm/backup_old_dir.md5 $HOME/newdirstruct
as "-X" would give you a list of files that don't match a known hash and therefore should be added to the list of files to copy.
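
If md5deep is not installed, the same kind of negative matching can be approximated with md5sum, find and grep. This is only a sketch under assumptions: bash and GNU coreutils/findutils are assumed, the function name list_unmatched is hypothetical, and it is not a drop-in replacement for md5deep. Like "-X", it ignores file names and locations and looks only at whether a hash is known.

```shell
# list_unmatched KNOWN_ROOT CHECK_ROOT   (hypothetical helper, not from the thread)
# Print files under CHECK_ROOT whose MD5 hash matches no file under KNOWN_ROOT.
list_unmatched() {
    local known_root=$1 check_root=$2 known hash file
    known=$(mktemp) || return 1
    # One hash per line for every file on the backup. md5sum prefixes the line
    # with a backslash for unusual filenames, so strip that before cutting.
    find "$known_root" -type f -exec md5sum {} + |
        sed 's/^\\//' | cut -c1-32 | sort -u > "$known"
    find "$check_root" -type f -print0 |
    while IFS= read -r -d '' file; do
        hash=$(md5sum < "$file" | cut -c1-32)
        # -q quiet, -x whole-line match, -F fixed string
        grep -qxF -- "$hash" "$known" || printf '%s\n' "$file"
    done
    rm -f "$known"
}
```

For example, list_unmatched /media/backup_old_dir "$HOME/newdirstruct" would print the files in the new tree that have no byte-identical counterpart anywhere on the DVD.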
 
Old 10-01-2012, 11:08 AM   #4
otaviolb
LQ Newbie
 
Registered: Oct 2012
Location: Minas Gerais, Brazil
Distribution: Opensuse and Debian
Posts: 13

Original Poster
Rep: Reputation: Disabled
Thank you MensaWater for your comment.

Using rsync would be nice and welcome. I have used it for syncing directories that have the same tree structure. In my case, however, I completely changed the tree structure: many subdirectories were created and others were deleted.
For instance, ~/newdirstruct/a/b/c/foo.odt was originally ~/olddirstruct/d/foo.odt

If I use something like:
Code:
$ rsync -r olddirstruct/ newdirstruct/
the old, messy, undesired tree will be reconstructed. Without --delete, ~/newdirstruct/d/foo.odt would be created, leaving the (possibly corrupted) ~/newdirstruct/a/b/c/foo.odt untouched. With --delete, newdirstruct ends up with the same shape, I mean the same tree, as olddirstruct.

I don't want the old tree, because I spent much time organizing the new one.

A further complication for rsync is the change of timestamp when I moved the files.

I think 'find' can dig both trees to tell me where a file is; 'rsync' compares trees and updates one of them.
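
The idea of letting find dig through both trees to say where a file lives can be sketched as a join on the basename. This is an illustration under assumptions: GNU find, sort and join are assumed, the function name pair_by_name is hypothetical, and a basename that occurs several times in a tree produces one output line per combination.

```shell
# pair_by_name OLD_ROOT NEW_ROOT   (hypothetical helper, not from the thread)
# Print "basename<TAB>old path<TAB>new path" for every basename found in both trees.
pair_by_name() {
    local old_root=$1 new_root=$2 tab old_list new_list
    tab=$(printf '\t')
    old_list=$(mktemp) || return 1
    new_list=$(mktemp) || return 1
    # join needs both inputs sorted on the join field in the same locale.
    find "$old_root" -type f -printf '%f\t%p\n' | LC_ALL=C sort -t "$tab" -k1,1 > "$old_list"
    find "$new_root" -type f -printf '%f\t%p\n' | LC_ALL=C sort -t "$tab" -k1,1 > "$new_list"
    LC_ALL=C join -t "$tab" "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}
```

With the example above, pair_by_name ~/olddirstruct ~/newdirstruct would list foo.odt together with its old location under d/ and its new location under a/b/c/.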
 
Old 10-01-2012, 04:50 PM   #5
otaviolb
LQ Newbie
 
Registered: Oct 2012
Location: Minas Gerais, Brazil
Distribution: Opensuse and Debian
Posts: 13

Original Poster
Rep: Reputation: Disabled
Thank you very much unSpawn.

I generated the .ndx files as you suggested and sorted them by the second column (size):
Code:
$ sort -n -k2 /dev/shm/backup_new_dir.ndx > sorted_backup_new_dir.ndx
$ sort -n -k2 /dev/shm/backup_old_dir.ndx > sorted_backup_old_dir.ndx
I manually cut the extremes (the first and the last lines, to exclude files that were not saved on the DVD or are new). There are lots of mismatches (over 1000, I guess). Even files that apparently are not corrupted show subtle differences in size, and therefore have different md5sums. I am still checking what went on.
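
Instead of comparing the sorted lists by eye, the two index files could be compared by basename with awk. This is a rough sketch, assuming both files use the "HASH SIZE MTIME FULLPATH" line format from the earlier post; the function name report_mismatches is hypothetical, and when a basename occurs more than once in the backup index the last entry wins.

```shell
# report_mismatches OLD_NDX NEW_NDX   (hypothetical helper, not from the thread)
# Print each file of NEW_NDX whose hash differs from the backup entry with
# the same basename, followed by both sizes.
report_mismatches() {
    awk '
        # Drop the first three space-separated fields, leaving the path.
        function path_of(line) { sub(/^[^ ]+ [^ ]+ [^ ]+ /, "", line); return line }
        # Drop everything up to the last slash, leaving the basename.
        function base_of(s)    { sub(/.*\//, "", s); return s }
        FNR == NR {                       # first file: the backup index
            b = base_of(path_of($0))
            old_hash[b] = $1; old_size[b] = $2
            next
        }
        {                                 # second file: the new tree index
            p = path_of($0); b = base_of(p)
            if ((b in old_hash) && old_hash[b] != $1)
                printf "%s\told_size=%s\tnew_size=%s\n", p, old_size[b], $2
        }
    ' "$1" "$2"
}
```

Called as report_mismatches /dev/shm/backup_old_dir.ndx /dev/shm/backup_new_dir.ndx, it shows at a glance whether a mismatch comes with a size change.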
 
Old 11-07-2012, 02:56 PM   #6
otaviolb
LQ Newbie
 
Registered: Oct 2012
Location: Minas Gerais, Brazil
Distribution: Opensuse and Debian
Posts: 13

Original Poster
Rep: Reputation: Disabled
Double use of find

unSpawn's response inspired me. I decided to use a brute-force approach: copy all files from olddirstruct to newdirstruct.

Just a reminder: when I created newdirstruct, the time stamps changed.

The problem now is the files that were updated or created since the backup. So I worked out the following:
1) Make an image file of newdirstruct, naming it "brandnew"
2) Copy all files from the DVD (olddirstruct) to newdirstruct. cp has the option --preserve=timestamps for this
3) Mount the image and rsync all files from brandnew to newdirstruct, updating a file only if the copy in brandnew is newer.
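
The "update only if newer" rule in step 3 is what rsync -u applies. For a single pair of files the same rule can be sketched with the shell's -nt test. This is an illustration only, assuming bash and GNU cp; the function name copy_if_newer is hypothetical.

```shell
# copy_if_newer SRC DST   (hypothetical helper, not from the thread)
# Overwrite DST with SRC only when SRC has a newer modification time
# (or when DST does not exist yet).
copy_if_newer() {
    local src=$1 dst=$2
    if [ "$src" -nt "$dst" ]; then
        # Keep the source's timestamps so later comparisons stay meaningful.
        cp --preserve=timestamps -- "$src" "$dst"
    fi
}
```

The rule is only as good as the timestamps it compares, which is why step 2 preserves them when copying from the DVD.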

I have not run the following code yet; I am waiting for comments on possible drawbacks.

Thanks in advance.

Code:
#!/bin/bash
# Make image of new folder (-x options omitted)
genisoimage --allow-lowercase -r -N -iso-level 4 -d -o /tmp/newdirstruct_freshbackup.iso /home/otavio/newdirstruct
# Make list of files as "full path<TAB>basename" (further -prune options omitted)
find /home/otavio/newdirstruct \( -path /home/otavio/newdirstruct/BibliotecaDigital -prune \) -o -type f -fprintf /tmp/new.ndx "%p\t%f\n"
# Make similar list of files with old directory structure, backed up on a DVD
find /media/backup_abr_2012 -type f -fprintf /tmp/old.ndx "%p\t%f\n"
# Start search: for each file in the new tree, take the first DVD file with the same basename
while IFS=$'\t' read -r nomecompletonew nomenew; do
        # "/name<TAB>" only matches lines whose basename is exactly $nomenew
        grep -m1 -F -- "/${nomenew}"$'\t' /tmp/old.ndx | while IFS=$'\t' read -r nomecompletoold nomeold; do
                # Append to the log; a plain ">" would truncate it on every copy
                cp --preserve=timestamps -v "$nomecompletoold" "$nomecompletonew" >>/tmp/logdacopia
                printf '%s %s\n' "$nomecompletoold" "$nomecompletonew" >>/tmp/listadearquivoscopiados
        done
done </tmp/new.ndx
# Now the corrupted files are supposed to be gone.
# Next, recover the files that were updated between the DVD burn date and today.
# A second DVD will not be burned: the ISO image is loop-mounted as "brandnew"
sudo mkdir -p /mnt/newdirstruct
sudo mount -o loop,ro,uid=1000,gid=1000 /tmp/newdirstruct_freshbackup.iso /mnt/newdirstruct
# Make list of files, now from the "brandnew" image
# Now timestamps are trustworthy
# Use rsync to update files accordingly
find /mnt/newdirstruct -type f -fprintf /tmp/brandnew.ndx "%p\t%f\n"
while IFS=$'\t' read -r nomecompletobrandnew nomebrandnew; do
        grep -m1 -F -- "/${nomebrandnew}"$'\t' /tmp/new.ndx | while IFS=$'\t' read -r nomecompletonew nomenew; do
                # -u: skip files that are newer at the destination; -t: preserve modification times
                rsync -vut "$nomecompletobrandnew" "$nomecompletonew"
        done
done </tmp/brandnew.ndx
# Cleaning
sudo umount /mnt/newdirstruct
sudo rmdir /mnt/newdirstruct
rm /tmp/brandnew.ndx
rm /tmp/new.ndx
rm /tmp/old.ndx
rm /tmp/newdirstruct_freshbackup.iso
 
  

