LinuxQuestions.org
Old 09-20-2010, 09:57 PM   #16
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,654


Now I am sure someone has a better one-liner for the find idea, but how about this:
Code:
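CURRENT_LIST=current_$$.tmp
NEW_LIST=new_$$.tmp

# Base names already under BASEDIR, and full paths of the candidates under SOURCEDIR
find "$BASEDIR" -type f -iname '*.jpg' -printf "%f\n" > "$CURRENT_LIST"
find "$SOURCEDIR" -type f -iname '*.jpg' > "$NEW_LIST"

# -F treats each existing name as a fixed string rather than a regex
while IFS= read -r line
do
    mv -- "$line" "$DESTBASE/$DESTDIR"
done < <(grep -v -F -f "$CURRENT_LIST" "$NEW_LIST")

rm -- "$CURRENT_LIST" "$NEW_LIST"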
You may need to adjust the file types (i.e. .jpg) to suit.

Last edited by grail; 09-21-2010 at 12:16 AM.
 
2 members found this post helpful.
Old 09-24-2010, 07:44 AM   #17
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

I wonder. Was this solved already?
 
Old 09-24-2010, 11:08 AM   #18
GazL
Senior Member
 
Registered: May 2008
Posts: 3,480

I don't know, but I thought it was an interesting problem, so I had a go and came up with this somewhat insane pipeline, with not a single loop!

I wouldn't be surprised if there's some sort of subtle error lurking in here, but it seems to work (as long as your filenames don't contain a comma or a newline).

Code:
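FROM='/tmp/from'
TO='/tmp/to'

# Emit "path,name" for both trees and sort on the name. The tr calls swap
# comma/space so that uniq -f1 skips the path and compares only the name;
# names seen once survive, and paths already under $TO are then dropped.
sort -k 2 -t ',' <( find "$TO" -type f -printf '%p,%f\n' ) <( find "$FROM" -type f -printf '%p,%f\n' ) \
   | tr ', ' ' \0' | uniq -u -f1 | tr ' \0' ', ' \
   | cut -f1 -d ',' \
   | grep -v -e "^${TO}/" \
   | tr "\n" "\0" | xargs -0r cp -t "$TO"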
And a demo
Code:
gazl@nix:/tmp$ find $FROM
/tmp/from
/tmp/from/dir1
/tmp/from/dir1/file1.jpg
/tmp/from/dir2
/tmp/from/dir2/space newfile.jpg
/tmp/from/newfile2.jpg
gazl@nix:/tmp$ find $TO
/tmp/to
/tmp/to/dir1
/tmp/to/dir1/file1.jpg
/tmp/to/dir2
/tmp/to/dir2/space oldfile.jpg
gazl@nix:/tmp$ sort -k 2 -t ',' <( find "$TO" -type f -printf '%p,%f\n' ) <( find "$FROM" -type f -printf '%p,%f\n' ) | tr ', ' ' \0'| uniq -u -f1 | tr ' \0' ', ' | cut -f1 -d ',' | grep -v -e "^${TO}" | tr "\n" "\0" | xargs -0r cp -t $TO 
gazl@nix:/tmp$ find $TO
/tmp/to
/tmp/to/dir1
/tmp/to/dir1/file1.jpg
/tmp/to/space newfile.jpg
/tmp/to/dir2
/tmp/to/dir2/space oldfile.jpg
/tmp/to/newfile2.jpg
gazl@nix:/tmp$
 
1 members found this post helpful.
Old 09-24-2010, 11:28 PM   #19
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

Nice code there, GazL. I'd also suggest using a tab, or another character that is unlikely to appear in a filename such as |, as the delimiter, so that filenames containing commas are not parsed incorrectly. I wonder what the real intention is with multilevel directories, though. Will the files be copied with the same directory structure or not?
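
For what it's worth, here is a minimal sketch of the tab idea, not GazL's actual pipeline. It assumes GNU find, and the function name list_new_files and the T/F origin tags are made up for illustration. Each record is name<TAB>origin<TAB>path, sorted on the name, and awk prints a path only when its name occurs exactly once and came from the source tree, which is the same rule uniq -u applies in the pipeline above. Names containing a tab or a newline would still be misparsed.

```shell
# list_new_files FROM TO  (hypothetical helper; assumes GNU find for -printf)
# Prints paths under FROM whose base name occurs exactly once across both trees.
list_new_files() {
    {
        find "$2" -type f -printf '%f\tT\t%p\n'   # T = already in destination
        find "$1" -type f -printf '%f\tF\t%p\n'   # F = candidate from source
    } | LC_ALL=C sort -t "$(printf '\t')" -k1,1 \
      | awk -F '\t' '
            # Print the previous group if it was a single record from FROM.
            function emit() { if (count == 1 && origin == "F") print path }
            $1 != name { emit(); name = $1; count = 0 }
            { count++; origin = $2; path = $3 }
            END { emit() }'
}
```

Something like list_new_files /tmp/from /tmp/to | tr '\n' '\0' | xargs -0r cp -t /tmp/to would then do the copy.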
 
1 members found this post helpful.
Old 09-25-2010, 01:34 AM   #20
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,654

Quote:
Will the files be copied with the same directory structure or not?
My understanding is that they will all be copied to a single destination directory, and the user will then be left to organise them however they like.
 
Old 09-25-2010, 02:35 AM   #21
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

Quote:
Originally Posted by grail View Post
My understanding is that they will all be copied to a single destination directory, and the user will then be left to organise them however they like.
I was thinking that a problem could arise when two different files with the same filename exist in different source directories. It's fine, though, if that is what's really intended.
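
A quick way to check whether that situation actually occurs, as a rough sketch only: list every base name that appears more than once under the source tree. The function name list_name_collisions is invented for this example; it assumes GNU find and filenames without newlines.

```shell
# list_name_collisions DIR  (hypothetical helper; assumes GNU find for -printf)
# Prints each base name that occurs more than once anywhere under DIR.
list_name_collisions() {
    find "$1" -type f -printf '%f\n' | LC_ALL=C sort | uniq -d
}
```

If this prints nothing for the source directory, flattening everything into one destination cannot clobber anything coming from the source side.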
 
Old 09-25-2010, 05:01 AM   #22
GazL
Senior Member
 
Registered: May 2008
Posts: 3,480

My understanding was also that everything was to be dumped to the DESTDIR and hand sorted from that point.


That's a good point about the comma, konsolebox. I agree \t would have been much better, but I had my CSV head on and it didn't even occur to me. I'll remember that in future. Thanks, good tip.

As for the duplicate files with different names idea, you could do something similar with sort/uniq if you included an md5sum of each file in the data to be sorted, and sorted/uniq'd on that instead of the base filename. Of course, that would make the scanning process take much, much longer, as every file would have to be read in full to generate the hashes.
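
A minimal sketch of the checksum idea, under a few assumptions: GNU md5sum and find are available, the function name list_new_content is made up, and awk keeps track of the hashes instead of sort/uniq so that each record can carry a T (destination) or F (source) tag. Names containing a backslash or newline, which md5sum escapes in its output, are not handled.

```shell
# list_new_content FROM TO  (hypothetical helper; assumes GNU md5sum)
# Prints paths under FROM whose content hash is not found under TO.
# Identical content appearing twice under FROM is listed only once.
list_new_content() {
    {
        find "$2" -type f -exec md5sum {} + | sed 's/^/T /'
        find "$1" -type f -exec md5sum {} + | sed 's/^/F /'
    } | awk '
        # Layout: tag, space, 32 hex digits, two separator chars, then the path.
        { hash = $2; path = substr($0, 37) }
        $1 == "T" { seen[hash] = 1; next }
        !(hash in seen) { seen[hash] = 1; print path }'
}
```

Every file in both trees is read in full here, so the cost concern mentioned above applies.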
 
Old 09-26-2010, 02:33 AM   #23
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

Hey GazL. It's always fun to make suggestions when I can.
Quote:
Originally Posted by GazL View Post
As for the duplicate files with different names idea, you could do something similar with sort/uniq if you included an md5sum of each file in the data to be sorted, and sorted/uniq'd on that instead of the base filename. Of course, that would make the scanning process take much, much longer, as every file would have to be read in full to generate the hashes.
Or I think it would be simpler to put the files into destination directories named the same as the source directories they were in. There's always a solution to that, but I only want to write it if it is really required. This is why I was asking whether the thread was already solved or not.
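
Just to illustrate what that could look like, and not as a finished solution: the sketch below copies one file to the same relative location under the destination. The function name copy_keeping_dirs is hypothetical, and it assumes the FROM argument is given without a trailing slash and is a prefix of the file's path.

```shell
# copy_keeping_dirs FROM TO PATH  (hypothetical helper)
# Copies PATH, a file somewhere under FROM, to the same relative place under TO.
copy_keeping_dirs() {
    rel=${3#"$1"/}                            # strip the FROM prefix
    mkdir -p -- "$2/$(dirname -- "$rel")" &&  # recreate the sub-directory
        cp -- "$3" "$2/$rel"
}
```

For example, copy_keeping_dirs /tmp/from /tmp/to "/tmp/from/dir2/space newfile.jpg" would create /tmp/to/dir2 if needed and copy the file there.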
 
Old 09-26-2010, 12:46 PM   #24
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,395
Blog Entries: 2

The problem seems to involve recursively finding all filenames in $DESTDIR, and an optimal solution would do this exactly once. So if the result of this step was saved in a hash table, it should provide the fastest lookup of all $SOURCE filenames in $DESTDIR. I don't think bash can play much of a role in this, but Perl likes hashes...

--- rod.
 
Old 09-27-2010, 12:40 AM   #25
konsolebox
Senior Member
 
Registered: Oct 2005
Distribution: Gentoo, Slackware, LFS
Posts: 2,245
Blog Entries: 15

Hello theNbomr. Sorry, but I think bash can handle this, and the result could be a lot simpler than a Perl version. Also, if that's what's really intended, bash 4.0 already supports associative arrays (hashes).

It's just that the intended behaviour is still not clear, and it's not good to write code based on uncertain assumptions.
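
Purely as a sketch of the associative-array approach, and assuming the goal really is to flatten everything into one destination directory: the destination tree is scanned exactly once into a hash, and each source file is then looked up by base name. It needs bash 4.0 or later and GNU find; the function name copy_missing is made up for this example.

```shell
#!/usr/bin/env bash
# copy_missing FROM TO  (hypothetical helper; bash 4.0+ and GNU find assumed)
# Copies into TO every file under FROM whose base name is not already
# present anywhere under TO.
copy_missing() {
    local from=$1 to=$2 name path
    local -A seen=()

    # Scan the destination tree once, recording every base name.
    while IFS= read -r -d '' name; do
        seen["$name"]=1
    done < <(find "$to" -type f -printf '%f\0')

    # Walk the source tree and copy files whose base name is not yet known.
    while IFS= read -r -d '' path; do
        name=${path##*/}
        if [[ -z ${seen["$name"]+x} ]]; then
            cp -- "$path" "$to/"
            seen["$name"]=1   # later source files with the same name are skipped
        fi
    done < <(find "$from" -type f -print0)
}
```

Because the records are NUL-separated, commas, spaces and even newlines in filenames are not a problem here.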
 
  


All times are GMT -5. The time now is 11:07 AM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration