LinuxQuestions.org
Old 07-15-2012, 05:22 AM   #1
compused
Member
 
Registered: Oct 2006
Location: Melbourne Australia
Distribution: centos and redhat 8
Posts: 91

Rep: Reputation: 15
cp erroneously 'will not overwrite' because of duplicate filenames


I am trying to restore the time and date stamps on files that lost them a year or so ago in a server migration. My plan is to select the files created since the migration and copy them into (a copy of) the original files, which will then become the working directory:
Code:
find /share/_docs/* -newer ./DateFile -print0 | xargs -0 cp -Rp -t /share/_docs2
where ./DateFile carries the timestamp of the server migration.

A problem arises because of duplicate filenames. Even though the duplicates sit in different directories, cp stops with the error message:
Code:
cp: will not overwrite just-created (filename) with (same filename but in different directory)
I have found this reference, which I can't really follow:
http://stackoverflow.com/questions/2...lder-structure

Could anyone help?
Compused

Last edited by compused; 07-15-2012 at 05:27 AM.
 
Old 07-16-2012, 12:39 AM   #2
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
Your find command is matching on both files and directories. When it encounters a directory, it passes that name to cp, which replicates that subtree, including both new and old files, in the target directory, /share/_docs2. But then, find itself recurses into that directory, finds any new files (and subdirectories again!), and passes those names to cp, which will copy them directly into /share/_docs2. If in _docs you have this:
Code:
dir1/
     file1
     subdir1a/
              file1a
and all of the files and directories satisfy the "-newer" test, the command line produced by xargs will be:
Code:
cp -Rp -t /share/_docs2 \
    /share/_docs/dir1 /share/_docs/dir1/file1 /share/_docs/dir1/subdir1a /share/_docs/dir1/subdir1a/file1a
and your result in /share/_docs2 will be this:
Code:
dir1/
     file1
     subdir1a/
              file1a
file1
subdir1a/
         file1a
file1a
The whole tree gets replicated in the target, but every element of that tree gets copied again directly to the target directory, regardless of its original place in the tree. Any duplicate names will cause a conflict, but that is really secondary to the whole operation being fundamentally flawed.
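For anyone who wants to watch this happen without touching real data, here is a minimal sketch that rebuilds the situation in a scratch directory. The names `work`, `src`, `dst`, `dir1` and `dir2` are made up for the illustration, and it assumes GNU cp, find and xargs (for `-t`, `-print0` and `-0`).

```shell
# Scratch tree in a temporary directory; nothing under /share is touched.
work=$(mktemp -d)
mkdir -p "$work/src/dir1" "$work/src/dir2" "$work/dst"
touch -t 201201010000 "$work/DateFile"                 # stand-in for the migration date
touch "$work/src/dir1/file1" "$work/src/dir2/file1"    # same name, different directories

# Same shape as the original command. cp exits non-zero on the conflict,
# so its message is captured in a file and the failure is tolerated.
find "$work"/src/* -newer "$work/DateFile" -print0 |
    LC_ALL=C xargs -0 cp -Rp -t "$work/dst" 2>"$work/err.txt" || true

cat "$work/err.txt"        # the complaint from cp
find "$work/dst" | sort    # what actually landed in the target
```

The final listing should contain file1 three times: once inside each of dir1 and dir2, and once more at the top level of the target, which is where the second file1 collided with the first.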

A quick solution to this eludes me at the moment. Anyone???
 
Old 07-16-2012, 07:58 AM   #3
compused
Member
 
Registered: Oct 2006
Location: Melbourne Australia
Distribution: centos and redhat 8
Posts: 91

Original Poster
Rep: Reputation: 15
Thanks rk, you have explained why the output from that command is such a mess. I have exhausted my efforts with this approach. If I knew more about Perl, perhaps that would be an option.
Compfused
 
Old 07-16-2012, 02:11 PM   #4
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
I believe this will do the trick:
Code:
find /share/_docs -newer ./DateFile -printf '%P\0' | rsync -a --from0 --files-from=- /share/_docs /share/_docs2
Note that in rsync the "--files-from" option implies:
  • --relative (preserves the path information for each item in the file)
  • --dirs (creates directories specified in the list rather than noisily skipping them)
  • Changes the "-a" (--archive) option so that it does not imply recursion.
The result is that find sends null-terminated names without the leading "/share/_docs" component in a list that rsync reads on stdin ("--files-from=-"). rsync then looks for these names in /share/_docs and copies what it finds, preserving the path read from find, to the destination, all without doing any recursion itself.
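A cautious way to try this out first is on a scratch tree. The sketch below is only an illustration under assumptions: the directory and file names are invented, GNU find is assumed for `-printf` and `-mindepth`, and the rsync step is skipped if rsync is not installed. The `-mindepth 1` is an addition in this sketch; it leaves out the starting directory itself, for which `%P` would print an empty name.

```shell
# Scratch tree in a temporary directory (all names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/src/dir1/subdir1a" "$work/dst"
touch -t 201101010000 "$work/src/dir1/old.txt"    # older than the reference file
touch -t 201201010000 "$work/DateFile"            # stand-in for the migration date
touch "$work/src/dir1/subdir1a/new.txt"           # newer than the reference file

# Preview the names rsync would receive, one per line instead of null-terminated.
find "$work/src" -mindepth 1 -newer "$work/DateFile" -printf '%P\n' |
    sort > "$work/list.txt"
cat "$work/list.txt"

# The same pipeline as above, pointed at the scratch tree.
if command -v rsync >/dev/null 2>&1; then
    find "$work/src" -mindepth 1 -newer "$work/DateFile" -printf '%P\0' |
        rsync -a --from0 --files-from=- "$work/src" "$work/dst"
    find "$work/dst" | sort
fi
```

On the real tree, adding rsync's `-n` (`--dry-run`) together with `-v` lists what would be transferred without copying anything.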

Last edited by rknichols; 07-16-2012 at 02:13 PM.
 
Old 07-17-2012, 08:57 AM   #5
compused
Member
 
Registered: Oct 2006
Location: Melbourne Australia
Distribution: centos and redhat 8
Posts: 91

Original Poster
Rep: Reputation: 15
Hi rk....thanks very much for working on this....

First 3 lines of output are:
Code:
find: unrecognized: -printf
BusyBox v1.10.3 (2010-05-17 05:57:25 UTC) multi-call binary

Usage: find [PATH...] [EXPRESSION]
It is not recognising '-printf'.

Any ideas?
 
Old 07-17-2012, 09:46 AM   #6
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
How lovely -- a stripped-down version of find that lacks the "-printf" action. OK, let's use another way to keep find from including the "/share/_docs" part of the paths in its output:
Code:
cd /share/_docs && find . -newer /full/path/to/DateFile -print0 | rsync -a --from0 --files-from=- . /share/_docs2
You'll have to adjust that "/full/path/to/DateFile", of course.
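To see what the `cd` changes, the sketch below prints both lists for a small scratch tree (all names are invented for the illustration). It also shows why the reference file needs a full path: after the `cd`, a relative `./DateFile` would be looked up inside the directory being searched.

```shell
# Scratch tree in a temporary directory (all names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/src/dir1"
touch -t 201201010000 "$work/DateFile"    # stand-in for the migration date
touch "$work/src/dir1/file1"

# Starting from a full path: every name carries that path as a prefix.
find "$work/src" -newer "$work/DateFile" -print | sort > "$work/absolute.txt"

# Starting from "." after a cd: names are relative to the source directory.
# The subshell keeps the cd from affecting the rest of the script.
(cd "$work/src" && find . -newer "$work/DateFile" -print) | sort > "$work/relative.txt"

cat "$work/absolute.txt" "$work/relative.txt"
```

The first list repeats the whole starting path on every name; the second starts each name with `./`, which rsync can resolve against the source directory `.`.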
 
Old 07-18-2012, 10:25 AM   #7
compused
Member
 
Registered: Oct 2006
Location: Melbourne Australia
Distribution: centos and redhat 8
Posts: 91

Original Poster
Rep: Reputation: 15
Hi rk
Thanks again. I should have mentioned that BusyBox was involved.
I think we have a solution!

Initially the script did not work because I had failed to transcribe the first part of it correctly, i.e. I had
Code:
find /share/_docs
instead of
Code:
cd /share/_docs && find .
It shows how much of this I don't fully understand!
Thanks again rk,
Compfused (somewhat less compfused now at least!)
 
  


All times are GMT -5.
