LinuxQuestions.org
Old 03-23-2016, 07:57 AM   #1
whlte
LQ Newbie
 
Registered: Mar 2016
Posts: 6

Rep: Reputation: Disabled
Complex copying command


I would much appreciate any help with this.

I have a folder containing about 1700 subfolders, and a CSV file listing the numbers of the subset I am interested in (about 1300). Each folder name contains one of these numbers but is not identical to it. Each of the 1700 folders holds two files, and I need to copy one of them: the one whose name ends in _dc_brain_FA.nii, which the other file does not. I would like to collect these 1300 or so _dc_brain_FA.nii* files in a single folder.

Someone suggested that I try

find SOURCEDIR -name "*_dc_brain_FA.nii" | grep -f PATTERNFILE | xargs -d "\n" cp -t TARGETDIR

but when I do that I keep getting "cp: missing file operand".
 
Old 03-23-2016, 08:32 AM   #2
rtmistler
Moderator
 
Registered: Mar 2011
Location: Sutton, MA. USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu
Posts: 4,475
Blog Entries: 10

Rep: Reputation: 1639
Does the first part of the find, "find" all the files you want?
Code:
$ find SOURCEDIR -name "*_dc_brain_FA.nii"
If so, then you can add an -exec argument to the find:
Code:
$ find SOURCEDIR -name "*_dc_brain_FA.nii" -exec cp {} TARGETDIR \;
The -exec action begins at -exec and ends at the escaped semicolon (\;). Between them come cp and TARGETDIR, and {} stands for each file name that find locates, so cp runs once per file.
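For anyone who wants to try the -exec form without touching real data, here is a minimal sketch. It assumes a throwaway tree built under a temporary directory; the folder and file names are invented for illustration, and mktemp -d is assumed to be available.

```shell
# Hypothetical sandbox: every name below is invented for this demo.
work=$(mktemp -d)
mkdir -p "$work/src/subj 001" "$work/src/subj 002" "$work/target"
touch "$work/src/subj 001/001_dti_ecc_dc_brain_FA.nii" \
      "$work/src/subj 001/001_dti_ecc.nii" \
      "$work/src/subj 002/002_dti_ecc_dc_brain_FA.nii" \
      "$work/src/subj 002/002_dti_ecc.nii"

# \; ends the -exec action; cp runs once per match, with {} replaced by its path.
find "$work/src" -type f -name '*_dc_brain_FA.nii' -exec cp {} "$work/target" \;

# Count what arrived in the target folder.
copied=$(ls "$work/target" | wc -l | tr -d ' ')
echo "copied: $copied"
```

Only the two files ending in _dc_brain_FA.nii should land in the target folder; the other file in each subfolder is left alone. Because find passes each path to cp as a single argument, spaces in folder names do no harm here.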
 
1 members found this post helpful.
Old 03-23-2016, 11:04 AM   #3
whlte
LQ Newbie
 
Registered: Mar 2016
Posts: 6

Original Poster
Rep: Reputation: Disabled
Dear rtmistler,

Thank you for your prompt reply. That command actually did not find the files I was looking for.

However, I have simplified what I need to do now by secure copying all the "_dc_brain_FA.nii" into one folder. However, I still need to take out my subset of 1300 from the 1700 files. Can I get instruction on how to do this? I have the CSV file with my subset in a column of numbers, and the files are labelled accordingly.
 
Old 03-23-2016, 11:06 AM   #4
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 314

Rep: Reputation: 122
If you intend SOURCEDIR, PATTERNFILE, and TARGETDIR to be variables, the correct syntax to refer to them (to "read them out") is $SOURCEDIR, $PATTERNFILE, and $TARGETDIR.

It is good practice to put double quotation marks around each of them too, in case a variable's value contains whitespace or certain other special characters.

Example:
Code:
$ SOURCEDIR="/home/beryllos/my top secret files"


$ cd SOURCEDIR # Here we forget the $
bash: cd: SOURCEDIR: No such file or directory


$ cd $SOURCEDIR # Here we forget the quotation marks
bash: cd: /home/beryllos/my: No such file or directory


$ cd "$SOURCEDIR" # This command works as expected.
$ pwd
/home/beryllos/my top secret files
$
 
2 members found this post helpful.
Old 03-23-2016, 11:36 AM   #5
whlte
LQ Newbie
 
Registered: Mar 2016
Posts: 6

Original Poster
Rep: Reputation: Disabled
Beryllos,

Thanks. I am now able to find the files of interest, but still get the cp error. For some reason the quotation marks helped but not the $ though
 
Old 03-23-2016, 11:36 AM   #6
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 314

Rep: Reputation: 122
Quote:
Originally Posted by whlte View Post
Dear rtmistler,

Thank you for your prompt reply. That command actually did not find the files I was looking for.

However, I have simplified what I need to do now by secure copying all the "_dc_brain_FA.nii" into one folder. However, I still need to take out my subset of 1300 from the 1700 files. Can I get instruction on how to do this? I have the CSV file with my subset in a column of numbers, and the files are labelled accordingly.
Your original command might work if you modified it like so:
Code:
find "$SOURCEDIR" -name "*_dc_brain_FA.nii" | grep -f "$PATTERNFILE" | xargs -d "\n" cp -t "$TARGETDIR"
but now that you have copied all the files to one folder, you could use ls and pipe the filenames to grep.

For example, if the folder contains only the "*_dc_brain_FA.nii" and no other files or folders, you could use:
Code:
cd folder_containing_the_files
ls | grep -f "$PATTERNFILE" | xargs -d "\n" cp -t "$TARGETDIR"
If the folder contains other files or folders, use ls with an argument to select the files you need:

Code:
cd folder_containing_the_files
ls *_dc_brain_FA.nii | grep -f "$PATTERNFILE" | xargs -d "\n" cp -t "$TARGETDIR"
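A minimal sketch of the same pipeline on invented data may help show what each stage does. The IDs, file names and folders below are hypothetical, and -d, -r and cp -t assume the GNU versions of xargs and cp. The -r option is worth noting: without it, GNU xargs runs cp once even when grep passes nothing through, and cp then has a target but no source files, which is one way to end up with "cp: missing file operand".

```shell
# Hypothetical sandbox: IDs and file names are invented for this demo.
work=$(mktemp -d)
mkdir "$work/all" "$work/subset"
touch "$work/all/000111_dti_ecc_dc_brain_FA.nii" \
      "$work/all/000222_dti_ecc_dc_brain_FA.nii" \
      "$work/all/000333_dti_ecc_dc_brain_FA.nii"
printf '%s\n' 000111 000333 > "$work/ids.txt"   # one pattern per line

cd "$work/all" || exit 1
# -F treats each line of ids.txt as a fixed string rather than a regex.
# -r (GNU xargs) skips cp entirely when no file names arrive on stdin.
ls | grep -F -f "$work/ids.txt" | xargs -r -d '\n' cp -t "$work/subset"

n=$(ls "$work/subset" | wc -l | tr -d ' ')
echo "files in subset: $n"
```

With these made-up names, only the 000111 and 000333 files should be copied. Note that grep matches an ID anywhere in the file name, so a short ID could also match inside a longer one.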
 
Old 03-23-2016, 11:41 AM   #7
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 314

Rep: Reputation: 122
Quote:
Originally Posted by whlte View Post
Beryllos,

Thanks. I am now able to find the files of interest, but still get the cp error. For some reason the quotation marks helped but not the $ though
Could you show us the command line you used and its output (or a reasonable selection if it is quite lengthy)?

In your reply, please also enclose commands and output within [CODE] and [/CODE] tags if possible.

Last edited by Beryllos; 03-23-2016 at 12:03 PM.
 
Old 03-23-2016, 12:10 PM   #8
whlte
LQ Newbie
 
Registered: Mar 2016
Posts: 6

Original Poster
Rep: Reputation: Disabled
000024118708_dti_ecc_dc_brain_FA.nii 000048054415_dti_ecc_dc_brain_FA.nii 000074277356_dti_ecc_dc_brain_FA.nii 000099514690_dti_ecc_dc_brain_FA.nii
000024161683_dti_ecc_dc_brain_FA.nii 000048109235_dti_ecc_dc_brain_FA.nii 000074304888_dti_ecc_dc_brain_FA.nii 000099529479_dti_ecc_dc_brain_FA.nii
000024346109_dti_ecc_dc_brain_FA.nii 000048135131_dti_ecc_dc_brain_FA.nii 000074327708_dti_ecc_dc_brain_FA.nii 000099550415_dti_ecc_dc_brain_FA.nii
000024436422_dti_ecc_dc_brain_FA.nii 000048201226_dti_ecc_dc_brain_FA.nii 000074384224_dti_ecc_dc_brain_FA.nii 000099604669_dti_ecc_dc_brain_FA.nii
000024447518_dti_ecc_dc_brain_FA.nii 000048284598_dti_ecc_dc_brain_FA.nii 000074452500_dti_ecc_dc_brain_FA.nii 000099616225_dti_ecc_dc_brain_FA.nii
000024452285_dti_ecc_dc_brain_FA.nii 000048310052_dti_ecc_dc_brain_FA.nii 000074586509_dti_ecc_dc_brain_FA.nii 000099875982_dti_ecc_dc_brain_FA.nii
000024477110_dti_ecc_dc_brain_FA.nii 000048386902_dti_ecc_dc_brain_FA.nii 000074629155_dti_ecc_dc_brain_FA.nii 000099888850_dti_ecc_dc_brain_FA.nii
000024495953_dti_ecc_dc_brain_FA.nii 000048473246_dti_ecc_dc_brain_FA.nii 000074709697_dti_ecc_dc_brain_FA.nii 000099930021_dti_ecc_dc_brain_FA.nii
000024518088_dti_ecc_dc_brain_FA.nii 000048550187_dti_ecc_dc_brain_FA.nii 000074731388_dti_ecc_dc_brain_FA.nii 000099954902_dti_ecc_dc_brain_FA.nii
[kongth@lonsdale01 dcfiles]$ cd ..
[kongth@lonsdale01 HPC_16_00917]$ pwd
/projects/pi-bokdea/HPC_16_00917
[kongth@lonsdale01 HPC_16_00917]$ ls
data dcfiles fulldata fulldata.csv important.csv scripts template
[kongth@lonsdale01 HPC_16_00917]$ cd dcfiles
[kongth@lonsdale01 dcfiles]$ ls | grep -f "$/projects/pi-bokdea/HPC_16_00917/fulldata.csv" | xargs -d "\n" cp -t "$/projects/pi-bokdea/HPC_16_00917/fulldata"
grep: $/projects/pi-bokdea/HPC_16_00917/fulldata.csv: No such file or directory
cp: accessing `$/projects/pi-bokdea/HPC_16_00917/fulldata': No such file or directory
 
Old 03-23-2016, 01:01 PM   #9
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 314

Rep: Reputation: 122
Okay. I misunderstood the command in your original post. Now I understand that SOURCEDIR, PATTERNFILE, and TARGETDIR were not variables, but only your personal shorthand (like pseudo-code) for the actual folders and files.

In that case, you do not need the "$" prefix and it actually creates additional errors.

Also, for these folder and file names, you do not need quotation marks, though they do no harm.

Therefore try the following command sequence:
Code:
cd dcfiles
ls | grep -f "/projects/pi-bokdea/HPC_16_00917/fulldata.csv" | xargs -d "\n" cp -t "/projects/pi-bokdea/HPC_16_00917/fulldata"

If that doesn't work, you could break it apart to see where it is failing. For example:
Code:
# Go to the folder:
cd dcfiles


# See that the files are correctly listed:
ls


# Make sure the grep file exists and has the correct content:
cat /projects/pi-bokdea/HPC_16_00917/fulldata.csv


# Inspect the output of grep:
ls | grep -f "/projects/pi-bokdea/HPC_16_00917/fulldata.csv"


# Inspect the output of xargs:
ls | grep -f "/projects/pi-bokdea/HPC_16_00917/fulldata.csv" | xargs -d "\n" echo
If all of that works, but the whole command fails, then we have to do more troubleshooting.
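One caveat on the cat step: cat prints a file's contents to the terminal, where invisible characters usually cannot be seen. As a minimal sketch, assuming a made-up pattern file, od -c prints every byte by name and so reveals anything hidden at the ends of lines.

```shell
# Hypothetical pattern file: the content is invented for this demo.
work=$(mktemp -d)
printf '002\r\n003\r\n' > "$work/patterns.csv"   # written with CRLF line endings on purpose

# od -c shows each byte; a carriage return appears as \r just before \n.
dump=$(head -n 1 "$work/patterns.csv" | od -c)
echo "$dump"
```

If the dump shows only \n at the end of the line, the file has plain Unix line endings.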
 
Old 03-23-2016, 01:15 PM   #10
whlte
LQ Newbie
 
Registered: Mar 2016
Posts: 6

Original Poster
Rep: Reputation: Disabled
Still did not work.

# Go to the folder:
cd dcfiles


# See that the files are correctly listed:
ls
This was fine


# Make sure the grep file exists and has the correct content:
cat /projects/pi-bokdea/HPC_16_00917/fulldata.csv

This was fine. it just showed a vertical column of the numbers I am interested in and match parts of the file names(they do not match the whole file name)


# Inspect the output of grep:
ls | grep -f "/projects/pi-bokdea/HPC_16_00917/fulldata.csv"

no output


# Inspect the output of xargs:
ls | grep -f "/projects/pi-bokdea/HPC_16_00917/fulldata.csv" | xargs -d "\n" echo

No output
 
Old 03-23-2016, 03:14 PM   #11
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 314

Rep: Reputation: 122
Now we're getting somewhere! Is the csv file from Windows? Windows text files end each line with an extra character (a "carriage return") before the newline, and may also have an extra character at the end of the file. grep -f treats that carriage return as part of every pattern, so a pattern file in the Windows format (also known as "DOS" format) matches nothing here.

One way to diagnose the issue is with the file command. In the following example, I examine a grepfile made by Linux, and its Windows counterpart.

The files look the same by cat:
Code:
$ cat grepfile
002
003
004
$ cat grepfile.win
002
003
004
The difference is detected by file:
Code:
$ file gr*
grepfile:     ASCII text
grepfile.win: ASCII text, with CRLF line terminators
Only one of the files works with grep:
Code:
$ ls | grep -f grepfile.win
$ ls | grep -f grepfile
algebra page 002.odt
algebra page 003.odt
algebra page 004.odt
You can fix it with the following command, replacing my two file names with your own:
Code:
$ tr -d '\15\32' < grepfile.win > grepfile.new
The tr command shown above deletes every carriage return (octal 15) and every Ctrl-Z end-of-file marker (octal 32) that some DOS/Windows programs append.

Now it works:
Code:
$ file grepfile.new
grepfile.new: ASCII text
$ ls | grep -f grepfile.new
algebra page 002.odt
algebra page 003.odt
algebra page 004.odt
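The effect can be reproduced end to end with a minimal sketch on invented data. The file names and IDs below are hypothetical; the point is only that the same list matches nothing with CRLF endings and matches normally once the carriage returns are gone.

```shell
# Hypothetical sandbox: file names and IDs are invented for this demo.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir files
touch files/000111_FA.nii files/000222_FA.nii

printf '000111\r\n000222\r\n' > ids.win      # CRLF endings, as Windows writes them
tr -d '\r' < ids.win > ids.unix              # strip every carriage return

# With CRLF, each pattern is really "000111<CR>", which no file name contains.
before=$(ls files | grep -c -f ids.win)
after=$(ls files | grep -c -f ids.unix)
echo "matches before: $before, after: $after"

# Count the lines of the cleaned file that still contain a carriage return.
cr_char=$(printf '\r')
cr_lines=$(grep -c "$cr_char" ids.unix)
echo "lines still containing CR: $cr_lines"
```

The last check is a substitute for the file command on systems where file is not installed.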
 
1 members found this post helpful.
Old 03-23-2016, 03:19 PM   #12
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 314

Rep: Reputation: 122
By the way, there are other methods to convert DOS/Windows text files to Linux. Some text editors may be able to perform the conversion. I use gedit (in Linux), and it has an option to change the line ending in the "Save As" dialog.
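As one example of another method, here is a minimal sketch using sed to convert a file in place. It assumes GNU sed (for -i and the \r escape), and the file name and content are invented. Where it is installed, the dos2unix utility does the same job.

```shell
# Hypothetical file: name and content are invented for this demo.
work=$(mktemp -d)
list="$work/ids.csv"
printf '000111\r\n000222\r\n' > "$list"      # 16 bytes with CRLF endings

# Remove a carriage return only where it ends a line (GNU sed understands \r).
sed -i 's/\r$//' "$list"

bytes=$(wc -c < "$list" | tr -d ' ')
echo "size after conversion: $bytes bytes"
```

Unlike tr -d, this leaves alone any carriage return that is not at the end of a line, and it edits the file directly, so keep a copy of the original if you may need it.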

Last edited by Beryllos; 03-23-2016 at 03:21 PM.
 
Old 03-23-2016, 03:21 PM   #13
rtmistler
Moderator
 
Registered: Mar 2011
Location: Sutton, MA. USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu
Posts: 4,475
Blog Entries: 10

Rep: Reputation: 1639
I'm just wondering if there's an easier way: pipe the contents of the CSV file into find one entry at a time, and use the -exec argument to cp or mv just the file found for that entry. That would mean running one find command per entry in the CSV file. I'm just not well versed in that form of input. Don't you do something like this?
Code:
cat<<filename.csv | find . -name (unsure what goes here) -exec cp {} target-dir \;
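One way this idea could be written is a while read loop that runs find once per line of the list. This is only a sketch under assumptions: the folder names, IDs and file names are invented, and the list is assumed to hold one ID per line with no header row.

```shell
# Hypothetical sandbox: folder names, IDs and file names are invented.
work=$(mktemp -d)
mkdir -p "$work/src/a" "$work/src/b" "$work/target"
touch "$work/src/a/000111_dti_ecc_dc_brain_FA.nii" \
      "$work/src/b/000222_dti_ecc_dc_brain_FA.nii"
printf '000111\r\n' > "$work/ids.csv"        # CRLF on purpose, like a Windows export

# Read one ID per line; redirecting the file into the loop replaces cat and a pipe.
while IFS= read -r id; do
    id=$(printf '%s' "$id" | tr -d '\r')     # drop a carriage return, if any
    [ -n "$id" ] || continue                 # skip blank lines
    find "$work/src" -type f -name "*${id}*_dc_brain_FA.nii" \
         -exec cp {} "$work/target" \;
done < "$work/ids.csv"

found=$(ls "$work/target")
echo "$found"
```

The trade-off is speed: this walks the whole source tree once per ID, so with about 1300 IDs it does far more work than a single find or ls piped through grep -f.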
 
Old 03-24-2016, 11:52 AM   #14
whlte
LQ Newbie
 
Registered: Mar 2016
Posts: 6

Original Poster
Rep: Reputation: Disabled
Grand. It was from Windows. After I followed your instructions and created fulldata.new, the original command

Quote:
[kongth@lonsdale01 dcfiles]$ ls | grep -f "/projects/pi-bokdea/HPC_16_00917/fulldata.new" | xargs -d "\n" cp -t "/projects/pi-bokdea/HPC_16_00917/fulldata"
worked.

I counted the files in the target folder, and the total matched the number of entries in the list. I also spot-checked a few, so it should be correct.

Thanks so much for your help guys... though I doubt it will be the last time I am here.
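For readers who want a stricter check than comparing counts, a minimal sketch along these lines lists every ID that has no matching file. The folder, IDs and file names are invented, and the list is assumed to be a cleaned one-ID-per-line file.

```shell
# Hypothetical sandbox: IDs and file names are invented for this demo.
work=$(mktemp -d)
mkdir "$work/subset"
touch "$work/subset/000111_dti_ecc_dc_brain_FA.nii"
printf '%s\n' 000111 000333 > "$work/ids.txt"

missing=""
while IFS= read -r id; do
    # grep -q prints nothing; its exit status says whether the ID occurs in any name.
    if ! ls "$work/subset" | grep -q -F -- "$id"; then
        missing="$missing $id"
    fi
done < "$work/ids.txt"
echo "IDs without a file:$missing"
```

Equal counts can hide a mismatch, for example when one ID matches two files and another matches none, which is why a per-ID check is sketched here.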
 
Old 03-24-2016, 01:30 PM   #15
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 314

Rep: Reputation: 122
Glad to hear it is working.
 
  


All times are GMT -5.
