Linux - Newbie: This Linux forum is for members that are new to Linux.
Just starting out and have a question?
If it is not in the man pages or the how-to's this is the place!
I need to run "cat" on the R1 and R2 files of each sample, so I created a file named "samples.txt" where each line is the name of one sample
ex.
AE001
AE134
.
.
.
(for my 10 samples)
and I tried to run the following script:
for sample in `cat samples.txt`;
do
cat $sample*R1*fastq.gz > $sample_L001-4_R1_001.fastq.gz | mv $sample_L001-4_R1_001.fastq.gz ../1.FASTQC_1/
cat $sample*R2*fastq.gz > $sample_L001-4_R2_001.fastq.gz | mv $sample_L001-4_R2_001.fastq.gz ../1.FASTQC_1/;
done
I would expect to get 2 files (*_L001-4_R1_001.fastq.gz and *_L001-4_R2_001.fastq.gz) for each sample, but it only generates 2 files for all 10 samples, named as follows:
-4_R1_001.fastq.gz and -4_R2_001.fastq.gz
I would appreciate your help, as I am new to this and it's too early to go crazy already.
I'm not sure I entirely understand how you've structured your data, but I suspect that (if you are using BASH), as far as the script is concerned, you may want something more like this:
Code:
#!/bin/bash
for sample in `cat samples.txt`
do
cat ${sample}*R1*.fastq.gz > ${sample}_L001-4_R1_001.fastq.gz | mv ${sample}_L001-4_R1_001.fastq.gz ../1.FASTQC_1/
cat ${sample}*R2*.fastq.gz > ${sample}_L001-4_R2_001.fastq.gz | mv ${sample}_L001-4_R2_001.fastq.gz ../1.FASTQC_1/
done
In which case I'm guessing an entry in samples.txt would look something like:
Code:
AE001_S01
But the main point is to delimit the variable name within the script, to avoid confusion.
Thank you very much @rigor and @pan64. I think that works. I didn't understand the "instead of |" part, or why I need to use {}. I have written other loops in bash and the {} were not necessary.
1. Re '|'
In your case, you are creating a new file and then want to move it, hence separate the cmds with ';'.
'|' is for 'piping' data from the output (o/p) of one cmd to the input (i/p) of another.
2. Re {}
There are some slightly arcane rules involved in how bash 'finds' embedded variable names. The safest way to avoid confusing it is to always surround the varname with {} if it is embedded in (including at the start of) another string.
For consistency, and to make it stand out for yourself later and for other devs, I usually also use {} even if the varname appears at the end of a string.
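To make the ';' versus '|' point concrete, here is a minimal sketch. The file names (part1.txt, part2.txt, combined.txt) and the dest directory are made up for illustration, and everything happens in a temporary directory, so treat it as a demonstration rather than the OP's actual layout.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir dest
printf 'read data\n' > part1.txt   # hypothetical input files
printf 'more data\n' > part2.txt

# Create the combined file first; ';' runs mv only after cat has finished.
cat part1.txt part2.txt > combined.txt ; mv combined.txt dest/

# A real use of '|': the output of cat becomes the input of wc.
line_count=$(cat dest/combined.txt | wc -l)
echo "lines in combined file: $line_count"
```

The first command line only sequences two independent commands, while the second actually passes data from one command to the next, which is the only situation where '|' is needed.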
The problem is that the underscore character is a legal character in a name, and the shell will take all characters up to the first "impossible" character as part of the name. So, "$sample_L001-4_R1_001.fastq.gz" is parsed as "${sample_L001}-4_R1_001.fastq.gz", and you don't have a variable named "sample_L001". If you don't want the "_L001" to be treated as part of the variable name, then you need to write that as "${sample}_L001-4_R1_001.fastq.gz".
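A quick way to see this for yourself is the small sketch below. It assumes a sample name of AE001 (taken from the example list in the question) and that no variable called sample_L001 exists in your shell.

```shell
#!/bin/bash
# Sample name assumed for illustration, taken from the question's list.
sample="AE001"

# Without braces, bash reads the name as "sample_L001", which is unset,
# so it expands to an empty string and only the rest survives.
unbraced="$sample_L001-4_R1_001.fastq.gz"

# With braces, the name ends at "}" and the suffix is kept literally.
braced="${sample}_L001-4_R1_001.fastq.gz"

echo "unbraced: $unbraced"
echo "braced:   $braced"
```

The unbraced form comes out as -4_R1_001.fastq.gz, which is the file name reported in the original question, while the braced form gives AE001_L001-4_R1_001.fastq.gz.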
That's exactly what was happening to me. "_L001" was being treated as part of the variable name. Thank you for the tips, they are very useful!!
HTH
Ohh ok. Thank you, I understand. Thank you very much for the help, it was really useful!
Best regards. I really appreciate the help. See you at the next problem!!
Whenever someone asks a question in the "newbie" forum, I am torn between correcting every little detail that might be wrong and focusing only on what needs to change to get something working.
It might be fair to say there are about three "major" forms of "connecting commands" ( in BASH ) that come to mind, those being the following:
Code:
command_one | command_two
Code:
command_one || command_two
Code:
command_one && command_two
As has effectively already been mentioned, the first form takes the so-called "standard" output from the first command "command_one" and delivers it as the so-called "standard" input for "command_two". The other two forms execute the second command "command_two" conditionally, based on the success or failure of "command_one": with "&&" it runs only if "command_one" succeeds, and with "||" only if "command_one" fails. I thought perhaps you had read about the form involving the "||" operator, and merely neglected to include the second "|". Since in this case I believe it just happens not to cause a problem, I didn't mention the distinction. However, for all the single "|" accomplishes in this particular case, you might as well replace the "|" character with the ";" character.
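For anyone who wants to try the three forms, a small sketch follows. It uses only the standard commands true, false, printf, sort and head as stand-ins for "command_one" and "command_two"; the variable names are just for illustration.

```shell
#!/bin/bash
# '|' : stdout of the left command becomes stdin of the right command.
piped=$(printf 'b\na\n' | sort | head -n 1)

# '&&' : the right command runs only if the left one succeeds.
and_result=$(true && echo "ran")

# '||' : the right command runs only if the left one fails.
or_result=$(false || echo "ran")

# Here the left side succeeds, so the right side is skipped.
skipped=$(true || echo "ran")

echo "pipe: $piped, and: $and_result, or: $or_result, skipped: '$skipped'"
```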
almost...
| is the pipe char, which means stdin redirection, as you wrote, but ; is just a simple command delimiter: nothing is redirected and no condition is evaluated.
pan64,
Perhaps you could enlighten me as to exactly what I wrote, which made you feel that I asserted anything different, from the distinction which you made. I did not suggest that a pipe character is a conditional operator. I merely indicated that I thought using a pipe when there was nothing to pipe, would not cause any problems, in a situation such as this. I actually tested that to be sure that my thought was accurate.
Or as Monty Python could have said, Ecce nullum data, ergo pipe. :-)
In any event, since I now see the original problem has been marked as solved, let's not confuse the issue.
However, for all the single "|" accomplishes in this particular case, you might as well replace the "|" character with the ";" character.
The best I can say is: using a pipe in this case may cause unpredictable results, and if it occasionally does what you wish, that is just a side effect of this case not being handled properly.
Code:
cat file1 > file2
# will not and cannot send anything to stdout or pipe
# the command mv cannot accept anything from stdin (or pipe)
cat file1 > file2 | mv file2 dir
# you ought to use instead:
cat file1 > dir/file2
# or even better:
cp file1 dir/file2
In other words: instead of "you might replace", it would be better to write "you must replace".
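Putting the pieces of this thread together, the loop could be sketched roughly as below. This is only an illustration under assumptions: the lane-style file names (AE001_L001_R1_001.fastq.gz and so on), the raw directory and the tiny dummy gzip files are invented so the example can run on its own, and it assumes gzip is installed. Only the loop at the end is the part that would carry over to real data.

```shell
#!/bin/bash
# Throwaway layout imitating the question; every name here is made up.
workdir=$(mktemp -d)
mkdir "$workdir/raw" "$workdir/1.FASTQC_1"
cd "$workdir/raw" || exit 1
printf 'AE001\nAE134\n' > samples.txt
for s in AE001 AE134; do
    for lane in L001 L002; do
        for r in R1 R2; do
            echo "${s} ${lane} ${r}" | gzip > "${s}_${lane}_${r}_001.fastq.gz"
        done
    done
done

# Read one sample name per line and write the merged file straight
# into the destination directory, so no separate mv is needed.
while IFS= read -r sample; do
    [ -n "$sample" ] || continue      # skip empty lines
    for r in R1 R2; do
        cat "${sample}"_*_"${r}"_*.fastq.gz \
            > "../1.FASTQC_1/${sample}_L001-4_${r}_001.fastq.gz"
    done
done < samples.txt

ls ../1.FASTQC_1
```

Writing directly into ../1.FASTQC_1/ avoids both the pipe and the mv, and the braces plus quotes keep the variable name and the file names unambiguous.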
I would suggest that since the particular problem for which this thread was started is considered solved by the OP, the issue that pan64 raises should be discussed in a separate thread which I started here: Does BASH handle piping properly when there is no data to pipe?