LinuxQuestions.org
Old 07-31-2015, 02:43 PM   #1
pinkrain1010
LQ Newbie
 
Registered: Apr 2015
Posts: 7

Rep: Reputation: Disabled
Why is pipe not working for directing multiple files?


Hello,

I am somewhat new to Linux, particularly to using it for processing large numbers of files. I am trying to use a one-liner to process a large number of files, but am having issues. Can someone please let me know what is incorrect here, how to get this to work correctly, and why?

.sam.list is a file that lists the names of all the files I am trying to process. I'm running this from the directory that contains .sam.list; the files themselves are in a subdirectory of that directory.

Thank you!

Code:
for i in '.sam.list' ; do java -jar /share/software/picard-tools/1.122/static/AddOrReplaceReadGroups.jar I="$i" O="$i".addedReadGroups.sam RGLB=LaneX RGPU=NONE RGSM=AnySampleName RGPL=Illumina ; done

[Fri Jul 31 11:42:39 PDT 2015] picard.sam.AddOrReplaceReadGroups done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=1058865152
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. Not enough fields; File .sam.list; Line 1
Line: CLCmapfiles/Spinach_GBS_LibA01_Fmapping.sam
        at htsjdk.samtools.SAMLineParser.reportFatalErrorParsingLine(SAMLineParser.java:420)
        at htsjdk.samtools.SAMLineParser.parseLine(SAMLineParser.java:210)
        at htsjdk.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:247)
        at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:235)
        at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:211)
        at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:514)
        at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:488)
        at picard.sam.AddOrReplaceReadGroups.doWork(AddOrReplaceReadGroups.java:107)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:185)
        at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:125)
        at picard.sam.AddOrReplaceReadGroups.main(AddOrReplaceReadGroups.java:74)

Last edited by pinkrain1010; 07-31-2015 at 02:45 PM.
 
Old 07-31-2015, 05:04 PM   #2
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 2,676

Rep: Reputation: Disabled
Your for statement doesn't do what you think it does. Try this and see:
Code:
for i in '.sam.list' ; do echo $i ; done
You'll see that $i simply contains the text ".sam.list", not data from that file.
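The same point can be checked without running Picard at all. This is only a sketch: the list file is a throwaway created with mktemp, and the two names in it are invented for the demo.

```shell
# Throwaway list file; the two names in it are invented for this demo.
list=$(mktemp)
printf '%s\n' 'CLCmapfiles/a.sam' 'CLCmapfiles/b.sam' > "$list"

iterations=0
for i in "$list"; do          # the list holds one word, so the body runs once
  iterations=$((iterations + 1))
  seen=$i                     # $i is the file's name, not a line from the file
done
lines=$(wc -l < "$list")

echo "loop passes: $iterations, lines in file: $((lines))"
rm -f "$list"
```

The loop makes a single pass even though the file has two lines, which is why Picard was handed the list itself as its input file.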

There are numerous ways to parse a file line-by-line, but one that usually works well involves using read in a while loop:
Code:
while read i; do
  java -jar /share/software/picard-tools/1.122/static/AddOrReplaceReadGroups.jar \
    I="$i" \
    O="$i".addedReadGroups.sam \
    RGLB=LaneX \
    RGPU=NONE \
    RGSM=AnySampleName \
    RGPL=Illumina
done < .sam.list
(I split the java command across multiple lines to improve readability.)

The final redirection operator behind the "done" statement is responsible for pulling the contents of ".sam.list" into stdin. The while statement should be read as "while reading a line from stdin into the variable i doesn't result in an end-of-file error, do such-and-such".
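A slightly more defensive variant of the same loop, as a sketch. The IFS= and -r parts are additions not in the loop above: they keep read from trimming leading/trailing whitespace and from treating backslashes as escapes. The extra test after || picks up a last line that has no trailing newline. The list file and the names in it are made up for the demo, and printf stands in for the java command.

```shell
# Throwaway list; names are invented. The last entry deliberately lacks a newline.
list=$(mktemp)
printf '%s\n' 'dir/sample one.sam' 'dir/back\slash.sam' > "$list"
printf '%s' 'dir/last.sam' >> "$list"

count=0
second=
while IFS= read -r i || [ -n "$i" ]; do
  [ -z "$i" ] && continue              # skip blank lines
  count=$((count + 1))
  [ "$count" -eq 2 ] && second=$i      # remember the second name for inspection
  printf 'would process: %s\n' "$i"    # stand-in for the java command
done < "$list"
rm -f "$list"
```

Each line arrives in $i intact, including the space and the backslash, so "$i" can be passed straight to I= and O= as in the loop above.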

You could use a for loop as well, but that's discouraged because it relies on shell expansion. For instance, this:
Code:
for i in $(cat .sam.list); do ....
...will expand the entire contents of ".sam.list" onto the command line before the loop starts, and that can fail rather spectacularly for a multitude of reasons: the shell has to read the whole file into memory first, which can be a problem for a very large list; the result is split on whitespace, so a file name containing a space becomes two separate items; characters such as * or ? are treated as wildcards; and so on and so forth. You should probably just avoid using for in this manner.
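A small sketch of the whitespace problem, assuming a list whose file names contain a space (the names are invented, and the list is a temporary file):

```shell
# Throwaway list with two names, each containing one space.
list=$(mktemp)
printf '%s\n' 'dir/sample one.sam' 'dir/sample two.sam' > "$list"

# Unquoted command substitution: the shell splits the output on whitespace.
for_count=0
for i in $(cat "$list"); do
  for_count=$((for_count + 1))
done

# read takes one whole line per pass.
while_count=0
while IFS= read -r i; do
  while_count=$((while_count + 1))
done < "$list"

echo "for: $for_count items, while read: $while_count lines"
rm -f "$list"
```

The for loop sees four items for two file names, so each java invocation would be given half a path.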
 
Old 08-03-2015, 11:41 AM   #3
pinkrain1010
LQ Newbie
 
Registered: Apr 2015
Posts: 7

Original Poster
Rep: Reputation: Disabled
Thank you very much for the explanation of why my original line was not working as I'd intended. That was quite helpful!

However, when I use either the while or the for loop, I am still running into issues (using a different Java program on the same list). I am getting an invalid argument error.

Code:
[prompt]$ for i in $(cat .bam.list); do
> java -jar /share/software/picard-tools/1.122/static/SortSam.jar \
> I="$i" \
> O="$i".sorted.bam \
> SORT_ORDER=coordinate \
> done < .bam.list
> done
ERROR: Invalid argument 'done'.
The code stalled at the first done and would not proceed until I added the final done, but then it produced the invalid argument error. What do you advise?

Thank you!

Last edited by pinkrain1010; 08-03-2015 at 11:44 AM.
 
Old 08-03-2015, 12:07 PM   #4
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 13,263

Rep: Reputation: 4198
You have now mixed two different solutions:
1.
Code:
for i in $(cat .bam.list); do
#your java command
done
2.
Code:
while read i; do
#your java command
done < .bam.list
 
Old 08-03-2015, 12:11 PM   #5
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 2,676

Rep: Reputation: Disabled
...and you also have one backslash too many, causing a "done" statement to become part of the preceding line:
Quote:
Originally Posted by pinkrain1010
Code:
[prompt]$ for i in $(cat .bam.list); do
> java -jar /share/software/picard-tools/1.122/static/SortSam.jar \
> I="$i" \
> O="$i".sorted.bam \
> SORT_ORDER=coordinate \
> done < .bam.list
> done
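To see the effect in isolation, here is a sketch in which count_args is a hypothetical stand-in for the java command; it just reports how many arguments it was given. The argument values are placeholders.

```shell
# count_args is a made-up helper standing in for the java command.
count_args() { echo "$#"; }

# Backslash after the last real argument: the next line ("done") is glued on
# and handed to the command as one more argument.
with_extra=$(count_args I=in.bam O=out.bam SORT_ORDER=coordinate \
done)

# No backslash after the last argument: the command ends at the newline.
without_extra=$(count_args I=in.bam O=out.bam \
  SORT_ORDER=coordinate)

echo "with trailing backslash: $with_extra arguments"
echo "without: $without_extra arguments"
```

In the quoted loop the first "done" was therefore consumed as an argument to SortSam, which is why the shell kept waiting for another "done" and SortSam then complained about it.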
 
Old 08-03-2015, 12:16 PM   #6
pinkrain1010
LQ Newbie
 
Registered: Apr 2015
Posts: 7

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Ser Olmy
...and you also have one backslash too many, causing a "done" statement to become part of the preceding line:
Thank you!!! That was the problem. Seems to be working perfectly now.
 