Parsing folders, picking files, and concatenating them
Hi there,
I am writing a shell script (.sh) that walks through folders and picks the files whose names contain a specific signature (i.e. part of the file name). The code I have written so far is as follows:
Code:
for i in $(ls); do
folder:SonNot24
31_GCCAAT_L005_R1_001.fastq
31_GCCAAT_L006_R1_001.fastq
33_ACAGTG_L003_R1_001.fastq
33_ACAGTG_L004_R1_001.fastq
35_TGACCA_L001_R1_001.fastq
35_TGACCA_L002_R1_001.fastq
Ultimately I would like to cat all these files into one with the name of the folder (SonNot24). Any help is appreciated.
Thanks.
jahn
Quote:
Code:
cat /home/daniel/Desktop/LQfiles/*m1*.bin >/home/daniel/Desktop/LQfiles/hugefile.bin
Daniel B. Martin
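Applied to the folder in the question, the same glob-and-redirect idea might look like the sketch below. The function name `concat_folder`, the `.fastq` extension on the output, and the choice to write the result inside the folder itself are all assumptions made for illustration, not something stated in the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: concatenate every .fastq file in a folder whose name
# contains a given signature into one file named after the folder.
concat_folder() {
    local folder=${1%/}       # folder to scan, trailing slash removed
    local signature=$2        # part of the file name to match, e.g. _R1_
    # Assumed output name: <folder>/<folder name>.fastq
    local out="${folder}/$(basename -- "$folder").fastq"
    local files=()
    local f

    # The glob expands in sorted order, so L005 comes before L006.
    for f in "$folder"/*"$signature"*.fastq; do
        [[ -f $f ]] || continue            # skip when the glob matches nothing
        [[ $f == "$out" ]] && continue     # never feed the output back into itself
        files+=("$f")
    done

    # Nothing matched: report failure instead of creating an empty file.
    if (( ${#files[@]} == 0 )); then
        return 1
    fi
    cat -- "${files[@]}" > "$out"
}
```

Called as `concat_folder SonNot24 _R1_`, this would write `SonNot24/SonNot24.fastq`, assuming the signature of interest is `_R1_`.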
Daniel's suggestion is valid, so I will just offer some advice on your question and on general coding:
1. Please use [code][/code] tags around code and data to maintain formatting.
2. Do not use 'ls' to feed a for loop (or generally any type of loop), see here for more details.
3. Even in a short script that may not be around for long, meaningful variable names help readability.
4. Get in the practice of quoting all variables.
5. grep is overkill in this scenario ... Check here and search for regex on the page.
6. In regexes (regular expressions), '.' matches any character. Hence "R1_00." from your code says to look for the string "R1_00" followed by any single character. If you want the string followed by a literal period (.), you need to escape it using either "\." or "[.]".
Hope some of that helps :)
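Points 2 to 5 can be combined in one small loop. The following is only a sketch: the function name `list_r1_files` and the `*R1_00?.fastq` pattern are assumptions chosen to resemble the file names in the question. Note that in a shell glob pattern (unlike a regex) `?` is the "any single character" wildcard and `.` is already a literal dot.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the .fastq files in a directory whose names
# end in "R1_00", one more character, and ".fastq".
list_r1_files() {
    local directory=${1:-.}
    local fastq_path fastq_name

    # Glob instead of $(ls): file names containing spaces survive intact.
    for fastq_path in "$directory"/*.fastq; do
        [[ -e $fastq_path ]] || continue    # an unmatched glob stays literal; skip it
        fastq_name=${fastq_path##*/}        # strip the directory part

        # Pattern matching inside [[ ]] replaces "echo ... | grep -q".
        if [[ $fastq_name == *R1_00?.fastq ]]; then
            printf '%s\n' "$fastq_name"
        fi
    done
}
```

Running `list_r1_files SonNot24` would then print one matching file name per line.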
I never got that regex to work. The grep -q thing works, though it may be overkill.
Re regex: I look for the string 'R1.fastq.gz' in the file name and use:
Code:
if echo "$z" | grep -q "R1.fastq.gz";
Code:
if [[ "$z" == R1"[.]"fastq"[.]"gz ]];
Code:
if [[ "$z" == R1[.]fastq[.]gz ]];
Code:
if [[ "$z" == R1"\."fastq"\."gz ]];
Code:
if [[ "$z" == R1\.fastq\.gz ]];
So, please, anyone :-)
Thanks.
jahn
I found the trick to regex in bash is to assign it to a variable using full quoting ('') and then use the bare variable (one of those times when quotes do not help).
Code:
regex='R1[.]fastq[.]gz'
1. == - this is used to test whether 2 strings are equal, which yours clearly are not, due to all the superfluous characters, i.e. the []
2. If you are going to test an entire string against another, then you might as well use the tests you have but with the plain string:
Code:
[[ "$z" == "R1.fastq.gz" ]]
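To make the difference between these tests concrete, here is a minimal sketch with three hypothetical helper functions (the names are made up for illustration). It assumes bash, where `=~` performs a regex match and an unquoted right-hand side of `==` is treated as a glob pattern.

```shell
#!/usr/bin/env bash
# Hypothetical helpers contrasting the kinds of test discussed above.

# Regex match: the pattern lives in a single-quoted variable and is used
# unquoted on the right-hand side of =~ (quoting it would make it literal).
name_matches_regex() {
    local regex='R1[.]fastq[.]gz'
    [[ $1 =~ $regex ]]
}

# Glob match: with == an unquoted right-hand side is a shell pattern,
# so a leading * is needed to allow text before the signature.
name_matches_glob() {
    [[ $1 == *R1.fastq.gz ]]
}

# Exact comparison: a quoted right-hand side is compared literally,
# so only the whole string "R1.fastq.gz" succeeds.
name_is_exactly() {
    [[ $1 == "R1.fastq.gz" ]]
}
```

For a name such as `sample_R1.fastq.gz`, the regex and glob helpers succeed while the exact comparison fails, which would explain why an `==` test against the bare signature never matches a full file name.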