Script to pull certain characters from a filename and use in a variable?
Hi All,
I have a script that runs another script in batch mode for a folder of files. The script needs the input file name and a 4-letter abbreviation for that file (the -taxon= flag).
Code:
for myfile in *.fas

The files are named like this:
Lottia_gigantea.fas
Aplysia_californica.fas
Mytilus_edulis.fas

I would like the taxon flag to be the first letter of the genus followed by the first three letters of the species, in all caps, like this:
myscript.pl -sequence_file=Lottia_gigantea.fas -taxon=LGIG
myscript.pl -sequence_file=Aplysia_californica.fas -taxon=ACAL
myscript.pl -sequence_file=Mytilus_edulis.fas -taxon=MEDU

Can anyone help me out? Do I need to use xargs for this?
Thanks!
Kevin
Assuming I'm understanding correctly, and that there's always only one underscore separating the filename into two parts, I think this should work.
Code:
for myfile in *.fas
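A minimal sketch of a sed-based approach along these lines, assuming GNU sed (the `\U` in the replacement is a GNU extension that upper-cases what follows) and assuming exactly one underscore in each file name. The function name `taxon_from_name` is made up for illustration.

```shell
# Hypothetical helper: derive the 4-letter taxon code from a file name
# shaped like Genus_species.fas.
# Assumes GNU sed: \U upper-cases the rest of the replacement text.
# If the name does not match the pattern, sed prints it unchanged.
taxon_from_name() {
    printf '%s\n' "$1" | sed -E 's/^(.)[^_]*_(.{3}).*/\U\1\2/'
}

taxon_from_name Aplysia_californica.fas   # prints ACAL
```

The first group captures the first letter of the genus, `[^_]*_` skips the rest of the genus and the underscore, and the second group captures the first three characters of the species.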
Hi,
Below is one way to do it, using awk. Here's my test code:
Code:
[root@linuxr LQ]# cat file
Code:
TAXON=$(echo "$myfile" | awk '{split($0,arr,"_"); print toupper(substr(arr[1],1,1)) toupper(substr(arr[2],1,3))}')
Code:
-taxon=$TAXON
Hope this helps
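Combined with the loop from the original question, the awk line might be used roughly as follows. This is only a sketch: `print_commands` is a hypothetical helper, and it uses `echo` to print each command instead of running myscript.pl, so nothing is actually executed.

```shell
# Dry run: print the command that would be run for each file name given.
print_commands() {
    for myfile in "$@"; do
        # First letter of the genus + first three letters of the species, upper-cased.
        TAXON=$(echo "$myfile" | awk '{split($0,arr,"_"); print toupper(substr(arr[1],1,1)) toupper(substr(arr[2],1,3))}')
        echo "myscript.pl -sequence_file=$myfile -taxon=$TAXON"
    done
}

print_commands Lottia_gigantea.fas Aplysia_californica.fas Mytilus_edulis.fas
```

Once the printed commands look right, calling `print_commands *.fas` and then dropping the `echo` would turn the dry run into the real batch run.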
Suggestion on style:
Code:
for myfile in *.fas; do

The sed answer from David looks like the way to go, but be sure the algorithm you define covers all possible file names.
@David: Nice tip on the "uppercase flag"!
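One way to guard against unexpected file names is to check each name before deriving the abbreviation. A small sketch, assuming valid names look like Genus_species.fas with letters only, exactly one underscore, and at least three letters in the species part; `is_taxon_name` is a hypothetical helper, and the sample names are invented for the demonstration.

```shell
# Hypothetical check: succeed only for names shaped like Genus_species.fas.
# The rule (letters only, one underscore, species of 3+ letters) is an assumption.
is_taxon_name() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z]+_[A-Za-z]{3,}\.fas$'
}

for myfile in Lottia_gigantea.fas Homo_sp.fas Mytilus_edulis_v2.fas; do
    if is_taxon_name "$myfile"; then
        echo "ok: $myfile"
    else
        echo "skipping (unexpected name): $myfile" >&2
    fi
done
```

Names that fail the check are reported on stderr rather than silently producing a wrong or truncated taxon code.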
Frankly, given the Perl prog gets the filename to process, why not do it in there?
Thanks! Analysis running... :)