Need help with script to replace certain text in file with part of the file's name
Hi all,
I have a directory with about 16,000 files with this format:

>LGIG|175428
MSIIIAQTPITYFGSDIQKSLGSLHGFRWAKYPGEKPLPGHNYTGPGISEDKLTALESKL
SDDSEIQKQIVAIQQQLINVVDKTQLQNLSSLISNLDDKITKQKKDLKQLIDNINPGISE
DKLQRELTKFTTELQKEIKNIDDSVIQQQITTINNEVLKQEKNIAALEKNLKEENKSYFN
LPFRNLRDENASISYNIDKSRESEYEKYGITANIIEFFRIQISISKPKAYLMVIVYHIYI
SYTGKIILHKDNIKEIKRSKVGKGTELLKKINIYTGRNCYIPTDGNCFIKCVNHVLNKDL
TNEFKNFIINFPKVNRKRVMTTARINEFNKKCETSFQIHTLKNRNLRPRDVKRELDWVLY
LHNSHFCLIRRNEKNLGIKEIEDNYEQVWKTCRDDNVVTQVSPLKLNVFSNMSDDT
>HROB|174996
MIVAHAPKTYFGSGDIQKSLGSLPGFPWAKYPGEKHLPGHNYTGRGTRLDLRLDENNKPK
PGEEPVNRVDAAALKHDILYRNKDIKFRHEADKQMIIELENIPNPTFKERMERALIIKLL
KAKMKLGTDCIDQMLQRLGKVDQKRLTLISHNGSGFDNWIALQNVKKLTQCPLVVDNKIL
SFPLSNPYTEERLQKKWKRQKEIMSNSNYLQNISFTCSFIHQSTSLAAWGNSSNLPMNLK
KITDVNIAKFTKETWESLRPE

In some of the files there are more or fewer sequences but the definition line always begins with a > symbol. The files are all named like "Moll_10000.fasta", "Moll_10001.fasta" and so on...

I am trying to write a script that reads the name of each file, strips out the number portion of the name ($NUMBER), and replaces all instances of ">" with ">$NUMBER|".

Here is what I tried (but didn't work). Can anyone point me in the right direction? Thanks!!!

Code:
COUNTER=10000
The sed command is incomplete. It should be something like:
Code:
sed -i "s/>/>${COUNTER}|/g" "$i"

Edit: a simpler version for your script could be:

Code:
#!/bin/bash
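For readers following along, here is a minimal sketch of one way to take the number from each file name instead of keeping a separate counter. It is not necessarily the script that was posted in this reply. The function name `prefix_headers` is hypothetical, and the sketch assumes file names of the form Moll_NUMBER.fasta and GNU sed (for `-i` without a backup suffix). Unlike the one-liner above, it only rewrites a ">" at the start of a line, which is where FASTA definition lines begin.

```shell
#!/bin/bash
# Sketch: prefix every FASTA definition line with the number taken from the
# file name, e.g. ">LGIG|175428" in Moll_10000.fasta becomes ">10000|LGIG|175428".
# prefix_headers is a hypothetical helper name; assumes GNU sed for "sed -i".

prefix_headers() {
    local dir=${1:-.}                 # directory holding the files (default: current)
    local f number
    for f in "$dir"/Moll_*.fasta; do
        [ -e "$f" ] || continue       # skip when the glob matched nothing
        number=${f##*_}               # strip through the last "_": 10000.fasta
        number=${number%.fasta}       # strip the extension: 10000
        # Rewrite only a ">" at the start of a line (a definition line).
        sed -i "s/^>/>${number}|/" "$f"
    done
}
```

After sourcing this, something like `prefix_headers /path/to/fasta_dir` would edit the files in place. Running it twice would add the prefix twice, so it is probably worth trying on a copy of a few files first.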
Thanks! Your alternative is much more versatile. I really appreciate the help!
Kevin