LinuxQuestions.org
08-21-2009, 06:38 PM   #1
kmkocot
Member
 
Registered: Dec 2007
Location: Tuscaloosa, AL
Posts: 126

Rep: Reputation: 15
Need help with script to replace certain text in file with part of the file's name


Hi all,

I have a directory with about 16,000 files with this format:

>LGIG|175428
MSIIIAQTPITYFGSDIQKSLGSLHGFRWAKYPGEKPLPGHNYTGPGISEDKLTALESKL
SDDSEIQKQIVAIQQQLINVVDKTQLQNLSSLISNLDDKITKQKKDLKQLIDNINPGISE
DKLQRELTKFTTELQKEIKNIDDSVIQQQITTINNEVLKQEKNIAALEKNLKEENKSYFN
LPFRNLRDENASISYNIDKSRESEYEKYGITANIIEFFRIQISISKPKAYLMVIVYHIYI
SYTGKIILHKDNIKEIKRSKVGKGTELLKKINIYTGRNCYIPTDGNCFIKCVNHVLNKDL
TNEFKNFIINFPKVNRKRVMTTARINEFNKKCETSFQIHTLKNRNLRPRDVKRELDWVLY
LHNSHFCLIRRNEKNLGIKEIEDNYEQVWKTCRDDNVVTQVSPLKLNVFSNMSDDT
>HROB|174996
MIVAHAPKTYFGSGDIQKSLGSLPGFPWAKYPGEKHLPGHNYTGRGTRLDLRLDENNKPK
PGEEPVNRVDAAALKHDILYRNKDIKFRHEADKQMIIELENIPNPTFKERMERALIIKLL
KAKMKLGTDCIDQMLQRLGKVDQKRLTLISHNGSGFDNWIALQNVKKLTQCPLVVDNKIL
SFPLSNPYTEERLQKKWKRQKEIMSNSNYLQNISFTCSFIHQSTSLAAWGNSSNLPMNLK
KITDVNIAKFTKETWESLRPE

In some of the files there are more or fewer sequences but the definition line always begins with a > symbol. The files are all named like "Moll_10000.fasta", "Moll_10001.fasta" and so on...

I am trying to write a script that reads the name of each file, strips out the number portion of the name ($NUMBER), and replaces all instances of ">" with ">$NUMBER|".

Here is what I tried (but it didn't work). Can anyone point me in the right direction? Thanks!!!

Code:
COUNTER=10000
FILES=*.fasta

for i in $FILES
do
sed 's/>/>|$COUNTER/g'
COUNTER=COUNTER+1
done
 
08-21-2009, 06:52 PM   #2
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
The sed command is incomplete. It should be something like:
Code:
sed -i "s/>/>${COUNTER}|/g" "$i"
The -i option edits the file in place (very dangerous without testing), the file name is given as the argument $i, and the double quotes let the shell substitute the variable COUNTER with its actual value. Test it on copies of the original files before modifying them.
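To make the quoting point concrete, here is a minimal sketch that runs sed on a single sample line through a pipe, so no file is modified. The sample line and the starting value of COUNTER are only illustrative. It also shows the arithmetic expansion needed to increment the counter, since a plain COUNTER=COUNTER+1 just assigns the literal string "COUNTER+1".

```shell
#!/bin/bash
# Illustrative values only: one definition line and a starting counter.
COUNTER=10000
line='>LGIG|175428'

# Single quotes: the shell does not expand $COUNTER, so sed inserts it literally.
single=$(printf '%s\n' "$line" | sed 's/>/>$COUNTER|/g')

# Double quotes: the shell expands ${COUNTER} before sed sees the expression.
double=$(printf '%s\n' "$line" | sed "s/>/>${COUNTER}|/g")

echo "$single"   # the literal text $COUNTER ends up in the output
echo "$double"   # the number 10000 ends up in the output

# Arithmetic expansion increments the value.
COUNTER=$((COUNTER + 1))
echo "$COUNTER"
```

Comparing the two echoed lines shows why the original single-quoted command could not have inserted the number.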

Edit: a simpler version of your script could be:
Code:
#!/bin/bash
for file in *.fasta
do
  #
  # extract the five-digit number from the file name
  #
  counter=$(echo "$file" | grep -Eo '[0-9]{5}')
  #
  # edit the file in place
  #
  sed -i "s/>/>${counter}|/g" "$file"
done
Again, test it before executing it on the original files.
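One possible way to do that test, sketched under some assumptions: the script below builds a throwaway directory with mktemp, creates a small made-up file named like the ones in the question, and writes the result to a new file instead of using -i. It extracts the number with bash parameter expansion rather than grep, which is a swapped-in alternative and assumes the names always look like Moll_NUMBER.fasta.

```shell
#!/bin/bash
# Throwaway directory so the real data is never touched.
workdir=$(mktemp -d)

# Hypothetical sample file following the naming scheme from the question.
printf '>LGIG|175428\nMSIIIAQ\n>HROB|174996\nMIVAHAP\n' > "$workdir/Moll_10000.fasta"

for file in "$workdir"/*.fasta
do
  name=${file##*/}          # drop the directory part: Moll_10000.fasta
  counter=${name%.fasta}    # drop the extension:      Moll_10000
  counter=${counter##*_}    # drop up to the last "_": 10000

  # No -i here: the original stays untouched and the output goes to a new file.
  sed "s/>/>${counter}|/g" "$file" > "$file.new"
done

# Inspect the result, then clean up.
result=$(cat "$workdir/Moll_10000.fasta.new")
echo "$result"
rm -r "$workdir"
```

Once the output looks right, the same loop can be pointed at copies of the real files. With GNU sed, -i.bak is another option that keeps a backup of each file.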

Last edited by colucix; 08-21-2009 at 07:05 PM.
 
08-23-2009, 04:06 PM   #3
kmkocot
Member
 
Registered: Dec 2007
Location: Tuscaloosa, AL
Posts: 126

Original Poster
Rep: Reputation: 15
Thanks! Your alternative is much more versatile. I really appreciate the help!

Kevin
 
  

