LinuxQuestions.org

jv61 06-07-2012 08:18 PM

replace multiple patterns from a file
 
Hi all,

I have a fasta file that looks like this

Code:


>consensus_5353
 ACTCGGAGGTCCGATCCAAAGTTTTTCTTTTCAGTGCCGAGTAGAGTTAC
>consensus_5354
 GCCGGTGGTGGCGGAAGCTGGTGGTTGGCCGGCCGGGTCGAGTGGAGGTC
>consensus_5355
 AGCAGGAGTCCCGCCGCCCTCGACCTCTCTTCTTCCGCCGCCGCCCGGTG
>consensus_5356
 AATTAGTGGATTATTTGGTGAGGTGGATGTAGAGTGTAGACCGTATAATT
>consensus_5357
 GCAATACTCAAATTGGAATAGGATGAGCAAGGAAGAGGAAAATGGTGGGG
>consensus_5358
 TAAGATGTTCTGTTAGGGACTCAGAAGAACATCAGAATCACTACTTACGT
>consensus_5359
 TGCTGGACTTGCTCTTAGCCTCCGATCGTCCCTATAGACTTTTGGCCTTT
>consensus_5360
 ACGTGCCGCACGTGGGGTACAAAACCACCGCGGCGTAGGAGACGTCAAAA
>consensus_5361
 TCGGCGGGGAGCAGCTAGTAACGCGCATTAACACGAGCAAATCCTAGAGA

I have another file with the list of patterns. My pattern list file looks like this

Code:

>consensus_5353
>consensus_5357
>consensus_5359
>consensus_5361

What I want to do is find each of these patterns in my fasta file and delete both the matching header line and the sequence line that follows it. Any thoughts on how to do this with sed, awk, grep, or perl?

I know how to use sed to delete a single pattern, but how do I delete multiple patterns listed in a file without writing something like this?

sed -e '/pattern/d' -e '/pattern/d' file

Thanks in advance.

mreff555 06-07-2012 08:52 PM

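sed 's/pattern//g' input > output

Probably not the best way to do it, but it works for a single pattern. Note that it only strips the matched text: the emptied header line and the sequence line after it stay in the output.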

bsat 06-08-2012 01:48 AM

Here is a script that might help.

Please note that I have used sed with "-i", so it will change your actual file in place. It is best to try this script on a backup copy first to make sure nothing goes wrong.


Code:


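#!/bin/bash
# For each header listed in the file "pattern", delete that header line and
# the sequence line after it from the file "fasta" (edited in place).
while IFS= read -r line
do
    # Anchor with ^ and $ so a header cannot match as a prefix of a longer one;
    # N appends the next line, d deletes both lines without leaving a blank line.
    sed -i "/^${line}\$/{N;d;}" fasta
done < pattern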
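
The loop above runs sed once per pattern and rewrites the file each time. As a possible alternative, here is a minimal single-pass sketch using awk. The function name filter_fasta and the output file name fasta.filtered are made up for illustration. It assumes the headers in the pattern file match the fasta headers exactly and that the pattern file is not empty.

```shell
#!/bin/bash
# filter_fasta PATTERN_FILE FASTA_FILE   (hypothetical helper name)
# Prints FASTA_FILE to stdout without the records whose header line
# appears verbatim in PATTERN_FILE. Neither input file is modified.
filter_fasta() {
    awk '
        NR == FNR { drop[$0]; next }        # first file: remember each header to remove
        /^>/      { skip = ($0 in drop) }   # header line: decide whether to skip this record
        !skip                               # print only lines that belong to kept records
    ' "$1" "$2"
}
```

It could be used as: filter_fasta pattern fasta > fasta.filtered. Because the skip flag only changes on header lines, this also covers records whose sequence spans more than one line, and the original file is left untouched.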


jv61 06-08-2012 04:49 AM

Many thanks for the replies. The above script solved my query.
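
For anyone who would rather stay with plain sed than chain many -e options as in the original question, one possible approach is to generate a sed script from the pattern list and pass it with -f. This is only a sketch: the function name build_delete_script and the file names delete.sed and fasta.filtered are made up for illustration. It assumes every record is exactly two lines (header plus one sequence line) and that the headers contain no "/" or regex metacharacters such as "." or "[".

```shell
#!/bin/bash
# build_delete_script PATTERN_FILE   (hypothetical helper name)
# Turns every non-empty line of PATTERN_FILE into a sed command of the form
#   /^header$/{N;d;}
# which deletes the header line together with the line that follows it.
build_delete_script() {
    sed -n '/./ s|.*|/^&$/{N;d;}|p' "$1"
}
```

Usage would look like: build_delete_script pattern > delete.sed, followed by sed -f delete.sed fasta > fasta.filtered. The fasta file is read only once, and writing to a new file avoids the risk that comes with -i.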


All times are GMT -5.