kmkocot |
10-28-2009 06:03 PM |
sed or awk help - need to remove text on each line before a regular expression
Hi all,
I have a file that looks like this:
Code:
>14219|LGIG|61640
MSFEFTIPINLDCLLSKTNVSQYVVEEVLPLRIIPGAVQDFKFAVRNDNFA
>14237|LGIG|86853
PPAGPQQPMVSPNKIVNAATFCRFGQEYIHEIITKATEIFGS
>14286|LGIG|234779
MYIASFVLKMVSNRFLVKVAIGGAIFTLTSISGMKIYIENKFQRQDFYLKSMDLL
>14297|LGIG|139771
QENQSDISQALNQQSDLIEGIYEGGLTIWECGIDLVNYLI
I want to remove the first two pipe-delimited fields (but not the greater-than symbol) from each header line, i.e. every other line, so that my file looks like this:
Code:
>61640
MSFEFTIPINLDCLLSKTNVSQYVVEEVLPLRIIPGAVQDFKFAVRNDNFA
>86853
PPAGPQQPMVSPNKIVNAATFCRFGQEYIHEIITKATEIFGS
>234779
MYIASFVLKMVSNRFLVKVAIGGAIFTLTSISGMKIYIENKFQRQDFYLKSMDLL
>139771
QENQSDISQALNQQSDLIEGIYEGGLTIWECGIDLVNYLI
Could anyone help me set up a script to make this happen? It seems like sed and awk could both do this easily, but I can't figure out the syntax to tell either one to remove everything after the > up to and including the |LGIG| part.
Thanks!
Kevin
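A minimal sketch of one way this could be done, not a definitive answer. It assumes every header line starts with > and that the number to keep is always the last pipe-delimited field. The function names strip_header_sed and strip_header_awk are made up for illustration.

```shell
# Hypothetical helper: on lines starting with '>', delete everything
# from just after the '>' through the last '|'. The greedy '.*' runs
# to the final '|' on the line; sequence lines do not match and pass
# through unchanged.
strip_header_sed() {
    sed 's/^>.*|/>/'
}

# Hypothetical awk equivalent: split on '|' and, for header lines,
# print '>' followed by the last field. Other lines print as-is.
strip_header_awk() {
    awk -F'|' '/^>/ { print ">" $NF; next } { print }'
}

# Try both on the first record from the sample above.
printf '%s\n' '>14219|LGIG|61640' 'MSFEFTIPINLDCLLSKTNVSQYVVEEVLPLRIIPGAVQDFKFAVRNDNFA' | strip_header_sed
printf '%s\n' '>14219|LGIG|61640' 'MSFEFTIPINLDCLLSKTNVSQYVVEEVLPLRIIPGAVQDFKFAVRNDNFA' | strip_header_awk
```

On a real file, either function could be used as a filter, e.g. strip_header_sed < input.fasta > output.fasta, where both file names are placeholders. If some headers carry a different number of fields, the "last field" assumption should be checked first.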