netpumber |
06-19-2014 08:20 AM |
Finding specific text from a file that is within specific symbols
Hello. I have a big file that contains DNA sequences like this:
Quote:
>gnl|SRA|SRR035295.82647.2 FIHSSUW02I7CY6.2 length=269
TAGAGACCGAGGCGGCCGACATGTTTTGTTTTTTTTTCTTTTTTTTTTCCGTCCAACATGGAATGATTGG
TACGCATCTGCAAATTCTTTGGATGTCACAAATCTGTATGGTGCGTCTCTTCTCATCCAGTATTGCTCCT
GATCTTTTTTTGAAGTCACTTCTTGTAAGAAATCAGCAACGCCTTTCCTTGCAGGGCATTTAAATCCCAT
TGACTCAAAAAACTCAAGGACGTGTTCACGTGGGCCTTGATATACAATTTTGCCATCAG
>gnl|SRA|SRR035295.4505.2 FIHSSUW02H007H.2 length=250
AAGCAGTGGTATCAACGAGAGTGGCCATTACGGCCGGGTCCTCCAAGTTCGGGAAAGACAACACTTTTGA
TGGCCTTGGCTGGCACACTTGCAAAAGAGCTTAAGAGTTCGGGTAAAGTAACATATAATGGGCATGAGTT
ACATGAGTTTGTACCTGAAAGAACTGCTGCTTATATCAGCCAGAATGATCTCCATATTGGAGAAATGACT
GTAAGAGAAACATTGGCTTTCTCTGCAAGATGTCAAGGCG
>gnl|SRA|SRR035296.68126.2 FIQ4L3X01D8M3K.2 length=259
AAACATAATTATACCCTTGCTGAACTCGGCACCAATACTTTGCTTGATCTTTTCTTGAGACAACCTCTTG
GAGGAAATCGGCAACGCCTTTTCTTTCAGGGCACCTAAAACCAAAACCCTCAAAATACTCTAATACATCA
CTTCTCGGCCCATGATAAATAATGACTCCTTCTGCCATCAACATAATGTCATCAAAGAGATCAAATACTT
CTGGTGCTGGTTGAAGAAGTGAAATGACCACAGAAGCGTCTGTTATATG
>gnl|SRA|SRR035294.13646.2 FIHSSUW01ERMVS.2 length=248
AGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGGTCCTCCAAGTTCGGGAAAGACAACACTTTTGA
TGGCCTTGGCTGGCACACTTGCAAAAGAGCTTAAGAGTTCGGGTAAAGTAACATATAATGGGCATGAGTT
ACATGAGTTTGTACCTGAAAGAACTGCTGCTTATATCAGCCAGAATGATCTCCATATTGGAGAAATGACT
GTAAGAGAAACATTGGCTTTCTCTGCAAGATGTCAAGG
>gnl|SRA|SRR035296.38443.2 FIQ4L3X01BO4OB.2 length=249
AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGGTCCTCCAAGTTCGGGAAAGACAACACTTTTG
ATGGCCTTGGCTGGCACACTTGCAAAAGAGCTTAAGAGTTCGGGTAAAGTAACATATAATGGGCATGAGT
TACATGAGTTTGTACCTGAAAGAACTGCTGCTTATATCAGCCAGAATGATCTCCATATTGGAGAAATGAC
TGTAAGAGAAACATTGGCTTTCTCTGCAAGATGTCAAGG
>gnl|SRA|SRR035296.36031.2 FIQ4L3X01DKY6J.2 length=249
AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGGTCCTCCAAGTTCGGGAAAGACAACACTTTTG
ATGGCCTTGGCTGGCACACTTGCAAAAGAGCTTAAGAGTTCGGGTAAAGTAACATATAATGGGCATGAGT
TACATGAGTTTGTACCTGAAAGAACTGCTGCTTATATCAGCCAGAATGATCTCCATATTGGAGAAATGAC
TGTAAGAGAAACATTGGCTTTCTCTGCAAGATGTCAAGG
>gnl|SRA|SRR035295.53565.2 FIHSSUW02J2P5E.2 length=249
AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGGTCCTCCAAGTTCGGGAAAGACAACACTTTTG
ATGGCCTTGGCTGGCACACTTGCAAAAGAGCTTAAGAGTTCGGGTAAAGTAACATATAATGGGCATGAGT
TACATGAGTTTGTACCTGAAAGAACTGCTGCTTATATCAGCCAGAATGATCTCCATATTGGAGAAATGAC
TGTAAGAGAAACATTGGCTTTCTCTGCAAGATGTCAAGG
>gnl|SRA|SRR035294.113925.2 FIHSSUW01BDZ3B.2 length=249
AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGGTCCTCCAAGTTCGGGAAAGACAACACTTTTG
ATGGCCTTGGCTGGCACACTTGCAAAAGAGCTTAAGAGTTCGGGTAAAGTAACATATAATGGGCATGAGT
TACATGAGTTTGTACCTGAAAGAACTGCTGCTTATATCAGCCAGAATGATCTCCATATTGGAGAAATGAC
TGTAAGAGAAACATTGGCTTTCTCTGCAAGATGTCAAGG
>gnl|SRA|SRR035294.94312.2 FIHSSUW01EADXQ.2 length=249
AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGGTCCTCCAAGTTCGGGAAAGACAACACTTTTG
ATGGCCTTGGCTGGCACACTTGCAAAAGAGCTTAAGAGTTCGGGTAAAGTAACATATAATGGGCATGAGT
TACATGAGTTTGTACCTGAAAGAACTGCTGCTTATATCAGCCAGAATGATCTCCATATTGGAGAAATGAC
TGTAAGAGAAACATTGGCTTTCTCTGCAAGATGTCAAGG
>gnl|SRA|SRR035294.74028.2 FIHSSUW01E2UJV.2 length=249
AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGGTCCTCCAAGTTCGGGAAAGACAACACTTTTG
ATGGCCTTGGCTGGCACACTTGCAAAAGAGCTTAAGAGTTCGGGTAAAGTAACATATAATGGGCATGAGT
TACATGAGTTTGTACCTGAAAGAACTGCTGCTTATATCAGCCAGAATGATCTCCATATTGGAGAAATGAC
TGTAAGAGAAACATTGGCTTTCTCTGCAAGATGTCAAGG
|
I want to print out one sequence by its id, e.g. the sequence with id = SRR035294.94312.2.
As you can see, every sequence has an id, and each record starts with a > symbol.
Is there any way to read the file (with cat or something similar) and print only the record that contains the given id, i.e. the text from its > line up to, but not including, the next > line?
The output I want to get back is:
Quote:
>gnl|SRA|SRR035294.94312.2 FIHSSUW01EADXQ.2 length=249
AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGGTCCTCCAAGTTCGGGAAAGACAACACTTTTG
ATGGCCTTGGCTGGCACACTTGCAAAAGAGCTTAAGAGTTCGGGTAAAGTAACATATAATGGGCATGAGT
TACATGAGTTTGTACCTGAAAGAACTGCTGCTTATATCAGCCAGAATGATCTCCATATTGGAGAAATGAC
TGTAAGAGAAACATTGGCTTTCTCTGCAAGATGTCAAGG
|
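For illustration, here is a minimal sketch of one way this could be done in Python. It assumes the file looks like the sample above: each record starts with a line beginning with > and the id is one of the |-separated fields in the first word of that line. The function name `find_record` and the file name `sequences.fasta` are made up for this example.

```python
def find_record(lines, seq_id):
    """Return the header and sequence lines of the first record whose id equals seq_id.

    `lines` can be any iterable of text lines, e.g. an open file, so the
    whole file never has to be loaded into memory.
    """
    record = []
    inside = False  # True while we are reading the wanted record
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if inside:
                break  # reached the next header, so the wanted record is complete
            # Compare whole fields ("gnl", "SRA", "SRR035294.94312.2") so that
            # a shorter id such as SRR035294.9 does not match by accident.
            tokens = line[1:].split()
            fields = tokens[0].split("|") if tokens else []
            inside = seq_id in fields
        if inside:
            record.append(line)
    return record


# Possible usage (sequences.fasta is a placeholder file name):
# with open("sequences.fasta") as handle:
#     print("\n".join(find_record(handle, "SRR035294.94312.2")))
```

Because the loop stops at the first > line after the match, it does not need to read the rest of a big file once the record has been found. If no header carries the id, the result is an empty list.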
Thank you.
|