Write to file under certain conditions
Hello,
I have a file with 2 columns looking like this:

>GENE1 ACGGTTAGAGCCCAGAGTTGAGACCCGTGGAG
>GENE2 NACCCCGATCGTACGRRSTVACCCGA
>GENE3 TGCGAGCNNTTTSSR
>GENE4 CGATGCTGCGCGATCTCTAGAGAGCCCAG

I want to obtain 2 files. One file with the rows of which column 2 contains only A's, C's, T's or G's. And another file with the rows of which column 2 also contains characters other than A's, C's, T's or G's.

So in this case:

File 1:
>GENE1 ACGGTTAGAGCCCAGAGTTGAGACCCGTGGAG
>GENE4 CGATGCTGCGCGATCTCTAGAGAGCCCAG

File 2:
>GENE2 NACCCCGATCGTACGRRSTVACCCGA
>GENE3 TGCGAGCNNTTTSSR

I really tried several things, but nothing worked :-(.

Thanks in advance!
Code:
grep -E '^[^ ]+ +[ACTG]+ *$' file > file1    # A, C, T, G only
Code:
grep -E '^[^ ]+ +[^ ]*[^ACTG][^ ]* *$' file > file2    # at least one other character
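A small variation on the same idea, offered as a sketch rather than the only way to do it: instead of writing a second pattern, the first match can be inverted with `-v`. It assumes the input is in a file named `file` and that the outputs go to `file1` and `file2` (all three names are placeholders). One difference to be aware of: with `-v`, any line that does not fit the two-column form at all also lands in `file2`.

```shell
# Recreate the sample input from the question (file names are placeholders).
printf '%s\n' \
  '>GENE1 ACGGTTAGAGCCCAGAGTTGAGACCCGTGGAG' \
  '>GENE2 NACCCCGATCGTACGRRSTVACCCGA' \
  '>GENE3 TGCGAGCNNTTTSSR' \
  '>GENE4 CGATGCTGCGCGATCTCTAGAGAGCCCAG' > file

# Rows whose second column consists of A, C, G, T only.
grep -E '^[^ ]+ +[ACGT]+ *$' file > file1

# Every other row: -v inverts the same match.
grep -Ev '^[^ ]+ +[ACGT]+ *$' file > file2
```

`grep -E` is used here in place of `egrep`, which recent GNU grep releases flag as obsolescent.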
Another suggestion using awk:
Code:
awk '{ if ($2 ~ /[^ACGT]/) print > "file2"; else print > "file1" }' file
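For anyone who wants to try the awk approach end to end, here is a minimal sketch that runs it on the sample data from the question. It uses the same test on `$2` but picks the output file with a conditional expression inside the redirection; the names `genes.txt`, `clean.txt` and `ambiguous.txt` are made up for illustration. As in the one-liner above, a row with an empty second column would count as "clean".

```shell
# Sample input from the question, written to a hypothetical genes.txt.
cat > genes.txt <<'EOF'
>GENE1 ACGGTTAGAGCCCAGAGTTGAGACCCGTGGAG
>GENE2 NACCCCGATCGTACGRRSTVACCCGA
>GENE3 TGCGAGCNNTTTSSR
>GENE4 CGATGCTGCGCGATCTCTAGAGAGCCCAG
EOF

# One pass over the file: if column 2 contains any character other than
# A, C, G or T, the row goes to ambiguous.txt, otherwise to clean.txt.
# The parentheses around the target expression keep it portable across awks.
awk '{ print > ($2 ~ /[^ACGT]/ ? "ambiguous.txt" : "clean.txt") }' genes.txt

# Quick check of how many rows ended up in each file.
wc -l clean.txt ambiguous.txt
```

Because awk reads the input only once, this scales a bit better than two separate grep passes on large files.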
Thank you both very much. This helped me a lot!