Truncating lines using sed
Hi
I am trying to truncate some sequence names to only the first 9 characters using sed. The lines with the names are marked by a leading ">". For example, I need this:
>H.neanderthalensis
ATGAAATTCACGCTCAGCTCGATCGCTAGCTAGC
>R.norweignensis
ATGCTCGCTCGATCGCTAGCTCGATCGCTAGCTC
to be truncated to this:
>H.neander
ATGAAATTCACGCTCAGCTCGATCGCTAGCTAGC
>R.norweig
ATGCTCGCTCGATCGCTAGCTCGATCGCTAGCTC
This should be simple. Any suggestions?
Code:
sed -e 's/\(>.........\).*/\1/'
Using extended regular expressions, you can match the leading > plus the 9 characters after it, capture that part in parentheses, and replace the whole line with the captured group. Here we go:
Code:
sed -r 's/(^>.{9}).*/\1/' file
colucix's solution is actually more precise. The ^ character anchors the regular expression to the beginning of the line, so only lines that start with > are touched. The -r option enables extended regular expressions, which allow the more concise .{9} and unescaped parentheses.
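For anyone who wants to check this before running it on a real file, here is a small self-contained sketch using the two records from the question. The function name truncate_headers is just made up for illustration, and I have used -E instead of -r on the assumption that you may not be on GNU sed (GNU sed accepts both spellings, BSD sed only -E).

```shell
# Sample input copied from the question (two FASTA-style records).
input='>H.neanderthalensis
ATGAAATTCACGCTCAGCTCGATCGCTAGCTAGC
>R.norweignensis
ATGCTCGCTCGATCGCTAGCTCGATCGCTAGCTC'

# truncate_headers is a hypothetical helper name, not part of sed.
# It keeps ">" plus the next 9 characters on header lines and drops the rest.
# Lines that do not start with ">" (the sequences) do not match and pass through.
truncate_headers() {
    sed -E 's/^(>.{9}).*/\1/'
}

result=$(printf '%s\n' "$input" | truncate_headers)
printf '%s\n' "$result"
```

Headers that are already 9 characters or shorter do not match the pattern at all, so they are left unchanged rather than mangled.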