Script to count # of chars per line (if line meets certain criteria) and get avg #?
Hi all,
I have several files with many lines, something like this:

>Contig155; 141 117 387 minus strand; translated
MVRQEVGRYSYNPLAQCRCFTGLRKLGNLISTEVEFTLVSIPKPCSISVLGTLWVQNLGTHTFLTGAGQSQGRDKKKGDDAGFHVVDVR
>Contig154 188 1 440 ; translated
SFSSFLHSHLEESFKMVRMNVLADALKSICNAEKRGKRQVLIRPCSKVIVKFLTVMMKHGYIGEFEIVDDHRNGKIVVNLTGRINKCGVISPRFDVALRD
LETWTTNLLPSRQFGFLVLTTSGGIMDHEEARRKHLGGKILGFFF
>Contig153 27 1 184 ; translated
AILTADGAAAGHHAPHVLYNFFAIDLGXCTVDDAVGFIXALVYRDVVLYRNAPDNVYVAV
>Contig152 193 1 544 ; translated
LPVSSDQGKMPLPHYEEWGLALVGFAGAIGFNIVKRRPPYARIHMHAIGAVGGYLLGGIVHNDWERRARAEKIYIEDYVRLHPEDFVEPPPKLYKDVFYL
GLQFEEHSQKKRLQINCVRVIVTSEKAEAIVNVTLVWCGQCEYGYSSFQSRSSTNATGRAILFITLSRKTLEHCEYMYS
>Contig151 -4 1 311 ; translated
WHCRPWAPGVLEYSVLFTRFFALINNVIHHDFLAQAGLNYIQYSSICARAVRRCLKGDFKIEALRREDSVIKTNRWEGGKVVKREAGQHVEHLPEVIQAD
LV
>Contig150 179 1 407 ; translated
FPRYKRLRYVGTICIGVFAVIALVVAQDSTTMSGTTSGGSGTTQKSSFVMMTGIATVTVLDLEDSAVRTTTCATIAKPQPKPAVLLKGTVWKAVVIIMSS
LVSVSMDNVDVRTLVEXGKMKERADICNGDIVKC
>Contig149; 156 313 471 minus strand; translated
MARFLNNPSSCFTLFLSYGMPLITVTMRQAANSTTMALNILDSCVGTVLNTNQ

I'm trying to write a script that will count the number of characters on each line that doesn't contain a ">" symbol and give me the average of those values. I have most of the script together, but I can't figure out how to connect some of the steps. Can anyone help? Thanks!!!
Code:
grep -v \> Chaetoderma_nitidulum | wc -m | ?? divide output of wc -m by value returned by this command: grep -c \> file.txt ??
Well - you could try just 1 command - awk is quite powerful....
Code:
awk '!/^>/ { lines++; total += length($0) } END { if (lines) printf("\n Average %.2f\n", total/lines) }' Chaetoderma_nitidulum
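To unpack what the one-liner does, here is a rough sketch of the same idea spread over several lines and wrapped in a shell function. The function name avg_line_length and the sample data are made up for illustration and are not from the thread. This version counts length($0), i.e. the whole line, so a sequence line that happens to contain a space is still counted in full, and it guards against a file with no sequence lines.

```shell
#!/bin/sh
# avg_line_length is a hypothetical helper name, not something from the thread.
# It prints the mean number of characters per non-header line of one file.
avg_line_length() {
    awk '
        /^>/ { next }                      # skip header lines that start with ">"
        { lines++; total += length($0) }   # count the whole line, not just field 1
        END {
            if (lines > 0)
                printf("%.2f\n", total / lines)
            else
                print "no sequence lines found"
        }
    ' "$1"
}

# Made-up sample data in the same layout as the files in the question.
sample=$(mktemp)
printf '%s\n' \
    '>ContigA 1 1 10 ; translated' \
    'MVRQ' \
    '>ContigB 1 1 10 ; translated' \
    'SFSSFL' \
    'HS' > "$sample"

avg_line_length "$sample"    # sequence lines are 4, 6 and 2 chars long: prints 4.00
rm -f "$sample"
```

On a real file the call would look like avg_line_length Chaetoderma_nitidulum.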
Quote:
Also, very important -- have you considered writing a formal script to deal with this? Doing it in a single line is going to be very difficult, especially the part where you test and then have to edit the line using limited Bash editing features.
adamben, that did it!
Sorry if I was unclear. I wanted to get the average of the number of characters per line (not including lines containing ">"). I figured I was just missing a simple step in what I was trying to do above, so I wasn't going to worry about writing a script file. Thanks!
Kevin
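For anyone who wants to finish the original grep/wc approach instead of switching to awk entirely, the missing step is probably command substitution: capture each count in a variable, then divide. The sketch below is only one way to do it, and the function name avg_via_grep_wc and the sample data are assumptions for illustration. Two details differ from the attempt in the first post: the newlines are stripped before wc -m so they are not counted as characters, and the divisor is the number of lines without ">" (grep -vc) rather than the number of header lines, since the goal is an average per line.

```shell
#!/bin/sh
# avg_via_grep_wc is a hypothetical name used only for this sketch.
avg_via_grep_wc() {
    file=$1
    # Characters on lines without ">", with newlines removed so they are not counted.
    chars=$(grep -v '>' "$file" | tr -d '\n' | wc -m)
    # Number of lines without ">".
    lines=$(grep -vc '>' "$file")
    # Shell arithmetic is integer-only, so the division is done in awk.
    awk -v c="$chars" -v n="$lines" 'BEGIN { if (n > 0) printf("%.2f\n", c / n) }'
}

# Made-up sample data in the same layout as the files in the question.
sample=$(mktemp)
printf '%s\n' \
    '>ContigA 1 1 10 ; translated' \
    'MVRQ' \
    '>ContigB 1 1 10 ; translated' \
    'SFSSFL' \
    'HS' > "$sample"

avg_via_grep_wc "$sample"    # 12 characters over 3 lines: prints 4.00
rm -f "$sample"
```

This reads the file twice, so the single awk command is still the simpler choice for large files.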