LinuxQuestions.org


kmkocot 09-11-2009 10:42 PM

Script to count # of chars per line (if line meets certain criteria) and get avg #?
 
Hi all,

I have several files with many lines something like this:

>Contig155; 141 117 387 minus strand; translated
MVRQEVGRYSYNPLAQCRCFTGLRKLGNLISTEVEFTLVSIPKPCSISVLGTLWVQNLGTHTFLTGAGQSQGRDKKKGDDAGFHVVDVR
>Contig154 188 1 440 ; translated
SFSSFLHSHLEESFKMVRMNVLADALKSICNAEKRGKRQVLIRPCSKVIVKFLTVMMKHGYIGEFEIVDDHRNGKIVVNLTGRINKCGVISPRFDVALRDLETWTTNLLPSRQFGFLVLTTSGGIMDHEEARRKHLGGKILGFFF
>Contig153 27 1 184 ; translated
AILTADGAAAGHHAPHVLYNFFAIDLGXCTVDDAVGFIXALVYRDVVLYRNAPDNVYVAV
>Contig152 193 1 544 ; translated
LPVSSDQGKMPLPHYEEWGLALVGFAGAIGFNIVKRRPPYARIHMHAIGAVGGYLLGGIVHNDWERRARAEKIYIEDYVRLHPEDFVEPPPKLYKDVFYLGLQFEEHSQKKRLQINCVRVIVTSEKAEAIVNVTLVWCGQCEYGYSSFQSRSSTNATGRAILFITLSRKTLEHCEYMYS
>Contig151 -4 1 311 ; translated
WHCRPWAPGVLEYSVLFTRFFALINNVIHHDFLAQAGLNYIQYSSICARAVRRCLKGDFKIEALRREDSVIKTNRWEGGKVVKREAGQHVEHLPEVIQADLV
>Contig150 179 1 407 ; translated
FPRYKRLRYVGTICIGVFAVIALVVAQDSTTMSGTTSGGSGTTQKSSFVMMTGIATVTVLDLEDSAVRTTTCATIAKPQPKPAVLLKGTVWKAVVIIMSSLVSVSMDNVDVRTLVEXGKMKERADICNGDIVKC
>Contig149; 156 313 471 minus strand; translated
MARFLNNPSSCFTLFLSYGMPLITVTMRQAANSTTMALNILDSCVGTVLNTNQ

I'm trying to write a script that will count the number of characters on each line that doesn't contain a ">" symbol and give me the average of those values. I have most of the script together but I can't figure out how to connect some of the steps. Can anyone help? Thanks!!!

Code:

grep -v \> Chaetoderma_nitidulum | wc -m | ?? divide output of wc -m by value returned by this command: grep -c \> file.txt ??
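One way the missing step could be filled in, as a rough sketch only. The function name avg_len_grep is made up for illustration. Two details differ from the pipeline above: wc -m also counts the newline at the end of every line, so tr strips those first, and the divisor is the number of lines without a ">" (grep -vc), which matches the number of ">" lines only when every header is followed by exactly one sequence line.

```shell
# avg_len_grep is a hypothetical helper name, not part of the original post.
avg_len_grep() {
    file=$1
    # Characters on lines without ">"; tr removes the newlines wc -m would otherwise count.
    chars=$(grep -v '>' "$file" | tr -d '\n' | wc -m)
    # Number of lines without ">" (-v inverts the match, -c counts lines).
    lines=$(grep -vc '>' "$file")
    # The shell only does integer arithmetic, so awk performs the division.
    awk -v c="$chars" -v n="$lines" 'BEGIN { if (n > 0) printf "%.2f\n", c / n }'
}
```

It would be called as: avg_len_grep Chaetoderma_nitidulum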

adamben 09-12-2009 12:09 AM

Well, you could do it with a single command. awk is quite powerful:

Code:

awk '!/^>/ { lines++; total += length($0) } END { if (lines) printf("\n Average %.2f\n", total / lines) }' Chaetoderma_nitidulum
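Since the question mentions several files, the same idea can be stretched to report one average per file. This is only a sketch: the function name avg_len_per_file is hypothetical, and awk does not guarantee the order in which the files are printed.

```shell
# avg_len_per_file is a hypothetical helper name used for illustration.
avg_len_per_file() {
    awk '
        # For lines that do not start with ">", keep a line count and a
        # running character total for each input file.
        !/^>/ { n[FILENAME]++; total[FILENAME] += length($0) }
        # Print "file<TAB>average" for every file that had such lines.
        END   { for (f in n) printf "%s\t%.2f\n", f, total[f] / n[f] }
    ' "$@"
}
```

It takes any number of file names, for example: avg_len_per_file file1 file2 | sort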

lutusp 09-12-2009 12:10 AM

Quote:

Originally Posted by kmkocot (Post 3679556)
Hi all,

I have several files with many lines something like this:

[ snip ... ]

I'm trying to write a script that will count the number of characters on each line that doesn't contain a ">" symbol and give me the average of those values. I have most of the script together but I can't figure out how to connect some of the steps. Can anyone help? Thanks!!!

Code:

grep -v \> Chaetoderma_nitidulum | wc -m | ?? divide output of wc -m by value returned by this command: grep -c \> file.txt ??

Please say what "average of those values" means. The average of the ASCII character code numbers for the characters? And do you want an average for each line, or for the entire file?

Also, very important -- have you considered writing a formal script to deal with this? Doing it in a single line is going to be very difficult, especially the part where you test and then have to edit the line using limited Bash editing features.
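For anyone who does want a script file, here is a minimal sketch in plain POSIX shell. It assumes the goal is the mean length of the lines that contain no ">", and the function name avg_len_loop is made up for illustration. The shell only has integer arithmetic, so the result is truncated to two decimal places rather than rounded.

```shell
# avg_len_loop is a hypothetical name; it assumes "average" means the mean
# number of characters on lines that contain no ">".
avg_len_loop() {
    total=0
    count=0
    # Read line by line; the extra test keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *'>'*) continue ;;   # skip header lines
        esac
        total=$((total + ${#line}))
        count=$((count + 1))
    done < "$1"
    if [ "$count" -eq 0 ]; then
        echo "no lines without '>' in $1" >&2
        return 1
    fi
    # Integer part, then the first two decimal digits (truncated).
    printf '%d.%02d\n' $((total / count)) $((total * 100 / count % 100))
}
```

Saved in a file and sourced, it could be called as: avg_len_loop Chaetoderma_nitidulum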

kmkocot 09-13-2009 11:05 AM

adamben, that did it!

Sorry if I was unclear. I wanted to get the average of the number of characters per line (not including lines containing ">"). I figured I was just missing a simple step in what I was trying to do above so I wasn't going to worry about writing a script file.

Thanks!
Kevin

