Once again... awk.. awk... awk
I have a file file1, containing 50000 entries (numerical floating point numbers only).
And I am computing:
1. The total number of lines
2. The number of lines containing values (i.e. val1) that are less than 1
3. The number of lines containing values (i.e. val2) that are greater than 1
4. The percentage of both val1 and val2.
And I did (Note: all of the code below is part of a script which is generating file1): Code:
#sum
BTW, LQ has always been so helpful to me. In fact, I am in the learning phase of awk, so I could apply what I've learned so far. But I am still expecting your help again :) |
So we meet again ;)
Is there a specific reason why you use multiple awk statements? All your requirements can be done with one awk statement: Code:
awk 'BEGIN{val1=0;val2=0}/^[\.0]\./{val1++}/^[1-9]/{val2++}END{ print "sum: ",NR, "val1: ", val1, "("val1*100/NR"%)", "val2: ", val2, "("val2*100/NR"%)"}' infile Code:
awk 'BEGIN{ Code:
sum: 13 val1: 7 (53.8462%) val2: 6 (46.1538%) |
As druuna has ably answered the important question, just let me add a correction about some wasted code:
Code:
# below is a useless use of echo |
@shivaa: I noticed you read grail's and my reply. If this is solved can you put up the [SOLVED] tag...
BTW: If you ever do need the calculated values outside of awk you can do the following: Code:
#!/bin/bash |
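For readers following along: one common way to get several awk results back into shell variables is to have awk print them on one line and let `read` split them. This is only a minimal sketch of that idea, not necessarily the script posted above; the file name `infile.txt` and its sample contents are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample data standing in for infile (one number per line).
printf '%s\n' 0.25 1.50 0.75 2.00 0.10 > infile.txt

# awk prints the three counts on a single line.
# "+ 0" forces a numeric 0 if a counter was never incremented.
counts=$(awk '/^0\./ { val1++ } /^[1-9]/ { val2++ } END { print NR, val1 + 0, val2 + 0 }' infile.txt)

# read splits that line on whitespace into three shell variables.
read -r sum val1 val2 <<EOF
$counts
EOF

echo "sum=$sum val1=$val1 val2=$val2"   # prints: sum=5 val1=3 val2=2
```

The file is read only once, and the values are then ordinary shell variables for the rest of the script.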
Thanks @druuna and @grail. I actually have not tested it yet, that's why I kept this post unsolved.
|
Code:
# below is a useless use of echo Code:
val2=$(awk '/^[1-9]/ {val2++} END{ print val2}' file1) Code:
prctg2=$(awk 'BEGIN{print $val2*100/$sum}' val2=$val2 sum=$sum) |
Code:
awk 'BEGIN{
1. Can I make the following changes instead of using patterns (assuming that infile has only floating point numbers)? Code:
$1 < 1 { val1++ } # less than 1 |
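For what it's worth, here is a minimal sketch of what the numeric-comparison variant could look like as a complete command. It assumes every line holds exactly one number; `sample.txt` and its contents are invented for the example. Note that a value exactly equal to 1 is counted in neither group, which follows from the "less than 1" / "greater than 1" wording of the question.

```shell
# Hypothetical sample data standing in for file1 (one number per line).
printf '%s\n' 0.25 1.50 0.75 2.00 0.10 > sample.txt

result=$(awk '
    $1 < 1 { val1++ }   # values less than 1
    $1 > 1 { val2++ }   # values greater than 1
    END {
        if (NR == 0) { print "no data"; exit }   # avoid dividing by zero
        printf "sum: %d val1: %d (%.2f%%) val2: %d (%.2f%%)\n",
               NR, val1, val1 * 100 / NR, val2, val2 * 100 / NR
    }
' sample.txt)

echo "$result"   # prints: sum: 5 val1: 3 (60.00%) val2: 2 (40.00%)
```

Because the fields look like numbers, awk compares them numerically, so this does not depend on how the value is written (0.5 vs .5).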
Quote:
Quote:
Code:
^[\.0]\.
I do believe I made a mistake in the original regexp, but it works for your data because all the entries seem to start with a leading zero (0.01 vs .01). It can be rewritten as: Code:
^0\. Code:
^[1-9] |
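To illustrate the leading-zero caveat, here is a small sketch that runs the two rewritten patterns over a few made-up values. The labels val1/val2/unmatched are only for this demonstration.

```shell
# Made-up test values: with and without a leading zero, plus two values >= 1.
classified=$(printf '%s\n' 0.01 .01 1.5 10.2 | awk '
    /^0\./   { print $0 " -> val1"; next }      # starts with "0."
    /^[1-9]/ { print $0 " -> val2"; next }      # starts with a digit 1-9
             { print $0 " -> unmatched" }       # neither pattern applies
')

echo "$classified"
# prints:
# 0.01 -> val1
# .01 -> unmatched
# 1.5 -> val2
# 10.2 -> val2
```

So a value written as .01 slips through both patterns. That is harmless if the data always has the leading zero, but worth knowing.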
Quote:
For instance (please correct me if I am wrong): ^[abc] means all values beginning with either a, b, or c. It does not mean all values beginning with abc! I hope this will clear up all my previous doubts as well ;).
Likewise, if I want to search for $1<=0.01; 0.01 < $1 < 0.1; $1 >=0.1 (i.e. 3 ranges), then I can also use such regexp patterns! Will surely try it.
Many thanks druuna... I am short of words! You've done a great job!!
---------------------------
Hi Grail, waiting for your response now (please refer to my reply above). |
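As a side note on the three ranges: regexps can be made to work, but plain numeric comparisons are probably easier to get right. A minimal sketch, with invented sample values and the bucket names low/mid/high chosen just for this example, could look like the following. One thing to watch: awk has no chained comparison, so `0.01 < $1 < 0.1` has to be written with `&&`.

```shell
# Made-up values covering all three ranges, including both boundaries.
ranges=$(printf '%s\n' 0.005 0.01 0.05 0.1 0.5 2 | awk '
    $1 <= 0.01            { low++;  next }   # $1 <= 0.01
    $1 > 0.01 && $1 < 0.1 { mid++;  next }   # 0.01 < $1 < 0.1
    $1 >= 0.1             { high++ }         # $1 >= 0.1
    END { printf "low: %d mid: %d high: %d\n", low, mid, high }
')

echo "$ranges"   # prints: low: 2 mid: 1 high: 3
```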
Quote:
You might want to revisit this site: Regex Tutorial, Examples and Reference, especially: Character Classes or Character Sets. And this from the wiki page: Quote:
|
Quote:
Placing the setting of the variables after the quoted code is just a preference I have for setting multiple variables instead of using -v several times. |
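One detail worth knowing about the two styles: assignments placed after the program text are processed when awk reaches them in the argument list, which happens after BEGIN has run, whereas `-v` assignments are already set inside BEGIN. A small sketch to show the difference (the numbers 6 and 13 are just the counts from the sample output earlier in the thread):

```shell
# Values standing in for counts computed earlier in a script.
val2=6
sum=13

# Trailing assignments are not yet set while BEGIN runs.
in_begin=$(awk 'BEGIN { print (sum == "" ? "unset" : "set") }' val2="$val2" sum="$sum")

# With -v the variables are available inside BEGIN.
with_v=$(awk -v val2="$val2" -v sum="$sum" 'BEGIN { printf "%.4f\n", val2 * 100 / sum }')

# Trailing assignments are visible in END; /dev/null serves as empty input.
in_end=$(awk 'END { printf "%.4f\n", val2 * 100 / sum }' val2="$val2" sum="$sum" /dev/null)

echo "$in_begin $with_v $in_end"   # prints: unset 46.1538 46.1538
```

So the trailing style is fine for main rules and END, but a calculation done purely in BEGIN needs `-v`.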
Quote:
Quote:
|
Thanks ntubski ... I was not aware of this variation :)
|
Many thanks @druuna & @grail!
Ciao! |