LinuxQuestions.org


shivaa 11-22-2012 06:42 AM

Help understanding "awk" code
 
Hello, I have a script containing the following awk code:
Code:


Something...
Something....

grep " RESULT " ${IFILE} \
awk 'BEGIN{sum=0; cat1=0; cat2=0; cat3=0;}
{sum++}
/value=0\.[^0]/{cat2++;}                  ## 4th line
/value=0\.0/{cat3++;}                      ## 5th line
END{cat1=sum-(cat2+cat3); print cat1, cat2, cat3;}'

Something...
Something....

And this script returns output as:
Code:

458 0 0
I can follow most of this awk code, but I don't understand its 4th and 5th lines. I am a beginner in awk, so could you help me understand how it calculates the values of cat1, cat2, and cat3? It may take a little of your time, but I would appreciate a well-explained answer. Thanks a bunch!

Note: The ${IFILE} file mentioned in the code contains values like:
Code:

value=0
value=0.01
value=0.01
value=0 and so on...


druuna 11-22-2012 06:50 AM

Code:

/value=0\.[^0]/{cat2++;}
increment cat2 if the line contains value=0. followed by any character other than 0 (so not value=0.0)
[^0] -> any single character except a zero

Code:

/value=0\.0/{cat3++;}
increment cat3 if value=0.0
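
To make the two patterns concrete, here is a small sketch that labels a few made-up values (they are assumptions for illustration, not lines from the real ${IFILE}). The `classify` function name and the `next` statements are additions for the demo; the original script has neither.

```shell
# Hypothetical sample values, one per line (not taken from the real input file).
classify() {
  printf '%s\n' 'value=0' 'value=0.01' 'value=0.5' 'value=0.05' |
  awk '
    /value=0\.[^0]/ { print $0, "-> cat2"; next }   # "0." followed by a non-zero character
    /value=0\.0/    { print $0, "-> cat3"; next }   # "0." followed by a zero
                    { print $0, "-> neither (ends up in cat1)" }
  '
}
classify
```

A plain value=0 has no "." after the zero, so it matches neither pattern and is only counted in the total.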

BTW: Please stop using large font, there's really no need for that.

acid_kewpie 11-22-2012 06:52 AM

I presume the confusion comes from the lack of formal structures around the conditional operations there.

/value=0\.0/{cat3++;}

means something like

Code:

if ( string matches regex "value=0\.0" )
{
  cat3 = cat3 + 1
}

in pseudocode. As cat2 and cat3 both end up as zero, those regexes never matched.
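
As a runnable version of that pseudocode, the sketch below runs the short pattern-action form and an explicit `if` on the current line (`$0`) over the same input. The three sample lines and the variable names `sample`, `short`, and `long` are assumptions for illustration only.

```shell
# Hypothetical input lines, chosen only to exercise the pattern.
sample='value=0.01
value=0
value=0.02'

# Pattern-action form, as in the original script.
short=$(printf '%s\n' "$sample" | awk '/value=0\.0/ { cat3++ } END { print cat3 + 0 }')

# The same test written out as an explicit if on the current line ($0).
long=$(printf '%s\n' "$sample" | awk '{ if ($0 ~ /value=0\.0/) cat3 = cat3 + 1 } END { print cat3 + 0 }')

echo "$short $long"
```

Both forms should give the same count, since a bare /regex/ in front of a block is shorthand for testing the whole input line against it.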

shivaa 11-22-2012 11:45 AM

Code:

grep " RESULT " ${IFILE} \
awk 'BEGIN{sum=0; cat1=0; cat2=0; cat3=0;}
{sum++}
/value=0\.[^0]/{cat2++;}                  ## 4th line
/value=0\.0/{cat3++;}                      ## 5th line
END{cat1=sum-(cat2+cat3); print cat1, cat2, cat3;}'

So first it initializes sum, cat1, cat2, and cat3 to 0; then, in the 3rd line, it increases the value of sum by one, i.e. sum=1. Then what happens in the 4th line? Is it searching for values that are 0.0 and incrementing cat2 by 1? What is the relation between the pattern, i.e. /value=0\.[^0]/, and the action {cat2++;} in this line? Could you explain a little more?

acid_kewpie 11-22-2012 11:52 AM

I think my explanation covers that just fine.

shivaa 11-22-2012 12:42 PM

Quote:

Originally Posted by acid_kewpie (Post 4834903)
I think my explanation covers that just fine.

Apparently you're right, and I understood what you explained, but my question is: what is the pattern portion of the 4th line doing? Is it searching for all values that contain or begin with 0.0 and, if it finds any such matching value, adding 1 to cat2? Am I correct?

BTW, the problem is that I have inherited an old script created by someone who has left the job, and I have been given the task of writing a new script with the same functionality. That is why I want to fully understand its code.

druuna 11-22-2012 01:53 PM

Would this help?
Code:

awk 'BEGIN{sum=0; cat1=0; cat2=0; cat3=0;}
{sum++}
/value=0\.[^0]/{cat2++;}
/value=0\.0/{cat3++;}

END{cat1=sum-(cat2+cat3); print cat1, cat2, cat3;}'

The first part (the BEGIN block), setting some counters, is done once, when awk starts and before any input is parsed.

The middle part (the three lines between BEGIN and END) is done for every line of input, i.e. every line that contains " RESULT " with a space on each side.
sum++ line -> counts all the lines it gets (total number of lines)
cat2++ line -> counts all the lines that contain value=0.<whatever> as long as it is _not_ value=0.0
cat3++ line -> counts all the lines that contain value=0.0

The last part (the END block), some calculations and then printing the 3 entries, is done when all the lines are parsed.

Quote:

BTW, the problem is that I have inherited an old script created by someone who has left the job, and I have been given the task of writing a new script with the same functionality. That is why I want to fully understand its code.
The snippet of code posted in post #1 is in need of some re-writing.

- What you posted will not work (grep " RESULT " ${IFILE} \ should be grep " RESULT " ${IFILE} | \)
- No need for the grep part, this can be done by awk
- the sum counter isn't needed in this case (the awk NR variable can be used)

That's without knowing what exactly needs to be done with the lines it gets.
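
A rough sketch of what an awk-only version might look like, under the assumption that the goal is simply the same three counts. `count_results` is a hypothetical wrapper name, and the input file is passed as its first argument. One caveat: once the grep is folded into awk, NR counts every line of the file rather than only the " RESULT " lines, so this sketch keeps its own counter; NR equals sum only while grep is still doing the filtering.

```shell
# count_results is a hypothetical wrapper name; pass the input file as $1.
count_results() {
  awk '
    / RESULT / {                          # replaces the separate grep
      sum++                               # only lines containing " RESULT " are counted
      if ($0 ~ /value=0\.[^0]/) cat2++
      if ($0 ~ /value=0\.0/)    cat3++
    }
    # "+ 0" forces a numeric 0 even if a counter was never incremented
    END { print sum - (cat2 + cat3), cat2 + 0, cat3 + 0 }
  ' "$1"
}
```

It would be called as count_results "${IFILE}" and prints cat1, cat2, and cat3 on one line, like the original.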


shivaa 11-22-2012 08:53 PM

Special thanks to @druuna. You've explained it very well. One last question:
Can we divide this code into two parts (for the purpose of understanding), as follows:
Part-1:
Code:

grep " RESULT " ${IFILE} \
awk 'BEGIN{sum=0; cat1=0; cat2=0; cat3=0;}
{sum++}
/value=0\.[^0]/{cat2++;}                  ## 4th line
/value=0\.0/{cat3++;}                      ## 5th line

Part-2:
Code:

END{cat1=sum-(cat2+cat3); print cat1, cat2, cat3;}'
Then can we say the following?
(1) In the first part, it searches for the specified patterns and counts the matches. After finishing, it simply prints those values in the second part of the code?
(2) Is it looping between the BEGIN and END of the code? Or is it calculating all specified patterns in one shot, e.g. in the 4th line it searches for all values that are equal to 0.0<whatever>, and then goes to the next line?
(3) After calculating values, e.g. in the 4th line, does it store that total count in cat2? Or is it like a loop in which it searches for a pattern, finds the pattern, and adds 1 to the count?

acid_kewpie 11-23-2012 01:38 AM

The BEGIN{} block runs before any of the input data is processed, and the END{} block runs afterwards. Everything else (the main {} block and the pattern lines) is executed on a per-input-line basis. So after each input line is read, the 4th line is evaluated in full, but of course the value of cat2 persists across these executions, increasing each time the pattern matches.
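
A small trace may make the order visible. The three input values and the `trace` variable are made up for this sketch; only the cat2 pattern is taken from the original script.

```shell
# Hypothetical input; each rule prints a message when it runs.
trace=$(printf '%s\n' 'value=0.5' 'value=0' 'value=0.7' | awk '
  BEGIN           { print "BEGIN: runs once, before any input" }
                  { print "line " NR ": " $0 }
  /value=0\.[^0]/ { cat2++; print "  matched, cat2 is now " cat2 }
  END             { print "END: runs once, cat2 = " cat2 }
')
printf '%s\n' "$trace"
```

The rule without a pattern fires for all three lines, the cat2 rule fires only for the first and third, and cat2 keeps its value from one line to the next until END prints it.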

linosaurusroot 11-23-2012 05:43 AM

Code:

BEGIN{sum=0; cat1=0; cat2=0; cat3=0;}
A BEGIN block like that is unnecessary as variables default to 0 and can be used without initialisation.

shivaa 12-01-2012 06:25 AM

Many thanks, everyone. My question has been answered.
Special thanks to @druuna for always being helpful :)

David the H. 12-02-2012 02:40 PM

Quote:

Originally Posted by linosaurusroot (Post 4835390)
Code:

BEGIN{sum=0; cat1=0; cat2=0; cat3=0;}
A BEGIN block like that is unnecessary as variables default to 0 and can be used without initialisation.

While this is mostly correct, in truth uninitialized variables are only treated as having a value of zero in numeric operations. In string contexts, such as printing, they are still an empty (null) string. This means that there are times when initializing them first is required.

I had a case just the other day where I needed it to print an actual "0" if the variable never incremented, which it won't do unless explicitly set first. I think the OP code here may come up against the same issue.
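
A minimal sketch of the difference, using a throwaway variable `n` and empty input from /dev/null (both chosen just for the demo). The brackets are only there to make an empty result visible.

```shell
# Never assigned: prints as an empty string.
unset_out=$(awk 'END { print "[" n "]" }' </dev/null)

# Explicitly initialised in BEGIN: prints as 0.
init_out=$(awk 'BEGIN { n = 0 } END { print "[" n "]" }' </dev/null)

# Not initialised, but forced into a numeric context when printed.
forced_out=$(awk 'END { print "[" (n + 0) "]" }' </dev/null)

echo "$unset_out $init_out $forced_out"
```

Applied to the script in post #1: cat1 is the result of arithmetic, so it prints as a number either way, but without the BEGIN block cat2 and cat3 would print as empty strings whenever their patterns never match.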

Speaking of which, the above section can be shortened a bit:

Code:

BEGIN{ sum=cat1=cat2=cat3=0 }


All times are GMT -5.