I have a file, file1, containing 50000 entries (floating-point numbers only).
I am computing:
1. The total number of lines
2. The number of lines containing values less than 1 (val1)
3. The number of lines containing values greater than 1 (val2)
4. The percentage of the total for both val1 and val2.
Here is what I did (note: all the code below is part of a script that generates file1):
It's fine up to this point. But I want to combine the val1 and prctg1 commands into a single awk one-liner. I tried, but I am probably making a syntax mistake and I have no clue what it is. Any suggestions on how I can combine them?
BTW, LQ has always been very helpful to me. In fact, I am still in the learning phase with awk, so I have applied what I've learned so far. I am hoping for your help once again.
Last edited by shivaa; 12-08-2012 at 01:24 PM.
Reason: Error rectified
As druuna has ably answered the important question, let me just add a correction for some wasted code:
Code:
# below is a useless use of echo
prctg2=$(echo | awk "{print $val2*100/$sum}")
prctg2=$(awk 'BEGIN{print $val2*100/$sum}' val2=$val2 sum=$sum)
# the above negates the problem of letting the shell interfere with any of the data
Hi Grail, as you said above, after invoking the two commands below:
It's giving me errors such as awk: division by zero or nawk: illegal field $(). I tried plain /usr/bin/awk as well as /usr/xpg4/bin/awk. Also, could you explain the use of val2=$val2 sum=$sum after the print action?
1. Can I make the following changes instead of using patterns (assuming that infile contains only floating-point numbers)?
Code:
$1 < 1 { val1++ } # less than 1
$1 > 1 { val2++ } # larger than 1
Have you tried? You do need to make one of the comparisons >= or <=, otherwise a value of exactly 1.0000 won't be counted.
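As a minimal sketch of the point above (not the script from the original post), the counts and both percentages can be produced in a single awk pass using numeric comparisons, with `>=` on one side so that a value of exactly 1.0000 is counted. The sample file and its five values are made up for illustration; `val1` and `val2` follow the names used in the thread.

```shell
# Hypothetical sample data standing in for file1 (one number per line).
sample=$(mktemp)
printf '%s\n' 0.25 0.50 1.0000 1.75 3.20 > "$sample"

# One pass over the file: NR is the total line count at END.
result=$(awk '
    $1 <  1 { val1++ }    # strictly less than 1
    $1 >= 1 { val2++ }    # 1.0000 lands here instead of being dropped
    END {
        if (NR > 0)       # guard against division by zero on an empty file
            printf "%d %d %d %.2f %.2f\n", NR, val1, val2, val1*100/NR, val2*100/NR
    }
' "$sample")

rm -f "$sample"
echo "$result"    # total, val1, val2, percentage of val1, percentage of val2
```

With the assumed sample data, this prints the total, the two counts, and the two percentages on one line.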
Quote:
Originally Posted by shivaa
2. (Please don't mind my asking.) Does /^[\.0]\./ mean all values starting with .0? And what does \./ mean here: all values that are .0.?
Code:
^[\.0]\.
Values that start with a dot OR a 0 (zero) followed by a dot.
I do believe I made a mistake in the original regexp, but it works for your data because all the entries seem to start with a leading zero (0.01 vs .01). It can be rewritten as:
^[\.0]\. Values that start with a dot OR a 0 (zero) followed by a dot.
Oops... From the beginning I had been reading such patterns as .0, when they actually mean values beginning with either a "." or a "0", not with .0.
For instance (please correct me if I am wrong): ^[abc] means all values beginning with either a, b, or c. It does not mean all values beginning with abc! I hope this clears up all my previous doubts as well.
Likewise, if I want to match $1<=0.01; 0.01 < $1 < 0.1; $1 >=0.1 (i.e. 3 ranges), then I can also use patterns with such regexps! Will surely try it.
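For the three ranges mentioned here, numeric comparisons are probably simpler than regexps. The following is only a sketch under assumed data: the five input values and the counter names `low`, `mid`, and `high` are hypothetical. A chained form such as `0.01 < $1 < 0.1` does not express a range test in awk, so the middle range is written with `&&`.

```shell
# Five made-up values, two of them sitting exactly on the boundaries.
ranges=$(printf '%s\n' 0.005 0.01 0.05 0.1 0.7 | awk '
    $1 <= 0.01            { low++ }   # first range, boundary included
    $1 > 0.01 && $1 < 0.1 { mid++ }   # open interval between the boundaries
    $1 >= 0.1             { high++ }  # third range, boundary included
    END { printf "%d %d %d\n", low, mid, high }
')

echo "$ranges"    # counts for the low, middle, and high ranges
```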
Many thanks druuna... I am short of words! You've done a great job!!
---------------------------
Hi Grail, I am waiting for your response now (please refer to my reply above).
For instance (please correct me if I am wrong): ^[abc] means all values beginning with either a, b, or c. It does not mean all values beginning with abc! I hope this clears up all my previous doubts as well.
[ ]
A bracket expression. Matches a single character that is contained within the brackets. For example, [abc] matches "a", "b", or "c". [a-z] specifies a range which matches any lowercase letter from "a" to "z". These forms can be mixed: [abcx-z] matches "a", "b", "c", "x", "y", or "z", as does [a-cx-z].
The - character is treated as a literal character if it is the last or the first (after the ^) character within the brackets: [abc-], [-abc]. Note that backslash escapes are not allowed. The ] character can be included in a bracket expression if it is the first (after the ^) character: []abc].
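A small sketch of the difference between a bracket expression and a literal string, using awk and made-up input words. The variable names `single`, `literal`, and `dotzero` are hypothetical. The last pattern writes the dot inside brackets without a backslash, in line with the note above about escapes.

```shell
words='apple banana cherry abc dog'

# ^[abc] matches lines whose FIRST character is a, b, or c.
single=$(printf '%s\n' $words | awk '/^[abc]/ { n++ } END { print n+0 }')

# ^abc matches only lines that start with the three-character string "abc".
literal=$(printf '%s\n' $words | awk '/^abc/ { n++ } END { print n+0 }')

# ^[.0][.] needs a "." or "0" as the first character AND a "." as the second.
dotzero=$(printf '%s\n' 0.01 .5 1.2 | awk '/^[.0][.]/ { n++ } END { print n+0 }')

echo "$single $literal $dotzero"
```

With this input, four words match `^[abc]`, only "abc" matches `^abc`, and of the three numbers only 0.01 matches `^[.0][.]`, since .5 has no second dot.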
It's giving me errors such as awk: division by zero or nawk: illegal field $(). I tried plain /usr/bin/awk as well as /usr/xpg4/bin/awk. Also, could you explain the use of val2=$val2 sum=$sum after the print action?
I cannot vouch for nawk. I am using gawk, so maybe nawk does not like variables being set after the program. You could simply try using the -v option to set them.
Placing the setting of the variables after the quoted code is just a preference I have for setting multiple variables instead of using -v several times.
I cannot vouch for nawk. I am using gawk, so maybe nawk does not like variables being set after the program. You could simply try using the -v option to set them.
You'll need to use -v for gawk as well; the plain var=val form performs the assignment only after the BEGIN rule has run:
When the assignment is preceded with the -v option ... the variable is set at the very beginning, even before the BEGIN rules execute. ... Otherwise, the variable assignment is performed ... after the processing of the preceding input file argument.
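The behaviour quoted above can be sketched as follows. The values 30 and 120 are made up for illustration, and `late` is a hypothetical name. Inside the single-quoted program, awk variables are written without a leading `$`, because `$val2` would be read by awk as a field reference rather than as the variable.

```shell
val2=30     # assumed count, for illustration only
sum=120     # assumed total, for illustration only

# -v assigns before BEGIN runs, so both values are visible in the BEGIN rule.
prctg2=$(awk -v val2="$val2" -v sum="$sum" 'BEGIN { print val2 * 100 / sum }')

# A trailing var=val operand is not yet assigned when BEGIN runs,
# so x is still empty here.
late=$(awk 'BEGIN { print "[" x "]" }' x=5)

echo "$prctg2 $late"
```

This would also explain the "division by zero" error reported earlier: with trailing assignments, sum is still unset when the BEGIN rule does the division.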