Form #1 works fine on my MEPIS 3.3.2 GNU/Linux box running bash, although I would have used the form:
sort -uk 2,2 goodtest
Are you using a different ver. of sort? Mine is:
$ sort --version
sort (coreutils) 5.2.1
Written by Mike Haertel and Paul Eggert.
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Sorry, I pretty much use only bash & definitely only GNU/Linux, so I have no idea if you're having a korn or hp-ux problem.
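To make the `sort -uk 2,2` form above concrete, here is a minimal sketch. The contents of the file are invented for illustration (the thread never shows `goodtest`); only the value 06BD is taken from the discussion. It assumes a sort that accepts `-k` and `-u`, as GNU coreutils does.

```shell
# Hypothetical sample data; the real goodtest file is not shown in the thread.
tmpdir=$(mktemp -d)
cat > "$tmpdir/goodtest" <<'EOF'
host1 06BD alpha
host2 06BE beta
host3 06BD gamma
host4 06BF delta
host5 06BE epsilon
EOF

# -k 2,2 : the sort key starts and ends at field 2
# -u     : keep only one line for each distinct key
# LC_ALL=C makes the ordering byte-wise and locale-independent.
unique_by_sort=$(LC_ALL=C sort -uk 2,2 "$tmpdir/goodtest")
printf '%s\n' "$unique_by_sort"

rm -r "$tmpdir"
```

The five input lines collapse to three, one each for 06BD, 06BE and 06BF. Unlike the awk approach, the result is ordered by the key rather than by position in the input.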
FWIW, I think this is what's happening:
x is an array.
Its indices are values of $2.
Every time it "sees" a $2, the '++' increments the value associated w/ that index, creating a new array element if necessary.
This works because awk arrays can have non-numeric indexing, like a hash.
This may seem backward -- the indices are strings & the array elements are numbers.
The part that I'm not sure about is why the logical negation, '!', makes it work -- w/ it in, only the 1st instance of a value for $2 prints; remove it, & everything after the 1st instance prints. I suspect that the '!' is operating on the logical value of the array element "x[$2]", which, before it is created by the '++', is false. See: http://www.gnu.org/software/gawk/man...l#Truth-Values. Note: only the trailing '++', "post-increment", will work; a leading '++' increments before the test, so the element is never false.
I believe he is also relying on an implicit "print $0" when no other action is specified.
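The behaviour described above can be checked with a small sketch. The one-liner under discussion is not quoted in the thread, so `awk '!x[$2]++'` is an assumption based on the description; the sample lines are likewise invented, with only 06BD taken from the thread.

```shell
# Hypothetical sample data; only the value 06BD appears in the thread.
sample='host1 06BD alpha
host2 06BE beta
host3 06BD gamma
host4 06BF delta
host5 06BE epsilon'

# Pattern with no action: awk prints $0 whenever the pattern is true.
# x[$2]++ yields the old count (0 on first sight), so !x[$2]++ is true
# only for the first line carrying each $2.
first_seen=$(printf '%s\n' "$sample" | awk '!x[$2]++')

# Without the "!": true only when the old count is non-zero,
# i.e. for every line after the first with that $2.
repeats=$(printf '%s\n' "$sample" | awk 'x[$2]++')

# Pre-increment: the count is already 1 or more when tested, so nothing prints.
pre_inc=$(printf '%s\n' "$sample" | awk '!++x[$2]')

printf 'first occurrences:\n%s\n' "$first_seen"
printf 'repeats:\n%s\n' "$repeats"
printf 'pre-increment output: [%s]\n' "$pre_inc"
```

With this input the first form prints the host1, host2 and host4 lines, the second prints host3 and host5, and the pre-increment form prints nothing.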
Thank you very much for taking the time to explain this. My programming skills are rather weak. So "!x[column value] then print $0" means that the first time around x is nothing, so when x is compared against the value in [column value] it will be true, since x is not equal to 06BD, and hence the line goes to stdout. The second time around, did x get assigned the value from [column value], so that when x is tested against [column value] it is equal to 06BD and therefore the line is not printed to stdout?
x[$2] is an element of that array -- one such element is created for each unique $2.
In addition to being a field value in your input, each $2 is an index of the array.
The value of the array element x[$2] is the number of lines that contain that particular $2. This results from x[$2] being incremented each time the script reads a line that contains that particular $2.
The first time the script reads a line containing some new $2, x[$2] tests false because it is empty, as yet undefined. (It is then incremented to '1', after it is tested). The '!' negates the false to true, & the std. awk default action, print $0, is performed. (Print $0 means print the whole line).
The values of the array x are not the values of $2, but the number of occurrences of those values. Even though they are strings, the $2's are indices (names) of the elements of x.
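A trace can make the point above easier to see: the indices of x are the $2 strings, and the values are running counts. The sketch below uses invented sample lines (only 06BD comes from the thread) and hypothetical variable names `before` and `printed` to show what the test sees on each line, then dumps the final contents of the array.

```shell
# Hypothetical sample data; only the value 06BD appears in the thread.
sample='host1 06BD alpha
host2 06BE beta
host3 06BD gamma
host4 06BF delta
host5 06BE epsilon'

# Per-line trace: the count before the test, the count after, and the decision.
trace=$(printf '%s\n' "$sample" | awk '{
    before  = x[$2] + 0      # empty (unset) element counts as 0
    printed = !x[$2]++       # same test as the one-liner; increments afterwards
    print $2, "before=" before, "after=" x[$2], (printed ? "print" : "skip")
}')

# Final array contents: index (a string) and value (a count).
counts=$(printf '%s\n' "$sample" | awk '
    { x[$2]++ }
    END { for (k in x) print k, x[k] }
' | LC_ALL=C sort)

printf 'trace:\n%s\n' "$trace"
printf 'counts:\n%s\n' "$counts"
```

The `for (k in x)` loop visits indices in an unspecified order, which is why the counts are piped through sort before being displayed.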
archtoad6, I think I need to do more reading on associative arrays and awk; I am still struggling to understand. But I was playing around with sort and found this command to work, although I am still not clear what the "2.2b" option does. I know that the first 2 means second field but have no clue about 2b. Here is what worked for me:
Would mean "Sort <filename> on the 2nd field only, starting w/ the 2nd character (origin 1), ignoring blanks; & show only unique lines". To see if it might be different for you, I suggest you: a) post the ver. of your sort & b) check your man page -- it may be different from mine.
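The command being described is not preserved in the thread, so the following is only a hypothetical invocation that matches the description: key limited to field 2, starting at its 2nd character, leading blanks ignored, unique lines only. The sample data is invented, and the behaviour shown is that of GNU sort; other implementations may differ, as noted above.

```shell
# Hypothetical sample data, chosen so the 1st and 2nd characters of field 2 sort differently.
sample='a 1B
b 2A
c 1A'

# Key = field 2 only (",2"), starting at its 2nd character (".2"),
# counted after skipping leading blanks ("b").
by_second_char=$(printf '%s\n' "$sample" | LC_ALL=C sort -k 2.2b,2)

# Same key, plus -u: keep one line per distinct key.
unique_second_char=$(printf '%s\n' "$sample" | LC_ALL=C sort -u -k 2.2b,2)

printf 'sorted:\n%s\n' "$by_second_char"
printf 'unique:\n%s\n' "$unique_second_char"
```

Here the keys are B, A and A, so the two A lines sort first (the tie is broken by comparing the whole lines) and `-u` then collapses them to a single line.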