LinuxQuestions.org

AJAY E 05-25-2012 02:45 PM

Could someone please explain the concept of associative arrays in AWK programming?
 
Please explain the concept of associative arrays in awk programming with a few examples. I looked for good links on this topic on the internet and in this forum, but did not find anything that gave me an in-depth understanding. If you know of any pointers on this topic, please share them.

Ser Olmy 05-25-2012 03:19 PM

It's really quite simple. Ordinarily, an array has a numeric index:
Code:

fruit[1] = "banana"
fruit[2] = "apple"
fruit[3] = "orange"

An associative array can use a string as a sort of index, making the array behave a bit like a key-value store:
Code:

colour["banana"] = "yellow"
colour["apple"] = "green"
colour["orange"] = "orange"
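To see those assignments in action, here is a minimal sketch that can be pasted into a shell. The "grape" lookup and the `result` variable are only there for illustration; they are not part of the example above.

```shell
# Build the colour array inside a BEGIN block (no input file needed),
# then look values up by their string keys.
result=$(awk 'BEGIN {
    colour["banana"] = "yellow"
    colour["apple"]  = "green"
    colour["orange"] = "orange"

    # Fetch a value by its string key.
    print "banana is " colour["banana"]

    # Test whether a key exists with "in"; this does not create the element.
    if ("grape" in colour)
        print "grape is " colour["grape"]
    else
        print "no entry for grape"
}')
echo "$result"
```

This should print "banana is yellow" followed by "no entry for grape".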

Here's the chapter on arrays from the O'Reilly sed & awk book.

David the H. 05-26-2012 08:30 AM

And here's the array section of the gawk user's guide.

http://www.gnu.org/software/gawk/man...de/Arrays.html

It should be noted that in awk, all arrays are associative. Even numeric subscripts like 1, 2, 3 are converted to text strings and stored as such, not as numbers.
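A quick way to convince yourself of this is a small sketch like the following, which stores an element under the number 1 and then finds it again under the string "1". The `result` variable is just a convenience for capturing the output.

```shell
result=$(awk 'BEGIN {
    fruit[1] = "banana"        # the numeric subscript 1 is stored as the string "1"

    if ("1" in fruit)          # so the string key "1" refers to the same element
        print "same element: " fruit["1"]

    n = 0
    for (k in fruit) n++       # count the keys that actually exist
    print "elements: " n
}')
echo "$result"
```

It should report "same element: banana" and "elements: 1", since both subscripts name a single entry.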

AJAY E 05-27-2012 07:55 AM

Thank you very much for your reply.
 
Could you please explain the behavior of the awk code below?

awk '{ vec[$1]+=1 }
END { for (i in vec)
{print i vec[i] } }'

My understanding: vec[$1] points to the first field in the first line and stores it. Then vec[$1]+=1 increases the value by 1, which means it proceeds to the next line and captures the first field in the second line. Is that correct?

grail 05-27-2012 08:40 AM

I would have to say the easiest process for learning something like this code would be to run it on a small file of known data and see what happens.

David the H. 05-27-2012 08:56 AM

Just understand that the "$1" will be replaced by the contents of the first field, the result of which will be used as the index string.

So think about it; what happens if the first field on two lines is the same, and what happens if they are different?
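For a concrete run, here is a sketch on a few made-up input lines (the fruit names and the second column are invented sample data). It uses the same counting logic as the code in question, with a comma added in the print so the key and its count are separated by a space. The order in which `for (i in vec)` visits keys is not defined, so the output is piped through sort.

```shell
result=$(printf 'apple x\nbanana y\napple z\ncherry w\nbanana v\napple u\n' |
awk '{ vec[$1] += 1 }                        # one counter per distinct first field
     END { for (i in vec) print i, vec[i] }' | sort)
echo "$result"
```

This prints:

```
apple 3
banana 2
cherry 1
```

Each line with the same first field bumps the same counter, and a first field seen for the first time starts a new counter at 1. So the array ends up holding how many times each distinct first field occurred; the +=1 has nothing to do with moving to the next line.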

chrism01 05-27-2012 07:01 PM

@OP: you may like to know that some languages, e.g. Perl, use the term 'hash' (as in hash table, not password hashing) for the same technique. :)


All times are GMT -5.