Sum up values from each column (awk)
I have different files with a variable number of columns.
What I am trying to do is output the values of each column:
- a single value if it is the same throughout the column,
- or the distinct values separated by "," if the column contains different ones.
example 1:
input:
1|2|3|4
1|1|3|4
1|2|3|3
output:
1|1,2|3|3,4
example 2:
input:
1|2
1|3
7|2
output:
1,7|2,3
(the order of the numbers separated by "," doesn't matter)
Thanks for your help !
|
You can define an associative array for every column: col1, col2, col3, col4. Each line is automatically split into $1, $2, $3 ... (-F\| is used to set the separator). col1[$1] = 1; col2[$2] = 2 ... will define the elements of the arrays. The assigned value does not matter: the field value is used as the index, so each distinct value is stored only once per column. Finally (using END) you need to print the defined elements of those arrays.
_____________________________________ If someone helps you, or you approve of what's posted, click the "Add to Reputation" button, on the left of the post. Happy with solution ... mark as SOLVED (located in the "thread tools") |
Hi Pan 64, thanks for your help !
I am not sure I understand the term "associative array". Is it the same as an array that you define when using the split function? If we look at the first input I wrote:
Code:
1|2|3
Code:
col1[$1]|col2[$1]|col3[$1]
|
see here, you will find some useful tips
|
OK, associative arrays are still not very clear to me, but maybe something that would look like this:
Code:
BEGIN{FS=OFS="|"} |
that's fine, you are almost ready.
Code:
function printarray (a)
|
An associative array is often referred to as a hash table, although strictly speaking one is an implementation of the other
https://en.wikipedia.org/wiki/Associative_array
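On the earlier question about split: in awk every array is associative, and split simply fills one with the keys 1..n. A small sketch to show both side by side (demo_arrays, parts and count are names invented for this example):

```sh
# demo_arrays: hypothetical name, only for illustration.
demo_arrays() {
  awk 'BEGIN {
    # split() fills the array with the keys 1..n
    n = split("1|2|3", parts, "|")
    print n, parts[2]

    # the same kind of array, but indexed by arbitrary strings
    count["apple"]++; count["pear"]++; count["apple"]++
    print count["apple"], ("pear" in count), ("plum" in count)
  }'
}

demo_arrays
```

The first line printed should be 3 2, the second 2 1 0: "apple" was counted twice, "pear" is a key of the array and "plum" is not. So an index can be a field value like $1 just as well as a number.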
|
- What is the point of "function printarray (a)"?
- Why don't we mention fields?
- Does "length" refer to the length of array a, in other words the number of lines in the column?
|
This function will print the list of the values stored in the array passed to it (separated by commas). We do not need fields, records and other funny things there, just a for loop. length is a built-in function of awk; it returns the length of a string or the number of elements in an array (see the man page). Since each distinct value is stored only once, that is the number of different values in the column, not the number of lines.
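The body of the function is not shown in this thread, so the following is only a guess at what such a function could look like, not the original one. It assumes exactly four columns, as in example 1, and the wrapper name print_columns is made up. It loops with for (k in a) rather than using length, since length on an array may not be available in every awk.

```sh
# print_columns: hypothetical wrapper name; printarray is a reconstruction, not the original.
print_columns() {
  awk '
    # print the indices of array a separated by commas
    # (the order of a for-in loop is unspecified in awk)
    function printarray(a,    k, sep) {
      sep = ""
      for (k in a) {
        printf "%s%s", sep, k
        sep = ","
      }
    }
    BEGIN { FS = OFS = "|" }
    # use each field value as an index, so duplicates collapse
    { col1[$1] = 1; col2[$2] = 1; col3[$3] = 1; col4[$4] = 1 }
    END {
      printarray(col1); printf "%s", OFS
      printarray(col2); printf "%s", OFS
      printarray(col3); printf "%s", OFS
      printarray(col4); printf "%s", ORS
    }
  ' "$@"
}

printf '1|2|3|4\n1|1|3|4\n1|2|3|3\n' | print_columns
```

The values within one column may come out in any order (for instance 1,2 or 2,1), which the question allows.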
|
I don't really see where I am going with that.
I need to learn a bit more I guess. Thanks anyway ! |
the function is used to print col1, col2, col3 and col4, the four arrays where you collected your data
|