Programming
This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
In $InFile, each person has 1 or more entries per month. I want to sum each column starting from column 3, so that each person has only one entry per month.
If you were to search this site/forum, you're bound to find your answer already hashed out and gone over many times; I've seen this question before.
BIG HINT:
Just use the title of your post as the search pattern in this forum's search.
It's probably not optimized but here is one of multiple solutions:
Code:
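awk '
BEGIN {
    FS=OFS=";"
}
{
    # A new date/name pair starts a new group: print the finished one first.
    # Assumes input.txt is already sorted by columns 1 and 2.
    if(NR>1 && ($1 != date || $2 != name))
    {
        printf "%s;%s",date,name;
        for(i=3;i<=n;i++){printf ";%d",sum[i];sum[i]=0}
        printf "\n"; n=0;
    }
    # Remember the current group and its widest row, then add this row in.
    date=$1; name=$2; if(NF>n){n=NF}
    for(i=3;i<=NF;i++){sum[i]+=$i;}
}
END {
    # Print the last group, unless the input was empty.
    if(NR>0)
    {
        printf "%s;%s",date,name; for(i=3;i<=n;i++){printf ";%d",sum[i]}
        printf "\n";
    }
}
' input.txt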
EDIT: yes, as BW-userx said, there are already a lot of LQ threads about awk/sum and so on, so you already have all the material and logic you need in the forum history.
That's using awk as if it were a traditional language. I am quite taken by associative arrays, having come to them late in my programming experience. I would prefer something like this.
Code:
{for(i=3;i<=NF;i++){sum[$1";"$2][i]+=$i}}
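No need to test whether the date or name has changed, and the input does not have to be sorted. Note that the sum[key][i] form is an array of arrays, which needs GNU awk 4.0 or later.
In the END block the key can be printed to get the first 2 fields ...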
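To make the idea above concrete, here is a minimal sketch of how the accumulation rule and an END block might fit together. It is only one possible reading of the suggestion: the function name sum_by_person, the maxnf variable and the sorted key order are my own assumptions, and it relies on GNU awk 4.0 or later for both the array of arrays and PROCINFO["sorted_in"].

```shell
# Sketch only: needs GNU awk 4.0+ for the sum[key][i] arrays-of-arrays syntax.
# sum_by_person is a hypothetical helper name, not something from the thread.
# Usage: sum_by_person "$InFile" > "$OutFile"
sum_by_person() {
    gawk -F ';' '
    {
        key = $1 FS $2                  # e.g. "201810;Edwin"
        if (NF > maxnf) maxnf = NF      # widest row seen so far
        for (i = 3; i <= NF; i++) sum[key][i] += $i
    }
    END {
        # Walk the keys in ascending string order (gawk extension).
        PROCINFO["sorted_in"] = "@ind_str_asc"
        for (key in sum) {
            line = key                  # the key already holds fields 1 and 2
            for (i = 3; i <= maxnf; i++) line = line FS (sum[key][i] + 0)
            print line
        }
    }' "$@"
}
```

Rows that are shorter than the widest row get 0 in the missing columns here; whether that is wanted depends on the data.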
awk -F ";" \
'{RowDes=$01 FS $02;
# Build an array of unique Row Designators.
# A Row Designator is of the form yyyymm;PersonalName
# For example: 201810;Edwin
RowDesList[RowDes]=RowDes;
# Remember the field count of the longest InFile line.
if (LongestLine<NF) LongestLine=NF
# Accumulate the Sums for each distinct row designator.
# Sum is a two-dimensional array.
for (Month=3;Month<=NF;Month++) Sum[RowDes,Month]+=$Month}
# At this point all Sums have been calculated.
END{nRows=asort(RowDesList);
for (j=1;j<=nRows;j++)
# Each Output Line starts with a Row Designator.
{RowDes=OL=RowDesList[j];
# Build the Output Line by appending the Sums one month at a time.
for (Month=3;Month<=LongestLine;Month++)
OL=OL FS Sum[RowDes,Month]
# Get rid of any trailing semicolons.
gsub(/;*$/,"",OL)
# Print the Output Line.
print OL}}' "$InFile" >"$OutFile"
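The script above calls asort(), which is a GNU awk extension. If only a plain POSIX awk (mawk, BusyBox awk, ...) is at hand, a rough equivalent might look like the sketch below. It keeps the same Sum[RowDes,Month] idea but hands the ordering to sort(1). The function name sum_rows and the variable names are hypothetical, and unlike the script above it prints 0 for missing columns rather than trimming trailing semicolons.

```shell
# Sketch only: POSIX awk plus sort(1), no gawk extensions.
# sum_rows is a hypothetical helper name, not something from the thread.
sum_rows() {
    awk -F ';' '
    {
        key = $1 FS $2                  # row designator, e.g. "201810;Edwin"
        keys[key] = 1                   # remember each distinct designator
        if (NF > maxnf) maxnf = NF      # widest row seen so far
        for (i = 3; i <= NF; i++) sum[key, i] += $i
    }
    END {
        # "for (key in keys)" visits keys in no particular order,
        # so the rows are ordered afterwards by sort.
        for (key in keys) {
            line = key
            for (i = 3; i <= maxnf; i++) line = line FS (sum[key, i] + 0)
            print line
        }
    }' "$@" | LC_ALL=C sort -t ';' -k1,1 -k2,2
}

# Usage: sum_rows "$InFile" > "$OutFile"
```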
BW-userx: Thanks. I searched both this forum and others for days, without finding any good examples to use. But thanks for the tip!
l0f4r0: Thanks. I tried your code, but I found that it made wrong calculations, and I couldn't figure out what it actually calculated either.
syg00: I guess your code could work if you want to add the total of columns 3++, but in my example I need the values based on columns 1 and 2 also.
danielbmartin! You are awesome! Thank you so much! Your code is exactly what I needed!! And the explanations are great. They will help me understand better what is going on.
Just wanted to show how awesome Miller is. I'm deliberately using Miller 6 options here: Miller has been rewritten in Go, so trying out the latest release is a piece of cake.
Code:
mlr -cNM --fs \; --ragged stats1 -a sum --grfx '^[12]$'
Another thought: decades ago, a programmer named Larry Wall looked at awk and wasn't satisfied with what he saw, so he designed a new language and called it Perl. It was and still is very good at complicated manipulations of really big files. Its regular-expression dialect became a de facto standard that other tools imitate (the PCRE library, for example). It also has an enormous contributed library, CPAN, whose modules typically run their own test suites on your machine when you install them. Yet its syntax and shortcuts will feel familiar to awk users.
So, while awk is a very fine tool, there are many tools in your toolbox today. Get to know several of them; Perl is only one.