Assuming you can use the code with the asorti function, as per post #2:
Code:
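BEGIN { FS = ","; getline }    # getline consumes the first input line, so the main rule never sees it
{
    balance[$1] += $2          # sum column 2 for each key in column 1
}
END {
    n = asorti(balance, indices)    # gawk only: sorted keys end up in indices[1..n]
    for (i = 1; i <= n; i++)
        printf "%s, %5.2f\n", indices[i], balance[indices[i]]
}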
you can change it to the following:
Code:
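BEGIN { FS = ","; getline }    # getline consumes the first input line, so the main rule never sees it
{
    balance[$1] = ( balance[$1] "," $2 )    # append column 2 to the string stored for this key
}
END {
    n = asorti(balance, indices)    # gawk only: sorted keys end up in indices[1..n]
    for (i = 1; i <= n; i++)
        printf "%s%s\n", indices[i], balance[indices[i]]
}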
Here $2 is not summed; the values are concatenated into a string, with a comma as separator. The printf format has to change accordingly, since you are now printing a string and not a floating-point number. The same modification can be applied to the other versions of the code. Hope this helps.
For the original input file it works perfectly.
However, my real file is slightly different... I thought it would be easy to adapt the code, but it turned out to be much more difficult.
Here is a sample:
Code:
# text text text
12/7/10 00:00,gg2a,15791,3372,4018,5,
12/7/10 00:00,gg2b,4961,92,31190,4,
# text2 text2 text2
12/7/10 00:00,gg2a,1.8840170106E10,3.043735864E9,1.5796434242E10,1.7081492E7,
12/7/10 00:00,gg2b,8.6964647131E10,1.1799862993E10,7.5164784138E10,7.1079514E7,
Yes, things are a bit more complicated here. First a question: based on what criteria do you merge lines? How many variants of the "gg" field may occur?
Here is a working example based on the input data in post #17:
Code:
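BEGIN { FS = "," }
# Skip comment lines (those starting with "#"); process all the others.
! /^#/ {
    key = $1 "," $2    # merge on the date plus the "gg" field
    balance[key] = ( balance[key] "," $3 "," $4 "," $5 "," $6 )
}
END {
    n = asorti(balance, indices)    # gawk only: sorted keys end up in indices[1..n]
    for (i = 1; i <= n; i++)
        printf "%s%s,\n", indices[i], balance[indices[i]]
}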
First note the (negated) regular expression before the main rule: ! /^#/. This excludes every line that begins with a hash (in other words, the rule is applied to every line that does not begin with a hash). The rest should be clear, as you already tried something similar.
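To check what the rule produces, here is a minimal sketch that runs the same merging logic on the sample from post #17. The function name merge_lines is made up for this example, and the output is piped through sort instead of calling asorti, only so that it also runs under awks other than gawk. That swap is an assumption for portability, not a change to the approach.

```shell
#!/bin/sh
# merge_lines: hypothetical helper, reads comma-separated lines on stdin.
merge_lines() {
    awk '
    BEGIN { FS = "," }
    # Lines starting with "#" are skipped; all others are merged by key.
    ! /^#/ {
        key = $1 "," $2
        balance[key] = balance[key] "," $3 "," $4 "," $5 "," $6
    }
    END {
        # "for (key in balance)" visits keys in no fixed order,
        # so the ordering is left to sort below.
        for (key in balance)
            printf "%s%s,\n", key, balance[key]
    }' | LC_ALL=C sort
}

# Sample data copied from post #17.
merge_lines <<'EOF'
# text text text
12/7/10 00:00,gg2a,15791,3372,4018,5,
12/7/10 00:00,gg2b,4961,92,31190,4,
# text2 text2 text2
12/7/10 00:00,gg2a,1.8840170106E10,3.043735864E9,1.5796434242E10,1.7081492E7,
12/7/10 00:00,gg2b,8.6964647131E10,1.1799862993E10,7.5164784138E10,7.1079514E7,
EOF
```

With that input you should get one line per date/"gg" pair: the gg2a line carries 15791,3372,4018,5 followed by the four values from the second block, and likewise for gg2b.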
I have some doubts about the sorting, though. If you want to sort by date (for example from the oldest to the most recent), you need a date format that sorts naturally in alphanumeric order. For example:
Code:
10/07/12 00:00
10/08/13 04:00
This way asorti, which sorts strings alphanumerically, automatically yields a result sorted by date. Alternatively, you could convert each date string to a number (for example a timestamp), sort numerically, and finally convert back to the original format. You can do this with gawk's time functions, such as mktime and strftime.
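As a rough sketch of the first option, the date field can be rewritten into the sortable form before any sorting happens. This assumes the dates in your file are day/month/two-digit year (so 12/7/10 becomes 10/07/12); if they are month/day/year, swap the two indices in the sprintf call. The function name sortable_dates and the two input lines are invented for the illustration, and plain sort stands in for asorti so that it does not depend on gawk.

```shell
#!/bin/sh
# sortable_dates: hypothetical helper, rewrites field 1 from
# "D/M/YY HH:MM" (assumed layout) to "YY/MM/DD HH:MM" and sorts the lines.
sortable_dates() {
    awk '
    BEGIN { FS = OFS = "," }
    /^#/ { next }                      # drop comment lines
    {
        # Split "12/7/10 00:00" on "/" and space:
        # d[1] = day, d[2] = month, d[3] = year, d[4] = time.
        split($1, d, "[/ ]")
        $1 = sprintf("%02d/%02d/%02d %s", d[3], d[2], d[1], d[4])
        print
    }' | LC_ALL=C sort
}

# Made-up input, only to show the reordering.
sortable_dates <<'EOF'
12/7/10 04:00,gg2a,1,
3/7/10 00:00,gg2a,2,
EOF
```

The line dated 3/7/10 comes out first as 10/07/03 00:00, followed by 10/07/12 04:00. Note that a two-digit year only sorts correctly as long as all dates fall within the same century.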