ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Notices
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
I'm looking to take a CSV file which will have three fields:-
Customer Number
Balance
Payor ID
The file will have duplicate records which need merging along with the value of the second field (balance).
In theory you could have the following file:
Code:
customer, balance, payor
001, 200.00, Microsoft
002, 100.00, Apple
002, 100.00, Apple
003, 800.00, IBM
004, 250.00, Microsoft
001, 100.00, Microsoft
Please see my previous post and the excellent answers I got. This worked great with 2 fields but I need the third field (Payor ID) to be printf out with the results so it would like this..
Code:
customer, balance, payor
001, 300.00, Microsoft
002, 200.00, Apple
003, 800.00, IBM
004, 250.00, Microsoft
In cases where there is a different payor ID for the same customer id a "XXX" would need to be listed instead of a payor...
Code:
005, 500.00, XXX
Here is my current code which only printf's the two fields...
Code:
BEGIN{ FS = ","; getline }
{
# populate array with customer as index and sum of balances as values
balance[$1] = balance[$1] + $2
}
END{ # sort the indices of the array and put them in array indices
n = asorti(balance,indices)
# print the sorted array
for (i = 1; i <= n; i++)
printf "%s, %5.2f\n", indices[i], balance[indices[i]]
}
I have tried all permutations to get this to work and read the man page for awk many times so I'm desperate for some help.
However, is it possible where there are different payor ID's for the same customer id instead of the XXX could the routine prinf a new line for each payor ID associated with a customer?
So in the example below:-
Code:
customer, balance, payor
001, 200.00, Microsoft
002, 100.00, Apple
002, 100.00, Apple
003, 800.00, IBM
004, 250.00, Microsoft
001, 100.00, Microsoft
001, 100.00, 001
001, 100.00, Apple
You would get as a result:-
Code:
customer, balance, payor
001, 300.00, Microsoft
001, 100.00, 001
001, 100.00, Apple
002, 200.00, Apple
003, 800.00, IBM
004, 250.00, Microsoft
I hop this makes sense, sorry to mess you around but I thought it wouldn't be possible to do this but perhaps it is?
LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.