LinuxQuestions.org
Old 11-24-2016, 03:05 PM   #1
beca123456
Concatenate string through variable in awk


Hi,

input.tab (pipe-separated):
Code:
DATE|PRODUCTS|Customer_A|Customer_B|Customer_C
01Jan|meat:fruit:dairy|0,4:21,8:3,55|90,123:34,2:54,111|0,0:1,0:0,12
02Jan|fruit:meat:other|12,0:1,34:432,9|134,0:322,3:45,0|32,56:54,0:654,0

$1: Transaction date

$2: the order of the product types, separated by ":" (the order changes from one record to another)

$3-$NF: Customer transactions.
. The values for each product type are separated by ":" and follow the order given in $2
. The numbers to the left and right of the comma are the "purchased" and "sold" items, respectively

For example, the 01Jan Customer_A:
- purchased 0 meat, 21 fruits, 3 dairies
- sold 4 meat, 8 fruits, 55 dairy

But the 02Jan Customer_A:
- purchased 12 fruits, 1 meat, 432 other
- sold 0 fruit, 34 meat, 9 other
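
To make the layout concrete, here is a minimal sketch (not part of the original post) that decodes one customer column into purchased/sold counts per product. The function names `sample_data` and `show_customer` are made up for illustration; the data is the sample from above.

```shell
# Hypothetical helper: prints the sample rows from this post.
sample_data() {
  cat <<'EOF'
DATE|PRODUCTS|Customer_A|Customer_B|Customer_C
01Jan|meat:fruit:dairy|0,4:21,8:3,55|90,123:34,2:54,111|0,0:1,0:0,12
02Jan|fruit:meat:other|12,0:1,34:432,9|134,0:322,3:45,0|32,56:54,0:654,0
EOF
}

# Hypothetical helper: decode one customer column (given by number) from stdin.
show_customer() {
  awk -F'|' -v col="$1" '
    NR == 1 { name = $col; next }      # header row holds the customer name
    {
      n = split($2, prod, ":")         # product order for this record
      split($col, pair, ":")           # one "purchased,sold" pair per product
      for (i = 1; i <= n; i++) {
        split(pair[i], v, ",")
        printf "%s %s %s purchased=%d sold=%d\n", $1, name, prod[i], v[1], v[2]
      }
    }'
}

sample_data | show_customer 3
# 01Jan Customer_A meat purchased=0 sold=4
# 01Jan Customer_A fruit purchased=21 sold=8
# 01Jan Customer_A dairy purchased=3 sold=55
# 02Jan Customer_A fruit purchased=12 sold=0
# 02Jan Customer_A meat purchased=1 sold=34
# 02Jan Customer_A other purchased=432 sold=9
```

The key point is that the position of a product in $2 has to be looked up again for every record before reading the customer fields.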


OBJECTIVE: for each date, count the customers who sold fruit, list their names, and put both in front of the original line:
Code:
Number_Customer|Customers|DATE|PRODUCTS|Customer_A|Customer_B|Customer_C
2.00|Customer_A_(21,8); Customer_B_(34,2)|01Jan|meat:fruit:dairy|0,4:21,8:3,55|90,123:34,2:54,111|0,0:1,0:0,12
1.00|Customer_C_(32,56)|02Jan|fruit:meat:other|12,0:1,34:432,9|134,0:322,3:45,0|32,56:54,0:654,0

MY CODE SO FAR:
Code:
gawk '
BEGIN{FS=OFS="|"}
NR==1{
	for(j=3; j<=NF; j++){
		cust_j=$j
	}
	print "Number_Customer|Customers" FS $0
} 

NR>1{
	# Identify the "fruit" data in FORMAT
	a=split($2,b,":")

	for(i=1; i<=a; i++){
		if(b[i] ~ /^fruit$/){
			index_fruit=i
		}
	}

	# Extract sold fruit (i.e. number on the right of the comma) in each "Customer_X" fields
	# Concatenate Customers in variable "string"
	for(j=3; j<=NF; j++){
		split($j,c,":")
		split(c[index_fruit],d,",")
		if(d[2] > 0){
			x+=1
			customer_j=cust_j"_("c[index_fruit]")"
		}
		else{
			x+=0
			customer_j=""
		}
		
		string=string";"customer_j
	}
	
	# Print fields
	if(x!=0){
		printf("%.2f\|%s\|%s\n",x,string,$0)
                x=0
                string=""
	}
	else{
		print "0.00" FS "-" FS $0
	}
}' input.tab
With this code I get the following output. The Customer names are wrong (it keeps only the last one of the loop), and I have extra ";"
Code:
Number_Customer|Customers|DATE|PRODUCTS|Customer_A|Customer_B|Customer_C
2.00|;Customer_C_(21,8);Customer_C_(34,2);|01Jan|meat:fruit:dairy|0,4:21,8:3,55|90,123:34,2:54,111|0,0:1,0:0,12
1.00|;;;Customer_C_(32,56)|02Jan|fruit:meat:other|12,0:1,34:432,9|134,0:322,3:45,0|32,56:54,0:654,0
I don't understand why, in $2 of the output, the customer names are wrong while the figures are correct.
Probably because the NR==1 block keeps only the last iteration of the loop...

Last edited by beca123456; 11-24-2016 at 06:37 PM.
 
Old 11-25-2016, 03:13 AM   #2
grail
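I am not sure why you have the 'else' branch when building the customer string, since you only want to add a customer when the sold value is greater than zero. The extra ";" come from appending ";"customer_j on every pass of the loop, even when customer_j is empty.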

I ended up building my own to see where we differ. Your problem is with your original for loop:
Code:
for(j=3; j<=NF; j++){
  cust_j=$j
}
Here, cust_j will always end up equal to the last customer name, because it is a single variable that every iteration overwrites (the "j" in its name is not the loop counter). Try making it an array.
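
A small side-by-side sketch of the difference, using only the header line from the first post. The function names `scalar_demo` and `array_demo` are invented for this illustration.

```shell
header='DATE|PRODUCTS|Customer_A|Customer_B|Customer_C'

# Scalar: "cust_j" is one fixed variable name. awk does not substitute j into it,
# so each assignment overwrites the previous one.
scalar_demo() {
  printf '%s\n' "$header" | awk -F'|' '{
    for (j = 3; j <= NF; j++) cust_j = $j
    print cust_j
  }'
}

# Array: cust[j] keeps one entry per column number.
array_demo() {
  printf '%s\n' "$header" | awk -F'|' '{
    for (j = 3; j <= NF; j++) cust[j] = $j
    for (j = 3; j <= NF; j++) printf "%s%s", cust[j], (j < NF ? " " : "\n")
  }'
}

scalar_demo   # prints: Customer_C
array_demo    # prints: Customer_A Customer_B Customer_C
```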

Here is mine as a comparison:
Code:
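BEGIN{ FS=OFS="|" }

# Header: remember each customer name by its column number
NR == 1{
  for(i = 3; i <= NF; i++)
    cust[i] = $i

  print "Number_Customer|Customers",$0
}

NR > 1{
  # Find the position of "fruit" in this record's product list
  # (reset first, so a record without fruit does not reuse the old position)
  pos = 0
  n = split($2, f, ":")

  for(i = 1; i <= n; i++)
    if(f[i] == "fruit")
      pos = i

  # Splitting on ":" and "," puts purchased at s[pos*2-1] and sold at s[pos*2]
  for(i = 3; i <= NF; i++){
    split($i, s, "[:,]")
    if(pos && s[pos * 2] > 0){
      # Add ";" only between entries, never before the first one
      custs = custs (custs != "" ?";":"")cust[i]"_("s[pos * 2 - 1]","s[pos * 2]")"
      n_cust++
    }
  }

  # Records where nobody sold fruit are not printed
  if( custs != "" )
    print n_cust,custs,$0

  # Reset for the next record
  custs = ""
  n_cust = 0
}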
And here is my output:
Code:
Number_Customer|Customers|DATE|PRODUCTS|Customer_A|Customer_B|Customer_C
2|Customer_A_(21,8);Customer_B_(34,2)|01Jan|meat:fruit:dairy|0,4:21,8:3,55|90,123:34,2:54,111|0,0:1,0:0,12
1|Customer_C_(32,56)|02Jan|fruit:meat:other|12,0:1,34:432,9|134,0:322,3:45,0|32,56:54,0:654,0
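
The output above differs slightly from the format asked for in the first post (count with two decimals, "; " between customers, and a "0.00|-" line when nobody sold fruit). The following is only a sketch of how the same approach could be adapted to that format, not the code from either post. `sample_data` and `fruit_sellers` are hypothetical names, and the "0.00|-" fallback is taken from the else branch of the original attempt.

```shell
# Hypothetical helper: prints the sample rows from the first post.
sample_data() {
  cat <<'EOF'
DATE|PRODUCTS|Customer_A|Customer_B|Customer_C
01Jan|meat:fruit:dairy|0,4:21,8:3,55|90,123:34,2:54,111|0,0:1,0:0,12
02Jan|fruit:meat:other|12,0:1,34:432,9|134,0:322,3:45,0|32,56:54,0:654,0
EOF
}

# Hypothetical helper: reads the table on stdin, prints it with two new leading fields.
fruit_sellers() {
  awk '
    BEGIN { FS = OFS = "|" }
    NR == 1 {
      for (i = 3; i <= NF; i++) cust[i] = $i     # customer name per column
      print "Number_Customer", "Customers", $0
      next
    }
    {
      pos = 0; list = ""; count = 0              # reset per record
      n = split($2, f, ":")
      for (i = 1; i <= n; i++) if (f[i] == "fruit") pos = i
      if (pos) for (i = 3; i <= NF; i++) {
        split($i, pair, ":")                     # pair[pos] is "purchased,sold" for fruit
        split(pair[pos], v, ",")
        if (v[2] + 0 > 0) {                      # sold at least one
          list = list (list == "" ? "" : "; ") cust[i] "_(" pair[pos] ")"
          count++
        }
      }
      printf "%.2f%s%s%s%s\n", count, OFS, (list == "" ? "-" : list), OFS, $0
    }'
}

sample_data | fruit_sellers
# Number_Customer|Customers|DATE|PRODUCTS|Customer_A|Customer_B|Customer_C
# 2.00|Customer_A_(21,8); Customer_B_(34,2)|01Jan|meat:fruit:dairy|0,4:21,8:3,55|90,123:34,2:54,111|0,0:1,0:0,12
# 1.00|Customer_C_(32,56)|02Jan|fruit:meat:other|12,0:1,34:432,9|134,0:322,3:45,0|32,56:54,0:654,0
```

Resetting pos, list and count at the top of the record block avoids carrying values over from the previous line.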
 
All times are GMT -5.
