LinuxQuestions.org
Old 02-04-2015, 04:24 PM   #1
High-T
LQ Newbie
 
Registered: Feb 2015
Posts: 3

Rep: Reputation: Disabled
Awk - Process a set of records if field $5 of line 01 is 'W', otherwise copy set to o


Hi guys,

I am trying to write a script that processes many sets of transactions.
I want to process a set only if the line with $1 == "01" has $5 == "W" and the line with $1 == "07" has $3 == "YY"; otherwise the set should be copied to the output unchanged.

Example of the input file:

Code:
01 08 77 78 W 9890
02 08 66 68 0 8554
07 08 YY 85 9 7545
01 08 99 87 X 8787
04 09 85 85 4 8758
09 87 88 78 7 6584
10 84 ZZ 99 8 9887
A new set always starts with $1 == "01".
The script should process only the first set, because its 5th field is "W", and put "MATCHED" at the end of it.
It should copy the unmatched set (5th field "X") to the output as it is.

Example of output file:

Code:
01 08 77 78 W 9890
02 08 66 68 0 8554
07 08 YY 85 9 7545
MATCHED
01 08 99 87 X 8787
04 09 85 85 4 8758
09 87 88 78 7 6584
10 84 ZZ 99 8 9887
And so on.
Thanks for your help.

Last edited by High-T; 02-04-2015 at 04:26 PM.
 
Old 02-04-2015, 04:42 PM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,128

Rep: Reputation: 4120
We are here to help, not write your wants.
Show us what you came up with and the problem(s) you're having, and maybe someone can help.

Should be reasonably straightforward in awk.
 
Old 02-04-2015, 04:52 PM   #3
High-T
LQ Newbie
 
Registered: Feb 2015
Posts: 3

Original Poster
Rep: Reputation: Disabled
Script

I am using this script.
It builds a matching key from File1.txt, compares it with File2.txt, and writes MATCHED if it finds a match. But I need to add extra filtering to check that the type is W and that line 07 has YY.

So the following script should run its matching logic only when both criteria are met (i.e. W and YY); if not, it should copy the set to the OUT file.
I did not want to complicate things, which is why I did not post the script I am using.
I just wanted a small piece of code that I can fit into my script, but it would be great if this helps you.

Code:
awk '	
BEGIN {
		OFS="\t"
		OUT = "File1.txt"
		
		valid_columns["01"] = "01"
		valid_columns["02"] = "02"
		valid_columns["03"] = "03"
		valid_columns["04"] = "04"
		valid_columns["05"] = "05"
		valid_columns["06"] = "06"
		valid_columns["07"] = "07"

	}

	NR == FNR {		
		if(NF)
		{
			master_key[substr($0,1,14)] = $0 
		}
		next		
	}
	! $1 in valid_first_comments{
		next
	}
	
	
	$1 == "01" {				
		line_accumulator = $0 "\n"	
		key = $4 $3 $2
    	}	
    	$1 != "01" && $1 != "07" {
    		line_accumulator = line_accumulator $0 "\n"
    	}
	$1 == "07" {
		output_line = line_accumulator $0
		
		key = key $4
		
		if ( key in master_key )
		{
			print output_line > OUT
			print "MATCHED", master_key[key] > OUT	
		}
		
	}

	END {

	}

' File2.txt File1.txt

Last edited by High-T; 02-09-2015 at 02:32 PM.
 
Old 02-04-2015, 06:36 PM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,128

Rep: Reputation: 4120
I suspect if you have a close look at the following, your code will get a lot further.
Code:
! $1 in valid_first_comments{
		next
}
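
One possible reading of the hint above (a sketch, not syg00's stated explanation): the array filled in BEGIN is called valid_columns, but the pattern tests valid_first_comments, which is never populated. In addition, unary ! binds more tightly than in, so ! $1 in arr is evaluated as (!$1) in arr rather than as "$1 is not a key of arr". The sample input lines below are made up for illustration.

```shell
filtered=$(printf '01 a\n99 b\n07 c\n' | awk '
BEGIN {
    # Same array name in BEGIN and in the pattern below.
    valid_columns["01"] = 1
    valid_columns["07"] = 1
}
# Parentheses make "in" run first; "!" then negates the result.
!($1 in valid_columns) { next }
{ print }
')
printf '%s\n' "$filtered"
```

With the parentheses and a consistent array name, only the lines whose first field is a key of valid_columns reach the later rules.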
 
Old 02-04-2015, 06:57 PM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
So you seem to have a reasonable understanding of awk ... where is your issue about checking columns other than $1?

I also do not see anything in your code that shows how records are delimited, i.e. RS would need to be set to something. I would suggest that this may need to be different for each file.
 
Old 02-05-2015, 11:47 AM   #6
High-T
LQ Newbie
 
Registered: Feb 2015
Posts: 3

Original Poster
Rep: Reputation: Disabled
Update on the script

Grail, I have declared that the records are delimited by tabs.

OFS="\t"

I have come up with this option. It is not finalized yet and I am still working on it.

Code:
awk '	
BEGIN {
		OFS="\t"
		OUT = "File1.txt"

		valid_order_tp = "W" 
		
		valid_tender_tp = "YY"

		valid_columns["01"] = "01"
		valid_columns["02"] = "02"
		valid_columns["03"] = "03"
		valid_columns["04"] = "04"
		valid_columns["05"] = "05"
		valid_columns["06"] = "06"
		valid_columns["07"] = "07"
	
	}

	NR == FNR {		
		if(NF)
		{
			key[substr($0,1,14)] = $0 
		}
		next		
	}
	! $1 in valid_comments{
		next
	}
	
		$1 == "01" { 
		valid_order = $5
	}
				
	$1 == "07" {
		valid_tender = $3
	}

	{if ( valid_order = valid_order_tp  && valid_tender in valid_tender_tp)
	do 

	$1 == "01" {				
		line_accumulator = $0 "\n"	
		key = $4 $3 $2
    	}	
    	$1 != "01" && $1 != "07" {
    		line_accumulator = line_accumulator $0 "\n"
    	}
	$1 == "07" {
		output_line = line_accumulator $0
		
		key = key $4
		
		if ( comparison_key in master_key )
		{
			print output_line > OUT
			print "MATCHED", master_key[key] > OUT	
		}
		
	}
done
	END {

	}

' File2.txt File1.txt

Last edited by High-T; 02-09-2015 at 02:30 PM.
 
Old 02-06-2015, 10:14 AM   #7
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
OFS = Output Field Separator

This will not help you when you are reading the file in. What I mean by record separator is the RS variable, which lets you say where a complete record ends; in your case, each record should start with 01 at the start of a line.
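
To illustrate the distinction, here is a small sketch with a made-up colon-separated input line: OFS only changes what print puts between its arguments, while FS controls how the input line is split into fields. RS plays the same role for records that FS plays for fields, and defaults to a newline.

```shell
# OFS alone: the line is still split on whitespace, so $1 is the whole line and $2 is empty.
default_fs=$(printf 'a:b:c\n' | awk 'BEGIN { OFS = "\t" } { print $1, $2 }')

# FS and OFS together: split the input on ":" and join the output with a tab.
colon_fs=$(printf 'a:b:c\n' | awk 'BEGIN { FS = ":"; OFS = "\t" } { print $1, $2 }')

printf '%s\n%s\n' "$default_fs" "$colon_fs"
```

In the first command the output is the unsplit line followed by a tab; in the second it is the first two fields joined by a tab.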

Code:
valid_tender_field in valid_tender_type
valid_tender_type is not an array, so I am dubious about how this part of your 'if' would work.
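
As a rough illustration of this point: in tests whether a value is a key of an array, while a scalar such as valid_tender_tp is compared with ==. Note also that a single = inside an if condition is an assignment, not a comparison. The array name tender_types below is hypothetical and only there to show the contrast.

```shell
verdict=$(printf '07 08 YY 85 9 7545\n07 08 ZZ 85 9 7545\n' | awk '
BEGIN {
    valid_tender_tp = "YY"     # scalar: compare with ==
    tender_types["YY"] = 1     # array (hypothetical name): test membership with "in"
}
$1 == "07" {
    by_scalar = ($3 == valid_tender_tp) ? "yes" : "no"
    by_array  = ($3 in tender_types)    ? "yes" : "no"
    print $3, by_scalar, by_array
}
')
printf '%s\n' "$verdict"
```

Both forms give the same answer here; the array form is mainly useful when more than one tender type should be accepted.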

I also note that at no time do you reset your variables, hence the previous '07' line value will still be set when the next '01' is hit, and so the test will still be true.
In fact, as you never reset it, once you hit your desired scenario once, it will always be true.
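
A minimal sketch of the reset idea, using the sample data from the first post and leaving out the File2.txt key matching entirely: each set is buffered, the two flags are cleared whenever a new 01 line starts a set, and the buffered set is printed (followed by MATCHED when both conditions held) once the next set begins or the input ends. The names flush_set, set_lines, order_ok and tender_ok are made up for this example, and it stays line-based instead of changing RS.

```shell
input='01 08 77 78 W 9890
02 08 66 68 0 8554
07 08 YY 85 9 7545
01 08 99 87 X 8787
04 09 85 85 4 8758
09 87 88 78 7 6584
10 84 ZZ 99 8 9887'

result=$(printf '%s\n' "$input" | awk '
# Print the buffered set, then MATCHED if both conditions held for it.
function flush_set() {
    if (set_lines == "") return
    printf "%s", set_lines
    if (order_ok && tender_ok) print "MATCHED"
}
$1 == "01" {
    flush_set()                      # finish the previous set first
    set_lines = ""; tender_ok = 0    # reset per-set state
    order_ok = ($5 == "W")
}
$1 == "07" && $3 == "YY" { tender_ok = 1 }
{ set_lines = set_lines $0 "\n" }    # buffer every line of the current set
END { flush_set() }                  # the last set has no following 01 line
')
printf '%s\n' "$result"
```

On the sample input this reproduces the example output from the first post: the W/YY set is followed by MATCHED and the X set is copied through unchanged.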
 
  


All times are GMT -5. The time now is 06:02 AM.
