LinuxQuestions.org
Old 06-12-2009, 07:26 AM   #1
vgr12386
LQ Newbie
 
Registered: Jun 2009
Posts: 16

Rep: Reputation: 0
using variables in awk


Hi, I would like to add an extra dimension to a question I asked previously.
I'm not quite sure how to use different variables inside awk.
Summary: I have bad data, for instance capital letters in the middle of a word.
I identified the errors, made a list, and put it in a file. For some errors a condition is checked, and depending on the result either the 2nd or the 3rd value has to replace the actual value in the file.

error_correction.txt

Incorrect,Correct,Maybe
VeNOM,Venom,Venemous
nos,NOS,N2O
.
.
.



My data file looks like this:
data.txt:

vgr,bugatti veron,,3.5,Maybe,6,.......,....
vgr,lamborgini,,3.5,nos,6,.......,....
abc,bugatti veron,,3.5,N20,6,.......,.......
.
.
.
.



I need to replace the terms in the 5th field with those from the list, after checking with an if condition whether to pick the 2nd or the 3rd column from the error_correction.txt file.
How do I do this using awk?

Reference to previous question:
http://www.linuxquestions.org/questi...-field-730433/
 
Old 06-12-2009, 07:37 AM   #2
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,488

Rep: Reputation: 1956
Please, show us the awk code you're using now. Also, what is the condition to choose the 2nd or the 3rd field from error_correction.txt?
 
Old 06-12-2009, 08:10 AM   #3
vgr12386
LQ Newbie
 
Registered: Jun 2009
Posts: 16

Original Poster
Rep: Reputation: 0
currently this is the simplest form of the code: [http://www.linuxquestions.org/questi...field-730433/]

current code:

awk -F"," 'FNR==NR{a[$1]=$2;next}
( $5 in a ){
$5=a[$5]; #This assigns the value from the 2nd column of the error_correction file
}' error_correction file


i want to add this condition to the awk code:

{
x=substr($1,2,5);
}
if ( x == "JB007" )
{
#Assign the value from the 3rd column of the error_correction file
}
else
{
$5=a[$5];#Assign the value from the 2nd column of the error_correction file
}


I think I may have to use another variable like b[$1]=$3 after the NR, but I'm not quite sure how to look it up.
 
Old 06-12-2009, 09:16 AM   #4
vgr12386
LQ Newbie
 
Registered: Jun 2009
Posts: 16

Original Poster
Rep: Reputation: 0
Is it possible to use functions in awk?
 
Old 06-12-2009, 09:25 AM   #5
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2371
Hi,

Yes you can.

Take a look here: gawk manual - 8.2 User-Defined Functions
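
As a rough illustration of what a user-defined function looks like, here is a minimal sketch. The function name pick_fix, the sample rows, and the hard-coded replacement values are all made up for this example; only the substr($1,2,5) == "JB007" test comes from post #3.

```shell
# Sketch only: pick_fix and the two sample records are hypothetical.
result=$(printf '%s\n' 'xJB007,car,,3.5,nos,6' 'vgr,car,,3.5,nos,6' |
awk '
# Return alt when characters 2-6 of the ID are "JB007", otherwise std.
function pick_fix(id, std, alt) {
    return (substr(id, 2, 5) == "JB007") ? alt : std
}
BEGIN { FS = OFS = "," }
{ $5 = pick_fix($1, "NOS", "N2O"); print }
')
printf '%s\n' "$result"
```

The function is defined at the top level of the awk program, outside any pattern/action block, and can then be called from any action.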
 
Old 06-12-2009, 09:42 AM   #6
vgr12386
LQ Newbie
 
Registered: Jun 2009
Posts: 16

Original Poster
Rep: Reputation: 0
Cool.
And hey, is it possible to use a switch/case as well?
I tried --enable-switch but it didn't work.
 
Old 06-12-2009, 09:45 AM   #7
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2371
Hi again,

From the same manual (!!): 6.4.5 The switch Statement

You could have found that one yourself
 
Old 06-12-2009, 09:54 AM   #8
vgr12386
LQ Newbie
 
Registered: Jun 2009
Posts: 16

Original Poster
Rep: Reputation: 0
Yup,
I saw it and tried it out, but it didn't work.
It said something about it working only on version 3.1.3 of gawk.
Do you know of any other way?
 
Old 06-12-2009, 01:37 PM   #9
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,488

Rep: Reputation: 1956
If I correctly interpret your requirements (as explained in post #3 and in your previous thread) following the code posted by ghostdog74, this should do the trick:
Code:
awk 'BEGIN{ FS=","; OFS=","
}

FNR == NR {a[$1] = $2
           b[$1] = $3
         next
}

( $5 in a ){
  if ( substr($1,2,5) == "JB007" )
     $5 = b[$5]
  else
     $5 = a[$5]
}

FNR < NR' error_correction.txt input_file > output_file
Check it on your real example and look at the official manual suggested by druuna to correctly interpret this code.
 
Old 06-12-2009, 01:39 PM   #10
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,488

Rep: Reputation: 1956
Quote:
Originally Posted by vgr12386
Yup,
I saw it and tried it out, but it didn't work.
It said something about it working only on version 3.1.3 of gawk.
Do you know of any other way?
As you've read, it is an experimental feature that is not enabled by default in previous versions. If you want to try it, you have to compile gawk from source, adding --enable-switch at the configure step, or alternatively update to a more recent version of gawk. Anyway, I don't think you really need it, unless your requirements have changed again!
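
For the "any other way" part of the question: an if / else if chain covers the same ground as a switch and works in any awk without a special build. A minimal sketch, with made-up input and labels chosen only for illustration:

```shell
# Sketch only: the input rows and the labels are hypothetical.
result=$(printf '%s\n' 'a,1' 'b,2' 'c,3' |
awk -F, '
{
    # Plays the role of: switch ($1) { case "a": ... case "b": ... default: ... }
    if ($1 == "a")
        label = "first"
    else if ($1 == "b")
        label = "second"
    else
        label = "other"
    print $1 "=" label
}')
printf '%s\n' "$result"
```

Each branch is tested in order and the final else acts as the default case.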
 
Old 06-15-2009, 04:47 AM   #11
vgr12386
LQ Newbie
 
Registered: Jun 2009
Posts: 16

Original Poster
Rep: Reputation: 0
Hey colucix,
guess what, I'm back.
I'm still not quite familiar with awk!
I'm not sure how to compare the values of two files and pick up various columns separately.

In the code that you wrote,
Quote:
Code:

awk 'BEGIN{ FS=","; OFS=","
}

FNR == NR {a[$1] = $2
           b[$1] = $3
         next
}

( $5 in a ){   # <-- What if there are multiple similar values? E.g. there are 3 entries of nos in the error correction file, along with an extra column which is to be matched with the data file.
  if ( substr($1,2,5) == "JB007" )
     $5 = b[$5]
  else
     $5 = a[$5]
}

FNR < NR' error_correction.txt input_file > output_file
 
Old 06-15-2009, 08:00 AM   #12
vgr12386
LQ Newbie
 
Registered: Jun 2009
Posts: 16

Original Poster
Rep: Reputation: 0
any one around???

Code:
( $5 in a ){ 
  if ( substr($1,2,5) == "JB007" )
     $5 = b[$5]
  else
     $5 = a[$5]
}
If $5 occurs more than once in a, how do I make it loop to search for the second occurrence?
 
Old 06-17-2009, 08:07 AM   #13
vgr12386
LQ Newbie
 
Registered: Jun 2009
Posts: 16

Original Poster
Rep: Reputation: 0
knock knock
 
Old 06-23-2009, 08:48 AM   #14
crabboy
Moderator
 
Registered: Feb 2001
Location: Atlanta, GA
Distribution: Slackware
Posts: 1,823

Rep: Reputation: 120
vgr, you have 3 threads running regarding the same awk problems, perhaps you should check your other threads for replies and ask new questions there.
 
Old 06-24-2009, 04:19 AM   #15
vgr12386
LQ Newbie
 
Registered: Jun 2009
Posts: 16

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by crabboy
vgr, you have 3 threads running regarding the same awk problems, perhaps you should check your other threads for replies and ask new questions there.
Well, the questions are different; it's just that I have used similar data.

All I wanted to know was how to loop through repeated values in 2 separate columns present in two different files.
What was happening in the case above was that it checked for the first occurrence only and then exited the loop.
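
One possible way to handle repeated values, sketched under assumptions: an awk array holds a single value per key, so with a[$1] = $2 a repeated first column simply overwrites the earlier entry and there is no loop to continue. If the extra column mentioned in post #11 is stored as part of the key, each entry stays distinct. The sketch below assumes that extra column is the 4th field of the correction file and that it must equal the 1st field of the data file; both assumptions, and all sample rows, are made up for illustration.

```shell
# Sketch only: the file contents and the meaning of the 4th column are hypothetical.
tmp=$(mktemp -d)
cat > "$tmp/error_correction.txt" <<'EOF'
nos,NOS,N2O,vgr
nos,Nitrous,N2O,abc
EOF
cat > "$tmp/data.txt" <<'EOF'
vgr,lamborgini,,3.5,nos,6
abc,bugatti veron,,3.5,nos,6
xyz,bugatti veron,,3.5,nos,6
EOF
result=$(awk '
BEGIN { FS = OFS = "," }
# First file: key each correction on (incorrect value, extra column).
FNR == NR { a[$1, $4] = $2; b[$1, $4] = $3; next }
# Second file: replace field 5 only when both parts of the key match.
($5, $1) in a {
    $5 = (substr($1, 2, 5) == "JB007") ? b[$5, $1] : a[$5, $1]
}
{ print }
' "$tmp/error_correction.txt" "$tmp/data.txt")
rm -r "$tmp"
printf '%s\n' "$result"
```

With this layout the lookup is still a single array test per record; rows whose key pair is not in the correction file (the xyz row here) pass through unchanged.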
 
  

