LinuxQuestions.org
Old 03-15-2012, 06:18 PM   #1
Dr_Noob
loading a variable with awk in a for loop


So I am writing a script that needs to take as input a file that looks like:

GENE CHR START END
GALNT10 5 153570295 153800543
KLHL32 6 97372496 97588630
FTO 16 53737875 54148379
MC4R 18 58038564 58040001
SEC16B 1 177897489 177939050
ADCY3 2 25042039 25142055
GNPDA2 4 44704168 44728612

and use that info to process another file and generate output that falls within the boundaries of START and END.

The script takes several arguments, where REGION is the file above, and FREQAA and FREQEA are additional files:

FREQAA=$1
FREQEA=$2
REGION=$3
BUFFER=$4

Everything was working fine until I put in this for loop:

for i in $(awk '{print $2}' $GENE.bim | head -$NR);do

EA1=$(grep -w -$i $FREQEA | awk '{print $3}')
EA2=$(grep -w -$i $FREQEA | awk '{print $4}')
AA1=$(grep -w -$i $FREQAA | awk '{print $3}')
AA2=$(grep -w -$i $FREQAA | awk '{print $4}')

In other similar lines earlier in the script, this type of thing works fine, but inside the for loop, instead of loading $EA1 with the third column of the line containing $i, it writes the value of the 3rd argument.
 
Old 03-16-2012, 04:48 AM   #2
grail
I think you may have to explain further what you are attempting to do, as the present code snippet makes no sense to me. Also, please use [code][/code] tags when displaying code.

Maybe you could start by explaining where the NR variable comes from in the line below:
Code:
for i in $(awk '{print $2}' $GENE.bim | head -$NR);do
 
Old 03-17-2012, 11:50 AM   #3
David the H.
To start with, don't read lines with for. Use a while+read loop, with the awk command supplied through a process substitution (assuming bash, of course).


You shouldn't need to use head either. You can import the variable directly into awk and use it to output the lines you want.


Code:
while IFS= read -r i; do

	commands

done < <( awk '( NR <= ln ) { print $2 }' "ln=$NR" "$GENE.bim" )
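For illustration, here is a small self-contained sketch of the while+read pattern fed by a process substitution. The sample file, its contents, and the names bim and max_lines are made up for the demonstration; they are not taken from the original script. Note the space between the two "<" characters: the first is an ordinary redirection, the second belongs to the process substitution.

```shell
# Hypothetical sample data standing in for "$GENE.bim" (ID assumed in column 2).
bim=$(mktemp)
printf '%s\n' \
    '5 rs100 0 153570295' \
    '5 rs200 0 153570400' \
    '5 rs300 0 153570500' > "$bim"

max_lines=2   # plays the role of $NR in the original script
count=0
ids=''

# awk prints column 2 of the first $max_lines lines; the loop reads one ID
# per iteration. "< <( ... )" is bash syntax (redirect from a process substitution).
while IFS= read -r id; do
    count=$(( count + 1 ))
    ids="$ids$id "
done < <( awk '( NR <= ln ) { print $2 }' "ln=$max_lines" "$bim" )

echo "read $count IDs: $ids"
rm -f "$bim"
```

Because the loop runs in the current shell rather than at the end of a pipeline, count and ids still hold their values after "done".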
Edit: Similarly, you shouldn't need to use grep and awk (or grep and sed) together in the sub-commands.

Code:
EA1=$( awk '( $0 ~ pat ) { print $3 }' 'pat=\\<'"$i"'\\>' "$FREQEA" )
The "pat" variable in awk is treated as a regex to match, so only lines containing the pattern are printed. \< and \> are regex word-boundary anchors (a GNU awk extension), used to replicate the behavior of grep's "-w" option. Unfortunately, awkward quoting and backslashing are needed to pass them to awk properly. If you don't need the whole-word matching condition, you can simply use "pat=$i".

Edit2: a slightly cleaner way to handle the regex is to add the word boundaries to the variable inside awk instead; the NR == 1 condition makes sure that happens only once rather than on every input line.
Code:
EA1=$( awk 'NR == 1 { pat = "\\<" pat "\\>" } ; ( $0 ~ pat ) { print $3 }' "pat=$i" "$FREQEA" )
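If the awk in use lacks the \< and \> anchors, comparing the ID against each field exactly is a portable alternative to a word-boundary regex. The following is only a sketch under assumptions: the sample file stands in for "$FREQEA", its layout (ID in column 2, values in columns 3 and 4) is a guess, and the names freq_file, id, ea1 and ea2 are invented for the example. It also fetches both columns with one awk call instead of two.

```shell
# Hypothetical sample standing in for "$FREQEA".
freq_file=$(mktemp)
printf '%s\n' \
    '5 rs10 0.12 0.88' \
    '5 rs100 0.30 0.70' \
    '6 rs1000 0.45 0.55' > "$freq_file"

id='rs100'

# -v passes the shell value into awk. A line is selected only when some
# field equals the ID exactly, so rs10 and rs1000 do not match.
result=$( awk -v id="$id" '
    { for (f = 1; f <= NF; f++) if ($f == id) { print $3, $4; exit } }
' "$freq_file" )

# Split the two printed values into separate shell variables.
read -r ea1 ea2 <<EOF
$result
EOF

echo "ea1=$ea1 ea2=$ea2"
rm -f "$freq_file"
```

The "exit" stops at the first matching line; drop it if every matching line should be printed.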

Speaking of which, QUOTE ALL OF YOUR VARIABLE SUBSTITUTIONS. You should never leave the quotes off a variable expansion unless you explicitly want the resulting string to be word-split by the shell. This is a vitally important concept in scripting, so train yourself to do it correctly now. You can learn about the exceptions later.
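A quick way to see what the quotes change is to count the arguments a command receives. This is a minimal sketch with a made-up value and a throwaway helper function (count_args is not a standard command), assuming the default IFS:

```shell
# A hypothetical value containing a space, as a file name might.
name='my file.txt'

# Helper that reports how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args $name)     # word-split into "my" and "file.txt"
quoted=$(count_args "$name")     # passed as a single argument

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

An unquoted expansion is also subject to filename globbing, so a value such as * would be replaced by the names of the files in the current directory.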

Also, environment variables are generally all upper-case. So while not absolutely necessary, it's good practice to keep your own user variables in lower-case or mixed-case, to help differentiate them.

Last edited by David the H.; 03-17-2012 at 12:14 PM. Reason: as stated
 
  

