LinuxQuestions.org
Old 06-22-2012, 09:35 AM   #1
stillsmil
LQ Newbie
 
Registered: Jun 2012
Posts: 3

Rep: Reputation: Disabled
gawk+gensub backreference behaves differently in while loop?


Hi!
I have a gawk command inside a while loop that makes use of the backreference feature of gensub(), but the output comes out weird. I reproduced the problem:
$ echo abba | while read line; do m=`echo "$line" | gawk '{$0=gensub(/a(.+)a/,"c\\1c",1);print}'`;echo $m;done
produces:
csomethingstrangec
where somethingstrange is a single glyph that looks something like a 2x2 matrix [0,0;0,1] enclosed in a square.

But out of a loop it works all right:
$ echo abba | gawk '{$0=gensub(/a(.+)a/,"c\\1c",1);print}'
produces:
cbbc

I reckon the problem lies in the combination of the loop and the backreference, because "c&c" and the other uses of awk/gawk that I've tried work fine.

I'm a newbie to Linux and to all this shell stuff, so I may be making some rather boring mistake. Thanks in advance!
 
Old 07-08-2012, 01:31 PM   #2
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,784

Rep: Reputation: 2083
Quote:
Originally Posted by stillsmil
where somethingstrange is a single glyph that looks something like a 2x2 matrix [0,0;0,1] enclosed in a square.
First let's see what somethingstrange really is by using od:
Code:
~$ echo abba | while read line; do m=`echo "$line" | gawk '{$0=gensub(/a(.+)a/,"c\\1c",1);print}'`;echo $m;done | od -t u1a
0000000  99   1  99  10
          c soh   c  nl
If we look at an ASCII table, we see that 99 represents "c", 10 represents "new line", and 1 represents "start of heading", which is what the second line from od is reminding us.
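
If you want to check how od reports a known byte before trusting it on the real output, a small sketch like this can help. It assumes a POSIX printf and od; the variable name bytes is only for illustration.

```shell
#!/bin/sh
# Emit "c", the byte 0x01, "c" and a newline, then dump them as unsigned decimals.
# -An suppresses the offset column; -t u1 prints one unsigned byte per field.
bytes=$(printf 'c\001c\n' | od -An -t u1)

# Unquoted on purpose, so the shell collapses od's column padding.
echo $bytes
```

This should print 99 1 99 10, the same four values as in the od output above.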

So now we know that our "somethingstrange" is just a byte equal to 1. Now let's see where it's really coming from (I'll just take the first line from od, since that's enough to see the 1):
Code:
~$ m=`echo "abba" | gawk '{$0=gensub(/a(.+)a/,"c\\1c",1);print}'`;echo $m | od -t u1 | head -1
0000000  99   1  99  10
~$ m=`gawk 'BEGIN{print "c\\1c"}'`;echo $m | od -t u1 | head -1
0000000  99   1  99  10
~$ gawk 'BEGIN{print "c\\1c"}' | od -t u1 | head -1
0000000  99  92  49  99  10
Now we see that the while loop and the gensub don't make a difference; the culprit is actually the backquote (`) substitution. To understand why, we'll use a little script that echoes its arguments in order to see what gawk is seeing.

Code:
~$ cat ~/bin/echoargs.sh
#!/bin/sh
printf '%s\n' "$@"
~$ echoargs.sh 1 2 3
1
2
3
Code:
~$ echoargs.sh 'BEGIN{print "c\\1c"}'
BEGIN{print "c\\1c"}
~$ m=`echoargs.sh 'BEGIN{print "c\\1c"}'` ; echo $m
BEGIN{print "c\1c"}
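So backquote substitution processes backslashes before the command inside it is even parsed: within backquotes, a backslash followed by $, ` or another backslash is treated as an escape, so \\ collapses to a single \, even inside single quotes. gawk therefore receives "c\1c", and in an awk string "\1" is an octal escape for the byte value 1.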
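
The rule can be seen without gawk at all. The following is just a sketch using printf '%s' to show the text that the inner command actually receives; the variable names a, b and c are made up for the demonstration.

```shell
#!/bin/sh
# Inside backquotes, \\ is reduced to \ before the inner command line is parsed,
# even though the text sits between single quotes.
a=`printf '%s' 'x\\y'`

# A backslash before an ordinary character such as n is left alone.
b=`printf '%s' 'x\ny'`

# Doubling the backslashes again is one way to get \\ through backquotes.
c=`printf '%s' 'x\\\\y'`

printf 'a=%s\n' "$a"   # a=x\y
printf 'b=%s\n' "$b"   # b=x\ny
printf 'c=%s\n' "$c"   # c=x\\y
```

So with backquotes, the gawk program would need "c\\\\1c" to end up with the "c\\1c" it was meant to see, which is easy to get wrong.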


Note that $() substitution, which is the recommended replacement for backquote, doesn't do this:
Code:
~$ m=$(echoargs.sh 'BEGIN{print "c\\1c"}') ; echo $m
BEGIN{print "c\\1c"}
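
Putting that together, the loop from the first post could be rewritten with $() roughly like this. It assumes gawk is installed (gensub is a gawk extension); the read -r, the printf calls and the result variable are my additions to keep backslashes in the input intact and to capture the output, not something the original command used.

```shell
#!/bin/sh
# Same pipeline as the original loop, with $(...) instead of backquotes,
# so gawk receives "c\\1c" unchanged and \\1 works as a backreference.
result=$(
  echo abba | while IFS= read -r line; do
    m=$(printf '%s\n' "$line" | gawk '{ $0 = gensub(/a(.+)a/, "c\\1c", 1); print }')
    printf '%s\n' "$m"
  done
)
printf '%s\n' "$result"
```

With gawk available this should print cbbc, matching the result of the command run outside the loop.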
 
2 members found this post helpful.
Old 07-08-2012, 08:06 PM   #3
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,008

Rep: Reputation: 3193
Nice information, ntubski ... I was not aware that backticks had this limitation.
 
Old 07-08-2012, 11:53 PM   #4
stillsmil
LQ Newbie
 
Registered: Jun 2012
Posts: 3

Original Poster
Rep: Reputation: Disabled
Thank you ntubski! Problem settled and great tips for sorting things out!
 
Old 07-09-2012, 03:40 AM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,008

Rep: Reputation: 3193
Please mark as SOLVED once you have a solution.
 
  



Tags
shell scripting


