Quote:
Originally Posted by stillsmil
where somethingstrange is a token something like a 2-2 matrix [0,0;0,1], bounded in a square.
First let's see what somethingstrange really is by using od:
Code:
~$ echo abba | while read line; do m=`echo "$line" | gawk '{$0=gensub(/a(.+)a/,"c\\1c",1);print}'`;echo $m;done | od -t u1a
0000000 99 1 99 10
c soh c nl
If we look at the ASCII table, we see that 99 represents "c", 10 represents "new line", and 1 represents "start of heading", which is what the second line from od is reminding us.
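As a quick cross-check of the table lookup, the same bytes can be produced directly. This is only a sketch, assuming a POSIX printf (where \001 in the format string is an octal escape) and od; the variable name bytes is just for illustration:

```shell
# Emit "c", the byte 1 (SOH), "c" and a newline, then dump them as unsigned decimals.
bytes=$(printf 'c\001c\n' | od -An -t u1)
# Unquoted on purpose: word splitting collapses od's column padding.
echo $bytes
```

This should list the same four values, 99 1 99 10, as the od output above.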
So now we know that our "somethingstrange" is just a byte equal to 1. Now let's see where it's really coming from (I'll just take the first line from od, since that's enough to see the 1):
Code:
~$ m=`echo "abba" | gawk '{$0=gensub(/a(.+)a/,"c\\1c",1);print}'`;echo $m | od -t u1 | head -1
0000000 99 1 99 10
~$ m=`gawk 'BEGIN{print "c\\1c"}'`;echo $m | od -t u1 | head -1
0000000 99 1 99 10
~$ gawk 'BEGIN{print "c\\1c"}' | od -t u1 | head -1
0000000 99 92 49 99 10
Now we see that the while loop and the gensub don't make a difference; the culprit is the backquote (`) substitution. To understand why, we'll use a little script that echoes its arguments, so we can see what gawk actually receives.
Code:
~$ cat ~/bin/echoargs.sh
#!/bin/sh
printf '%s\n' "$@"
~$ echoargs.sh 1 2 3
1
2
3
Code:
~$ echoargs.sh 'BEGIN{print "c\\1c"}'
BEGIN{print "c\\1c"}
~$ m=`echoargs.sh 'BEGIN{print "c\\1c"}'` ; echo $m
BEGIN{print "c\1c"}
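So backquote substitution processes some backslash escapes before invoking the command: inside backquotes, a backslash followed by \, $ or ` is consumed by the shell, even within single quotes, so \\ reaches the command as a single \. When gawk then sees "\1" in a string literal, it treats it as an octal escape, i.e. the byte value 1.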
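The rule can be seen in isolation, without gawk. A minimal sketch, assuming a POSIX shell; the variable names a, b and c are only for illustration:

```shell
# Inside backquotes the shell strips a backslash that precedes \, $ or `
# before the inner command is parsed, even within single quotes.
a=`printf '%s' 'x\\y'`      # inner command receives 'x\y'
b=`printf '%s' 'x\ny'`      # backslash before n is left alone
c=`printf '%s' '\$HOME'`    # inner command receives '$HOME', still single-quoted
printf '%s\n' "$a" "$b" "$c"
```

Only the first and third strings lose a backslash; the second comes through unchanged, because n is not one of the three special characters.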
Note that $() substitution, which is the recommended replacement for backquote, doesn't do this:
Code:
~$ m=$(echoargs.sh 'BEGIN{print "c\\1c"}') ; echo $m
BEGIN{print "c\\1c"}
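Applied to the original loop, the fix would look roughly like this. It is a sketch rather than a drop-in replacement: it assumes gawk is installed (gensub is a gawk extension), and the variable name fixed is mine, used only to capture the result:

```shell
# Same pipeline as before, but with $() instead of backquotes, so the
# awk program keeps both backslashes and "\\1" refers to the first group.
fixed=$(echo abba | while read -r line; do
    m=$(echo "$line" | gawk '{$0=gensub(/a(.+)a/,"c\\1c",1);print}')
    echo "$m"
done)
printf '%s\n' "$fixed" | od -t u1a
```

With the backslashes intact, od should now report 99 98 98 99 10 (that is, "cbbc" plus a newline) instead of the stray byte 1.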