[SOLVED] sed loop gives unexpected results

jgombos · 09-08-2011, 01:53 AM

Works fine without a loop:

Code:

$ echo -e '1\n2\n3\n4' | sed -ne '/1/,/3/p'
1
2
3

The same output is expected with a loop added as follows:

Code:

echo -e '1\n2\n3\n4' | sed -ne '/1/,/3/{;:loop;N;/[^3]/b loop;p;}'

But there is no output. What am I missing?

grail · 09-08-2011, 02:13 AM

Maybe you could explain what you think should be printed? Seems to be working correctly to me.

jgombos · 09-08-2011, 02:47 AM

Quote:

Originally Posted by grail

Maybe you could explain what you think should be printed? Seems to be working correctly to me.

The loop eventually exits. When it exits, there is a "p" instruction, at which point the pattern space ("1\n2\n3") must be printed. Yet nothing prints.

*edit*
I solved my problem. A better example would have been:

Code:

echo -e '1\n2\n3\n4' | sed -ne '/1/{;:loop;N;/^[^3]*$/b loop;p;}'

Although I'm still not clear on why the last instruction is skipped.

grail · 09-08-2011, 04:26 AM

Quote:

Although I'm still not clear on why the last instruction is skipped.

Maybe you could explain what you mean here? If by the last instruction you are referring to 'p', if it were skipped you would see nothing at all as you have used '-n'.

The reason I asked you to explain is so you could tell us what you think is happening in the script?
Your second example is not better; it is simply one where the data happens to fit the expression you built. The idea is to build the expression to fit the data, not the other way around (in the real world, the data that needs to be manipulated is a given).

Code:

echo -e '1\n2\n3\n4' | sed -ne '/1/,/3/{;:loop;N;/[^3]/b loop;p;}'

-n :- no output unless told to print (won't worry about -e as it serves no purpose)

/1/,/3/ :- only perform the following actions for lines in this range (ie. ignore the row with a 4 on it) { 1 is first number in pattern buffer }

:loop :- start of loop

N :- get the next line into the pattern buffer {pattern now holds '1\n2'}

/[^3]/b loop :- if the pattern space contains any character that is not a 3, redo the loop (this is true here, as the pattern space does hold non-3 characters, so go back to the start of the loop)

N :- get the next line into the pattern buffer {pattern now holds '1\n2\n3'}

So your loop chews up all your data and by the time you get to 'p' there is nothing left to print

Whilst your second example works, it is still not doing what you think. It continues the loop until the pattern buffer holds '1\n2\n3'; the 3 makes the loop condition false, so the loop ends and the buffer is printed.
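One way to check the walk-through above is to add sed's `l` command right after `N`, which writes the pattern space in an unambiguous form on every pass through the loop. This is only a sketch; the function name `trace_loop` is made up for illustration.

```shell
# Hypothetical helper: same script as above, with "l" added after "N"
# so the pattern space is shown each time the loop body runs.
trace_loop() {
  printf '1\n2\n3\n4\n' | sed -n '/1/,/3/{
    :loop
    N
    l
    /[^3]/b loop
    p
  }'
}

trace_loop
```

On this input the trace is three lines long (`1\n2$`, `1\n2\n3$`, `1\n2\n3\n4$`), so the 4 is pulled into the pattern space as well, and no output from `p` follows.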

jgombos · 09-08-2011, 06:01 AM

grail- I appreciate your detailed reply. I struggle with this comment:

Quote:

Originally Posted by grail

N :- get the next line into the pattern buffer {pattern now holds '1\n2\n3'}

So your loop chews up all your data and by the time you get to 'p' there is nothing left to print

I can see that all the data is consumed by the pattern buffer. However, I expect the pattern buffer to print when the loop exits and "p" executes. E.g. I don't see why the "p" is skipped in this case:

Code:

echo -e '1\n2\n' | sed -n '/1/{;:loop;N;/./b loop;p;}'

but not in this case:

Code:

echo -e '1\n2\n' | sed -n '/1/{;:loop;N;/^[^2]*$/b loop;p;}'
1
2

In both those cases the loop terminates with something (everything?) consumed by the buffer, so I think "p" should print something in both cases.

Or is the loop infinite in the first case, and sed simply times out without sending a failing exit status?

grail · 09-08-2011, 06:54 AM

Yes, the first reaches the end of the data without ever meeting the criterion for leaving the loop, so by the time it would get to the 'p' it has already exhausted all the data and there is nothing to print. Your second example, by contrast, reaches a point where there is a '2' in the pattern space, so it leaves the loop and progresses to the next instruction, i.e. 'p'.
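The two cases can be compared side by side by passing the loop condition in as a parameter. A rough sketch, assuming the same three input lines that `echo -e '1\n2\n'` produces; `run_loop` and its argument are hypothetical names, not anything from sed itself.

```shell
# Hypothetical helper: $1 is the regex that sends control back to :loop.
# The input is "1", "2" and an empty line, like echo -e '1\n2\n'.
run_loop() {
  printf '1\n2\n\n' | sed -n "/1/{
    :loop
    N
    /$1/b loop
    p
  }"
}

echo "condition /./ :"
run_loop '.'          # condition is always true, so the loop never ends by itself
echo "condition /^[^2]*\$/ :"
run_loop '^[^2]*$'    # false once a 2 is in the pattern space, so p runs
```

Only the second call produces output (the lines `1` and `2`); the first prints nothing.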

ta0kira · 09-08-2011, 10:39 AM

What about this:

Code:

echo -e '1\n2\n3\n4' | sed -e ':loop;N;/3/!b loop;!d;q'

Kevin Barry

crts · 09-08-2011, 12:12 PM

Quote:

Originally Posted by ta0kira

What about this:

Code:

echo -e '1\n2\n3\n4' | sed -e ':loop;N;/3/!b loop;!d;q'

Kevin Barry

Hi,

can you explain why the marked part should be there?

ta0kira · 09-08-2011, 08:31 PM

Quote:

Originally Posted by crts

Hi,

can you explain why the marked part should be there?

It probably shouldn't be. I think I added it as a replacement for p when I removed -n and didn't really think about it.
Kevin Barry

Reuti · 09-09-2011, 06:08 AM

Quote:

Originally Posted by grail

...
N :- get the next line into the pattern buffer {pattern now holds '1\n2\n3'}

So your loop chews up all your data and by the time you get to 'p' there is nothing left to print
...

Does it really get to the 'p'? I would phrase it more like: if 'N' can’t read anything, execution will be stopped. When I vary the initial script:

Code:

$ echo -e '1\n2\n3\n4\n5' | sed -ne '/1/,/3/{:loop;N;p;/[^3]/b loop;p;}'
1
2
1
2
3
1
2
3
4
1
2
3
4
5

As it reads inside the loop, the outer range specification /1/,/3/ isn’t consulted at all. It would only be used if the {} block exits and a new record needs to be fed into the script.
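If the aim is for the final `p` to run even when the input ends inside the loop, one common approach is to guard both `N` and the branch with `$!` ("not on the last line"), so that `N` is never attempted once the input is exhausted. A minimal sketch reusing the original loop condition; the function name `guarded_loop` is made up for illustration.

```shell
# Hypothetical helper: the original loop, but N and the branch are
# skipped on the last input line, so control falls through to p.
guarded_loop() {
  printf '1\n2\n3\n4\n' | sed -n '/1/{
    :loop
    $!N
    /[^3]/{
      $!b loop
    }
    p
  }'
}

guarded_loop
```

With this guard the loop stops once line 4 has been appended, and `p` prints all four lines.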

jgombos · 09-09-2011, 06:56 AM

Quote:

Originally Posted by Reuti

I would phrase it more like: if 'N' can't read anything, execution will be stopped.

Bingo. That was ultimately the source of my problems (or more accurately: it was the thing preventing me from discovering my problems). Sed croaks when "N" runs out of food. This is mentioned on page 110 of "sed & awk" (O'Reilly, 2nd ed.). I suspect it's a crude protection from infinite loops. It's too bad this is not treated as an error condition: the exit status is normal.
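The normal exit status is easy to confirm. A small sketch (the function name `status_when_n_hits_eof` is hypothetical) that runs the never-ending loop from earlier in the thread and reports `$?`:

```shell
# Hypothetical helper: run the loop whose condition is always true,
# then report the exit status sed returned.
status_when_n_hits_eof() {
  printf '1\n2\n\n' | sed -n '/1/{
    :loop
    N
    /./b loop
    p
  }'
  echo "exit status: $?"
}

status_when_n_hits_eof
```

The only line printed is `exit status: 0`: sed produces no output and reports success.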

grail · 09-09-2011, 07:26 AM

It is treated as a normal exit, as it reached the end of the data but none of the expression's terms were met to result in output. This is not an error.

crts · 09-09-2011, 10:50 AM

Quote:

Originally Posted by grail

It is treated as a normal exit, as it reached the end of the data but none of the expression's terms were met to result in output. This is not an error.

That is very true. Just because the last line has been read it does not mean that you always want to print it.

If the further execution of your script depends on whether sed has made some specific modification, then you can set the exit status manually; this requires GNU sed, though.

Building on Kevin's example:

Code:

$ echo -e '1\n2\n3\n4' | sed -e ':loop;N;${/3/q;Q 99};/3/!b loop;q'
1
2
3
$ echo $?
0
$ echo -e '1\n2\n3' | sed -e ':loop;N;${/3/q;Q 99};/3/!b loop;q'
1
2
3
$ echo $?
0
$ echo -e '1\n2\n4' | sed -e ':loop;N;${/3/q;Q 99};/3/!b loop;q'
$ echo $?
99
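A calling script could then branch on that status rather than inspect the output. The following is only a sketch and, like the command above, assumes GNU sed (for `Q` and exit codes); `print_through_3` and `report` are hypothetical names.

```shell
# Hypothetical wrapper around the GNU sed command shown above.
print_through_3() {
  sed -e ':loop;N;${/3/q;Q 99};/3/!b loop;q'
}

# Hypothetical caller: feeds its arguments as lines and branches on
# the exit status; the sed output itself is discarded here.
report() {
  if printf '%s\n' "$@" | print_through_3 >/dev/null; then
    echo "found a 3"
  else
    echo "no 3 found (sed exited with status $?)"
  fi
}

report 1 2 3 4
report 1 2 4
```

The first call reports `found a 3`; the second reports `no 3 found (sed exited with status 99)`.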

jgombos · 09-10-2011, 12:33 AM

Quote:

Originally Posted by grail

It is treated as a normal exit, as it reached the end of the data but none of the expression's terms were met to result in output. This is not an error.

Correct, it's not an error -- that's actually the problem.

It's indeed a poor language choice to take a construct designed for data manipulation, and also give it flow control. It would be like creating a C "scanf" function that behaves like a goto statement in some obscure circumstances. And worse, these are circumstances that arise from inadvertent programming.

In fact having been burnt by this strange behavior and now knowing full well how "N" behaves, I would never deliberately use the N command for flow control. It would be an abuse; it would confuse other readers with poorly constructed code.

Sure, goto statements are appropriate for a quick and dirty language like sed, but this is effectively a goto nowhere -- that is, a goto without a destination label, and the goto statement itself is hiding within a construct that's used to move data.