SED interval specification

wakatana · 10-25-2009, 05:37 AM

Hi all

input

Code:

BEGIN
some other text
containing words
such as BEGIN
and END
END
some other text
containing words
such as BEGIN
and END
BEGIN
some other text
containing words
such as BEGIN
and END
END

desired output

Code:

 
BEGIN
some other text
containing words
such as START
and EXIT
END
some other text
containing words
such as BEGIN
and END
BEGIN
some other text
containing words
such as START
and EXIT
END

In short: replace every BEGIN with START and every END with EXIT, but only inside the BEGIN ... END blocks, leaving the boundary lines themselves unchanged.

I tried this:

Code:

cat begendf | sed '/^BEGIN/,/^END/ s/BEGIN/START/; s/END/EXIT/'
START
some other text
containing words
such as START
and EXIT
EXIT
some other text
containing words
such as BEGIN
and EXIT
START
some other text
containing words
such as START
and EXIT
EXIT

But as you can see, this also replaces the boundaries (BEGIN and END become START and EXIT), and outside the boundaries it leaves BEGIN unchanged but still replaces END.
I also tried some weird tricks using new spaces, but none of them worked. Thanks a lot.

druuna · 10-25-2009, 06:22 AM

Hi,

In s/BEGIN/START/; s/END/EXIT/ only the first command is tied to the range. It matches every BEGIN in the range, including the boundary line itself, while s/END/EXIT/ runs on all (!!) lines of the file.

The second part (everything after the ;) is not part of the range specified in the first part. Try this:

sed -e '/^BEGIN/,/^END/s/ BEGIN/ START/' -e '/^BEGIN/,/^END/s/ END/ EXIT/' infile

Two sed actions (made possible by the -e options): the first one for the BEGIN/START part, the second for the END/EXIT part. Both are bound to the same range.

You could also rewrite it to:

sed '/^BEGIN/,/^END/{s/ BEGIN/ START/;s/ END/ EXIT/}' infile

Everything between the curly braces ({...}) is bound to the range.
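
To see the difference the braces make, here is a small sketch. The four-line sample input and the function names (sample, unscoped, scoped) are made up for illustration; the sed commands follow the ones above.

```shell
#!/bin/sh
# Hypothetical sample: one short block, then a line outside any block.
sample() {
  printf '%s\n' 'BEGIN' 'an END inside' 'END' 'an END outside'
}

# Without braces only the first s command is tied to the range;
# s/ END/ EXIT/ after the semicolon runs on every input line.
unscoped() {
  sample | sed '/^BEGIN/,/^END/ s/ BEGIN/ START/; s/ END/ EXIT/'
}

# With braces both s commands run only on lines inside the range.
scoped() {
  sample | sed '/^BEGIN/,/^END/{s/ BEGIN/ START/;s/ END/ EXIT/;}'
}

unscoped
echo '---'
scoped
```

The last line comes out as "an EXIT outside" from unscoped and stays "an END outside" from scoped.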

Hope this helps.

wakatana · 10-25-2009, 06:43 AM

druuna, thanks a lot, this really helps.
I have seen lots of sed examples that use the -e option and have also read the man page, but it was still unclear to me what it is for, since sed works without it.

Now I know I should use it when I have multiple actions.
I am wondering about one thing: in the example you posted, you replace

s/ END/ EXIT/
and
s/ BEGIN/ START/

That works because of the whitespace (a space in this case) before the matching string.
But is it possible to somehow exclude the boundaries from the replacement?
So I could rewrite it to

s/END/EXIT/
and
s/BEGIN/START/

and it would still work.

Thanks

vikas027 · 10-25-2009, 06:45 AM

I have done this in a dirty way.

I am calling it dirty because it could have been done with a sed/awk one-liner like druuna's, BUT my knowledge of sed/awk is limited.

Anyway, here is my script. I am assuming your file is named file,
and your desired output will be in file2.

Code:

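# Pass 1: BEGIN -> START, except on lines that start with BEGIN.
# Note: this does not check whether a line is inside a BEGIN ... END block.
> file1
while IFS= read -r i
do
echo "$i" | grep -q '^BEGIN'
if [ $? -eq 1 ]
then
  echo "$i" | grep -q 'BEGIN'
     if [ $? -eq 0 ]
       then
       echo "$i" | sed 's/BEGIN/START/' >> file1
     else
       echo "$i" >> file1
     fi
else
echo "$i" >> file1
fi
done < file


# Pass 2: END -> EXIT, except on lines that start with END.
> file2
while IFS= read -r i
do
echo "$i" | grep -q '^END'
if [ $? -eq 1 ]
then
  echo "$i" | grep -q 'END'
     if [ $? -eq 0 ]
       then
       echo "$i" | sed 's/END/EXIT/' >> file2
     else
       echo "$i" >> file2
     fi
else
echo "$i" >> file2
fi
done < file1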

Hope it helps.

vonbiber · 10-25-2009, 06:47 AM

Quote:

Originally Posted by wakatana

input

Code:

BEGIN
some other text
containing words
such as BEGIN
and END
END
some other text
containing words
such as BEGIN
and END
BEGIN
some other text
containing words
such as BEGIN
and END
END

Another possible solution would be to
1. temporarily replace 'BEGIN' and 'END' that start a line
by a unique string
2. replace all other occurrences of 'BEGIN' and 'END'
3. restore the original values of the replacements in 1.
e.g.

Code:

<your input> | sed 's?^BEGIN$?__@@@__?' | sed 's?^END$?__///__?' | \
    sed 's?BEGIN?START?g' | sed 's?END?EXIT?g' | \
    sed 's?__@@@__?BEGIN?' | sed 's?__///__?END?'
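
The same three steps can also run in a single sed process instead of six. This is only a sketch: the function name swap_words is made up, the second placeholder is changed to __]]]__ so that / can be used as the delimiter, and both placeholder strings are assumed never to occur in the input. Like the pipeline above, it does not limit the replacement to the BEGIN ... END blocks.

```shell
#!/bin/sh
# Steps 1-3 as one sed script (placeholders assumed absent from the input).
swap_words() {
  sed -e 's/^BEGIN$/__@@@__/' -e 's/^END$/__]]]__/' \
      -e 's/BEGIN/START/g'    -e 's/END/EXIT/g' \
      -e 's/__@@@__/BEGIN/'   -e 's/__]]]__/END/'
}

printf '%s\n' 'BEGIN' 'such as BEGIN' 'and END' 'END' | swap_words
```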

vikas027 · 10-25-2009, 06:52 AM

Quote:

Originally Posted by vonbiber

Another possible solution would be to
1. temporarily replace 'BEGIN' and 'END' that start a line
by a unique string
2. replace all other occurrences of 'BEGIN' and 'END'
3. restore the original values of the replacements in 1.
e.g.

Code:

<your input> | sed 's?^BEGIN$?__@@@__?' | sed 's?^END$?__///__?' | \
    sed 's?BEGIN?START?g' | sed 's?END?EXIT?g' | \
    sed 's?__@@@__?BEGIN?' | sed 's?__///__?END?'

This is a nice solution, nice thinking. It did not occur to me; I coded it all the long way.

druuna · 10-25-2009, 06:59 AM

Quote:

Originally Posted by wakatana

I am wondering about one thing: in the example you posted, you replace

s/ END/ EXIT/
and
s/ BEGIN/ START/

That works because of the whitespace (a space in this case) before the matching string.
But is it possible to somehow exclude the boundaries from the replacement?
So I could rewrite it to

s/END/EXIT/
and
s/BEGIN/START/

and it would still work.

No, not without doing extra work (vonbiber's idea comes to mind).

I'm not sure why you need to get rid of the space, it seems to be the simplest solution.
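
For reference, the extra work can stay inside one sed call. A minimal sketch, assuming the boundary lines consist of exactly BEGIN or END and nothing else (the function name inner_only and the sample input are made up for illustration):

```shell
#!/bin/sh
# Inside the range, lines that are exactly BEGIN or END branch to the end
# of the script (b) and are printed untouched; every other line in the
# range gets both substitutions, with g to catch repeats on one line.
inner_only() {
  sed '/^BEGIN$/,/^END$/{
    /^BEGIN$/b
    /^END$/b
    s/BEGIN/START/g
    s/END/EXIT/g
  }'
}

printf '%s\n' 'BEGIN' 'such as BEGIN' 'END and END' 'END' 'outside END' | inner_only
```

Because the range ends only on a line that is exactly END, a line that merely starts with END does not close the block early.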

wakatana · 10-25-2009, 07:38 AM

Hi vikas027,
Thank you for your reply. I am staring at your script (because I am a beginner in bash scripting) and cannot figure out what $? is in [ $? -eq 1 ] and [ $? -eq 0 ].
I know that -eq 1 (0) is a condition testing for equality with 1 (0), but what is $? ?

vonbiber: Interesting solution, thanks. I had to change the code a little to do what I want, but it preserves your idea:

Code:

cat begendf | sed 's?^BEGIN$?__@@@__?' | sed 's?^END$?__]]]__?' |\
sed '/__@@@__/,/__]]]__/{s/BEGIN/START/;s/END/EXIT/}' |\
sed 's/__@@@__/BEGIN/g; s/__]]]__/END/'

wakatana · 10-25-2009, 07:42 AM

Quote:

Originally Posted by druuna

I'm not sure why you need to get rid of the space, it seems to be the simplest solution.

No, I won't; I asked just out of interest.

And one last question, also just out of interest:

Can I use another "delimiter" (or separator, or whatever the correct name is) in the interval?

/__@@@__/,/__]]]__/
I tried
?__@@@__?,?__]]]__?
but it did not work.
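
In sed, an address written with a delimiter other than / has to start with a backslash, as in \%regex%; the s command takes any delimiter directly, which is why s?...?...? worked. A minimal sketch, where % is an arbitrary choice of delimiter and the function name custom_delim and the sample input are made up:

```shell
#!/bin/sh
# In an address, a delimiter other than / must be introduced by a backslash:
# \%regex% means the same as /regex/.
custom_delim() {
  sed '\%__@@@__%,\%__]]]__%{s/BEGIN/START/;}'
}

printf '%s\n' 'BEGIN a' '__@@@__' 'BEGIN b' '__]]]__' 'BEGIN c' | custom_delim
```

Only the line between the two placeholder lines is changed, to "START b".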

vikas027 · 10-25-2009, 07:49 AM

Quote:

Originally Posted by wakatana

Hi vikas027,
Thank you for your reply. I am staring at your script (because I am a beginner in bash scripting) and cannot figure out what $? is in [ $? -eq 1 ] and [ $? -eq 0 ].
I know that -eq 1 (0) is a condition testing for equality with 1 (0), but what is $? ?

Hi,

Well, $? holds the exit status of the last command you have run. If the command was successful it returns 0; otherwise it returns a non-zero value (often 1).

For example:

Code:

-sh-3.00$ rm abc
rm: cannot remove `abc': No such file or directory
-sh-3.00$ echo $?
1

Now,

Code:

-sh-3.00$ touch abc
-sh-3.00$ rm abc
-sh-3.00$ echo $?
0
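
The same applies to the grep calls in the script: grep exits with 0 when it finds a match and 1 when it does not. A small sketch (grep_status is a made-up helper name):

```shell
#!/bin/sh
# Hypothetical helper: print the exit status grep returns for a line ($1)
# and a pattern ($2). grep -q prints nothing; only the status matters.
grep_status() {
  printf '%s\n' "$1" | grep -q -- "$2"
  echo "$?"
}

grep_status 'such as BEGIN' '^BEGIN'   # prints 1: BEGIN is not at the line start
grep_status 'such as BEGIN' 'BEGIN'    # prints 0: BEGIN occurs in the line
```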

Hope, this helps.

ghostdog74 · 10-25-2009, 08:57 AM

Using this sample input file, modified slightly for the first BEGIN,END block

Code:

$ more file
BEGIN
some other text
containing words
such as BEGIN
END and END END
END
some other text
containing words END
such as BEGIN
and END
BEGIN
some other text
containing words
such as BEGIN
and END
END

awk code:

Code:

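awk '
# A line that is exactly END closes the block (printed unchanged by f==0 below).
/^END$/{f=0}
# A line that is exactly BEGIN opens the block; print it as-is and move on.
/^BEGIN$/{f=1;print;next}
# Inside a block: replace every occurrence, then print.
f{
 gsub("BEGIN","START")
 gsub("END","EXIT")
 print $0
}
# Outside a block, and on the closing END line: print unchanged.
f==0' file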

output

Code:

$ ./shell.sh
BEGIN
some other text
containing words
such as START
EXIT and EXIT EXIT
END
some other text
containing words END
such as BEGIN
and END
BEGIN
some other text
containing words
such as START
and EXIT
END

NB: ALL of the sed solutions provided fail on this sample input.

ghostdog74 · 10-25-2009, 09:01 AM

Quote:

Originally Posted by vikas027

This is a nice solution, nice thinking.

No, it's not. It's ugly, unreadable and full of unnecessary chaining. Use sed for simple substitution.

druuna · 10-25-2009, 09:23 AM

@ghostdog74: You are not using the original posted input. Other solutions are indeed needed if the input is different, but that wasn't the OP's original question.......

Quote:

No, it's not. It's ugly, unreadable and full of unnecessary chaining. Use sed for simple substitution.

And what's wrong with the small and elegant sed solution (a one-liner, not a script) I posted in reply #2???

I'm talking about this one:

sed '/^BEGIN/,/^END/{s/ BEGIN/ START/;s/ END/ EXIT/}' infile

vikas027 · 10-25-2009, 09:23 AM

Quote:

Originally Posted by ghostdog74

No, it's not. It's ugly, unreadable and full of unnecessary chaining. Use sed for simple substitution.

Hi,

I just meant these steps, not the code. I could not even understand the code.

Yes, this can be done with sed in just three lines, as far as I know.

Code:

1. temporarily replace 'BEGIN' and 'END' that start a line
   by a unique string
2. replace all other occurrences of 'BEGIN' and 'END'
3. restore the original values of the replacements in 1.

ghostdog74 · 10-25-2009, 09:43 AM

Quote:

Originally Posted by druuna

@ghostdog74: You are not using the original posted input. Other solutions are indeed needed if the input is different, but that wasn't the OP's original question.......

Please look at the input again. It says "some other text containing words such as BEGIN and END". There may be other ENDs and BEGINs at different places inside the BEGIN,END blocks, besides the ones the OP posted.

Quote:

And what's wrong with the small and elegant sed solution (a one-liner, not a script) I posted in reply #2???

I'm talking about this one:

sed '/^BEGIN/,/^END/{s/ BEGIN/ START/;s/ END/ EXIT/}' infile

I am not talking about your code. Please see my reply to Vikas again. The sed solution he quoted was not provided by you.

Also, I did not say it's wrong to use a small, elegant sed solution. I did say to use sed for simple substitution. For anything beyond that, like the code in post #5, the code gets ugly and hard to decipher, especially if anything goes wrong.