LinuxQuestions.org


wakatana 10-25-2009 05:37 AM

SED interval specification
 
Hi all

input
Code:

BEGIN
some other text
containing words
such as BEGIN
and END
END
some other text
containing words
such as BEGIN
and END
BEGIN
some other text
containing words
such as BEGIN
and END
END


desired output
Code:


BEGIN
some other text
containing words
such as START
and EXIT
END
some other text
containing words
such as BEGIN
and END
BEGIN
some other text
containing words
such as START
and EXIT
END

In short: replace every BEGIN with START and every END with EXIT, but only inside the BEGIN ... END blocks, leaving the boundary lines themselves unchanged.

I tried this:
Code:

cat begendf | sed '/^BEGIN/,/^END/ s/BEGIN/START/; s/END/EXIT/'
START
some other text
containing words
such as START
and EXIT
EXIT
some other text
containing words
such as BEGIN
and EXIT
START
some other text
containing words
such as START
and EXIT
EXIT

But as you can see, it also replaced the boundaries (BEGIN and END became START and EXIT), and outside the boundaries it left only BEGIN unchanged, not END.
I also tried some weird tricks using new spaces, but none of them worked. Thanks a lot.

druuna 10-25-2009 06:22 AM

Hi,

The patterns in s/BEGIN/START/; s/END/EXIT/ match every (!!) BEGIN and END, including the ones on the boundary lines of the range.

The second part (everything after the ;) is not part of the range specified in the first part, so it runs on every line. Try this:

sed -e '/^BEGIN/,/^END/s/ BEGIN/ START/' -e '/^BEGIN/,/^END/s/ END/ EXIT/' infile

Two sed actions (made possible by the -e options): the first for the BEGIN/START part, the second for the END/EXIT part. Both are bound to the same range.

You could also rewrite it to:

sed '/^BEGIN/,/^END/{s/ BEGIN/ START/;s/ END/ EXIT/}' infile

Everything between the curly brackets ({...}) is bound to the range.
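
To make the difference concrete, here is a small sketch. The four-line sample and the variable names are made up for illustration. Without braces only the first s command is tied to the range; with braces both are.

```shell
# Hypothetical four-line sample: one block plus one line outside it.
input='BEGIN
in BEGIN and END
END
out BEGIN and END'

# Without braces: the range applies only to the first s command,
# so s/ END/ EXIT/ also runs on the line outside the block.
ungrouped=$(printf '%s\n' "$input" |
  sed '/^BEGIN/,/^END/ s/ BEGIN/ START/; s/ END/ EXIT/')

# With braces: both s commands run only inside the range.
grouped=$(printf '%s\n' "$input" |
  sed '/^BEGIN/,/^END/{s/ BEGIN/ START/;s/ END/ EXIT/;}')

printf '%s\n' "$ungrouped" | tail -n 1   # out BEGIN and EXIT
printf '%s\n' "$grouped" | tail -n 1     # out BEGIN and END
```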

Hope this helps.

wakatana 10-25-2009 06:43 AM

druuna, thanks a lot, this really helps.
I have seen lots of sed examples that use the -e option and also read the man page, but it was still unclear to me what it is for, since sed works without it :)
Now I know I should use it if I have multiple actions.
I am wondering about one thing: in the example you posted, you replace

s/ END/ EXIT/
and
s/ BEGIN/ START/

That works because of the whitespace (a space in this case) before the matching string.
But is it possible to somehow exclude the boundaries from the replacement?
Then I could rewrite it to


s/END/EXIT/
and
s/BEGIN/START/


and it would still work.

Thanks

vikas027 10-25-2009 06:45 AM

I have done this in a dirty way :D. I call it dirty because it could have been done with a sed/awk one-liner, as druuna has done, BUT my knowledge of sed/awk is limited.

Anyway, here is my script. I am assuming your file is named file.
Your desired output will be in file2.

Code:

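# Note: both passes look at every line of the file, not only at lines
# inside a BEGIN/END block.

# Pass 1: replace BEGIN with START, except on lines that start with BEGIN.
> file1
while IFS= read -r i
do
  echo "$i" | grep -q '^BEGIN'
  if [ $? -eq 1 ]
  then
    echo "$i" | grep -q 'BEGIN'
    if [ $? -eq 0 ]
    then
      echo "$i" | sed 's/BEGIN/START/' >> file1
    else
      echo "$i" >> file1
    fi
  else
    echo "$i" >> file1
  fi
done < file

# Pass 2: replace END with EXIT, except on lines that start with END.
> file2
while IFS= read -r i
do
  echo "$i" | grep -q '^END'
  if [ $? -eq 1 ]
  then
    echo "$i" | grep -q 'END'
    if [ $? -eq 0 ]
    then
      echo "$i" | sed 's/END/EXIT/' >> file2
    else
      echo "$i" >> file2
    fi
  else
    echo "$i" >> file2
  fi
done < file1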

Hope it helps.

vonbiber 10-25-2009 06:47 AM

Quote:

Originally Posted by wakatana (Post 3731693)
input
Code:

BEGIN
some other text
containing words
such as BEGIN
and END
END
some other text
containing words
such as BEGIN
and END
BEGIN
some other text
containing words
such as BEGIN
and END
END


Another possible solution would be to
1. temporarily replace 'BEGIN' and 'END' that start a line
with a unique string
2. replace all other occurrences of 'BEGIN' and 'END'
3. restore the original values of the replacements in 1.
e.g.
Code:

<your input> | sed 's?^BEGIN$?__@@@__?' | sed 's?^END$?__///__?' | \
    sed 's?BEGIN?START?g' | sed 's?END?EXIT?g' | \
    sed 's?__@@@__?BEGIN?' | sed 's?__///__?END?'


vikas027 10-25-2009 06:52 AM

Quote:

Originally Posted by vonbiber (Post 3731725)
Another possible solution would be to
1. temporarily replace 'BEGIN' and 'END' that start a line
by a unique string
2. replace all other occurrences of 'BEGIN' and 'END'
3. restore the original values of the replacements in 1.
e.g.
Code:

<your input> | sed 's?^BEGIN$?__@@@__?' | sed 's?^END$?__///__?' | \
    sed 's?BEGIN?START?g' | sed 's?END?EXIT?g' | \
    sed 's?__@@@__?BEGIN?' | sed 's?__///__?END?'


This is a nice solution, nice thinking. It did not occur to me. I coded it all the long way :banghead:

druuna 10-25-2009 06:59 AM

Quote:

Originally Posted by wakatana (Post 3731722)
I am wondering about one thing: in the example you posted, you replace

s/ END/ EXIT/
and
s/ BEGIN/ START/

That works because of the whitespace (a space in this case) before the matching string.
But is it possible to somehow exclude the boundaries from the replacement?
Then I could rewrite it to


s/END/EXIT/
and
s/BEGIN/START/


and it would still work.

No, not without doing extra work (vonbiber's idea comes to mind).

I'm not sure why you need to get rid of the space; it seems to be the simplest solution.
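
For what it is worth, here is a sketch of one form that extra work could take inside sed itself, assuming the boundary lines contain nothing but BEGIN or END: skip them with the b command (branch to the end of the script) before the substitutions run. The function name is invented for illustration.

```shell
# Hypothetical helper: reads text on stdin, writes the result to stdout.
replace_inside_blocks() {
  sed '/^BEGIN$/,/^END$/{
    /^BEGIN$/b
    /^END$/b
    s/BEGIN/START/
    s/END/EXIT/
  }'
}

# One block followed by two lines outside it.
printf '%s\n' BEGIN 'such as BEGIN' 'and END' END 'such as BEGIN' 'and END' |
  replace_inside_blocks
```

Because the boundary lines never reach the s commands, the patterns no longer need the leading space.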

wakatana 10-25-2009 07:38 AM

Hi vikas027,
Thank you for your reply. I am staring at your script (I am a beginner in bash scripting) and cannot figure out what $? is in [ $? -eq 1 ] and [ $? -eq 0 ].
I know that -eq 1 (or 0) tests for equality with 1 (or 0), but what is $? :) ?

vonbiber: Interesting solution, thanks. I had to change the code a little to do what I want, but it preserves your idea ;)

Code:

cat begendf | sed 's?^BEGIN$?__@@@__?' | sed 's?^END$?__]]]__?' |\
sed '/__@@@__/,/__]]]__/{s/BEGIN/START/;s/END/EXIT/}' |\
sed 's/__@@@__/BEGIN/g; s/__]]]__/END/'


wakatana 10-25-2009 07:42 AM

Quote:

Originally Posted by druuna (Post 3731736)
I'm not sure why you need to get rid of the space, it seems to be the simplest solution.

No, I don't need to; I was just curious :)
One last question, also out of curiosity :)

Can I use another "delimiter" (or separator, or whatever the correct name is) in the range address?

/__@@@__/,/__]]]__/
I tried
?__@@@__?,?__]]]__?
but it did not work.
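
For the record, sed does accept another delimiter in an address, but the opening one has to be escaped with a backslash: \?regex? instead of /regex/. A minimal sketch with made-up sample lines:

```shell
# Print only the lines from the first marker through the second one.
# The address delimiter is ? here, introduced by a leading backslash.
selected=$(printf '%s\n' before '__@@@__' inside '__]]]__' after |
  sed -n '\?__@@@__?,\?__]]]__?p')

printf '%s\n' "$selected"
```

The s command needs no such backslash, which is why s?^BEGIN$?__@@@__? works as written.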

vikas027 10-25-2009 07:49 AM

Quote:

Originally Posted by wakatana (Post 3731766)
Hi vikas027,
Thank you for your reply. I am staring at your script (I am a beginner in bash scripting) and cannot figure out what $? is in [ $? -eq 1 ] and [ $? -eq 0 ].
I know that -eq 1 (or 0) tests for equality with 1 (or 0), but what is $? :) ?

Hi,

Well, $? holds the exit status of the last command you ran. If the command was successful it is 0; otherwise it is a non-zero value (often 1).

For example:

Code:

-sh-3.00$ rm abc
rm: cannot remove `abc': No such file or directory
-sh-3.00$ echo $?
1

Now,
Code:

-sh-3.00$ touch abc
-sh-3.00$ rm abc
-sh-3.00$ echo $?
0

Hope this helps.
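
The same kind of check can also be written without $? at all, because if tests the exit status of the command it runs. A sketch with a made-up sample line and variable names:

```shell
line='such as BEGIN'

# Explicit form: capture the exit status of the grep pipeline.
# Using || keeps the script going even if it runs under set -e.
status=0
printf '%s\n' "$line" | grep -q '^BEGIN' || status=$?
# status is 1 here: the line does not start with BEGIN

# Shorter form: if uses the exit status of the pipeline directly.
if printf '%s\n' "$line" | grep -q 'BEGIN'; then
  found=yes          # exit status 0: BEGIN occurs somewhere in the line
else
  found=no
fi

echo "status=$status found=$found"
```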

ghostdog74 10-25-2009 08:57 AM

Using this sample input file, modified slightly for the first BEGIN,END block
Code:

$ more file
BEGIN
some other text
containing words
such as BEGIN
END and END END
END
some other text
containing words END
such as BEGIN
and END
BEGIN
some other text
containing words
such as BEGIN
and END
END

awk code:
Code:

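awk '
# boundary line END: leave block mode
/^END$/ { f = 0 }
# boundary line BEGIN: print it as is and enter block mode
/^BEGIN$/ { f = 1; print; next }
# inside a block: replace every occurrence, then print
f {
  gsub("BEGIN", "START")
  gsub("END", "EXIT")
  print $0
}
# outside a block, or on the END boundary: print unchanged
f == 0' file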

output
Code:

$ ./shell.sh
BEGIN
some other text
containing words
such as START
EXIT and EXIT EXIT
END
some other text
containing words END
such as BEGIN
and END
BEGIN
some other text
containing words
such as START
and EXIT
END

NB: ALL the sed solutions provided fail on this sample input.
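
The range-based sed commands trip on this input mainly because /^END/ also matches the line "END and END END", so the range closes one line early, and because s without the g flag replaces only the first match on a line. The sketch below puts the range form used earlier in the thread next to a variant that anchors both boundaries with $ and uses g. The shortened sample and the variable names are made up for illustration.

```shell
# Shortened sample: one block with a tricky line, then one line outside.
input='BEGIN
such as BEGIN
END and END END
END
and END'

# Loose form: /^END/ matches "END and END END" and ends the range there.
loose=$(printf '%s\n' "$input" |
  sed '/^BEGIN/,/^END/{s/ BEGIN/ START/;s/ END/ EXIT/;}')

# Strict form: boundaries must be the whole line, boundary lines are
# skipped, and g replaces every occurrence on the remaining lines.
strict=$(printf '%s\n' "$input" | sed '/^BEGIN$/,/^END$/{
  /^BEGIN$/!{
    /^END$/!{
      s/BEGIN/START/g
      s/END/EXIT/g
    }
  }
}')

printf '%s\n' "$loose" | sed -n 3p    # END and EXIT END
printf '%s\n' "$strict" | sed -n 3p   # EXIT and EXIT EXIT
```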

ghostdog74 10-25-2009 09:01 AM

Quote:

Originally Posted by vikas027 (Post 3731730)
This is a nice solution, nice thinking.


No, it's not. It's ugly, unreadable and full of unnecessary chaining. Use sed for simple substitution.

druuna 10-25-2009 09:23 AM

@ghostdog74: You are not using the originally posted input. Other solutions are indeed needed if the input is different, but that wasn't the OP's original question...

Quote:

No, it's not. It's ugly, unreadable and full of unnecessary chaining. Use sed for simple substitution.

And what's wrong with the small and elegant sed solution (a one-liner, not a script) I posted in reply #2?

I'm talking about this one:

sed '/^BEGIN/,/^END/{s/ BEGIN/ START/;s/ END/ EXIT/}' infile

vikas027 10-25-2009 09:23 AM

Quote:

Originally Posted by ghostdog74 (Post 3731815)
No, it's not. It's ugly, unreadable and full of unnecessary chaining. Use sed for simple substitution.

Hi,

I only meant the steps, not the code. I could not even understand the code. ;)
Yes, as far as I know this can be done with sed in just three lines.
Code:

1. temporarily replace 'BEGIN' and 'END' that start a line
  by a unique string
2. replace all other occurrences of 'BEGIN' and 'END'
3. restore the original values of the replacements in 1.


ghostdog74 10-25-2009 09:43 AM

Quote:

Originally Posted by druuna (Post 3731830)
@ghostdog74: You are not using the original posted input. Other solutions are indeed needed if the input is different, but that wasn't the OP's original question.......

Please look at the input again. It says "some other text containing words such as BEGIN and END". There may be other ENDs and BEGINs at different places inside the BEGIN,END blocks, besides the ones the OP posted.


Quote:

And what's wrong with the small and elegant sed solution (a one-liner, not a script) I posted in reply #2???

I'm talking about this one:

sed '/^BEGIN/,/^END/{s/ BEGIN/ START/;s/ END/ EXIT/}' infile

I am not talking about your code. Please see my reply to vikas027 again. The sed solution he quoted was not provided by you.

Also, I did not say it is wrong to use a small, elegant sed solution. I did say to use sed for simple substitution. For anything beyond that, like the code in post #5, the code gets ugly and hard to decipher, especially when something goes wrong.


All times are GMT -5.