SED, or GREP Command

konsolebox · 04-16-2013, 12:20 PM

With millgates' version, though, a blank line is printed at the top of the output, because H always puts a newline in front of whatever it appends to the hold space, even when the hold space is empty. I suggest doing something like this:

Code:

sed -n '/\/tmp/,/^$/{ /\/tmp/{ h; b; }; H; }; /^$/{x; /String1/p}'
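To see the effect in isolation, here is a minimal sketch. The three-line input and the variable names are made up for the illustration; the scripts are written one command per line so they should run on any POSIX sed, not just GNU sed. H on an empty hold space produces a leading newline, while starting with h does not:

```shell
# H on every line: the hold space starts empty, so the first H stores "\none".
with_H=$(printf 'one\ntwo\nthree\n' | sed -n '
H
$ {
    x
    p
}')

# h on the first line, H on the rest: nothing precedes "one".
with_h=$(printf 'one\ntwo\nthree\n' | sed -n '
1 {
    h
    b
}
H
$ {
    x
    p
}')

# Brackets make the leading newline visible.
printf '[%s]\n' "$with_H"
printf '[%s]\n' "$with_h"
```

The first printf shows the opening bracket alone on its own line, the second shows it directly in front of "one".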

millgates · 04-16-2013, 12:25 PM

Code:

sed -n '
    /\/tmp/,/^$/ H   # for all lines between /tmp and an empty line,
                     # append current line to hold buffer
    /^$/ {           # if the line is empty (end of each "/tmp" -- "empty line" block)
        x;           # swap contents of hold buffer and pattern buffer
                     # this will, unfortunately, put the pattern buffer into hold space
                     # and produce an extra blank line
        /String1/p   # if the text that we've just put from hold space to pattern space
                     # contains String1, print it
    }
' <infile

Anyway, here is an awk solution. Not really complicated, not super-short either. Perhaps someone can find a better one:

Code:

awk 'BEGIN{RS="/tmp|\n\n"} /String1/ && RT~"\n" && $0="/tmp"$0' <infile

millgates · 04-16-2013, 12:32 PM

Quote:

Originally Posted by konsolebox

With millgates' version, though, a blank line is printed at the top of the output, because H always puts a newline in front of whatever it appends to the hold space, even when the hold space is empty. I suggest doing something like this:

Code:

sed -n '/\/tmp/,/^$/{ /\/tmp/{ h; b; }; H; }; /^$/{x; /String1/p}'

Good point. That's something I was trying to get rid of, too. Good solution with the {h; b}. Why didn't I think of that?

danielbmartin · 04-16-2013, 12:37 PM

Quote:

Originally Posted by millgates

... here is an awk solution. Not really complicated, not super-short either.

Code:

awk 'BEGIN{RS="/tmp|\n\n"} /String1/ && RT~"\n" && $0="/tmp"$0' <infile

Bravo!

Minor nitpick: OP wanted to keep the blank line which marks the end of each segment.
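One way to keep that blank line is awk's paragraph mode. This is only a sketch under assumptions: every segment starts with /tmp right after a blank line (or at the top of the file) and ends at the next blank line. The sample text and the helper name print_segment are made up for the illustration. Note that paragraph mode treats a run of several blank lines as a single separator.

```shell
# Hypothetical sample: two /tmp segments surrounded by unrelated lines.
sample='junk

/tmp
a
String1
b

junk

/tmp
c
String2
d

junk'

# Hypothetical helper: print every /tmp segment that matches the pattern in $1.
# RS="" reads blank-line-separated paragraphs; ORS="\n\n" puts the blank line
# back after each printed segment.
print_segment() {
    awk -v pat="$1" 'BEGIN { RS = ""; ORS = "\n\n" } /^\/tmp/ && $0 ~ pat'
}

printf '%s\n' "$sample" | print_segment String1
```

This uses only POSIX awk features, so it does not depend on gawk's RT variable.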

Daniel B. Martin

konsolebox · 04-16-2013, 12:41 PM

@danielbmartin If you would accept awk, you should have said so, as it's a lot easier.

But anyway, thanks to your post I was able to challenge myself into hacking sed, and that's the first time I was able to make use of the hold space.

The change to the sed command was about the newline inserted at the beginning, not the one at the end, just in case.

danielbmartin · 04-16-2013, 12:51 PM

Quote:

Originally Posted by konsolebox

@danielbmartin If you would accept awk, you should have said so, as it's a lot easier.

Sure, I'll accept any working solution. I pounded the keyboard to build an awk solution, succeeded, and was about to post it when I read the post by millgates. His awk is far simpler than my if-then-else kludge. Did we all learn something? If so, let's break for lunch!

Daniel B. Martin

cortman · 04-16-2013, 12:54 PM

Quote:

Originally Posted by edwardcode

I tried that, but that does not search for string1. Also, all of the lines start with /tmp, so that would bring up the entire file.

P.S. The proper syntax for the command you want to run is:

sed -n '/\/tmp/,/^$/p' file_name

You need a comma, not a period.

Doh! That was a typo. I thought it worked for me when I used the comma.
Misinterpreted your aim, sorry. Looks like the others have figured it out.

danielbmartin · 04-17-2013, 10:43 AM

This problem has been very well solved by several contributors.

As a learning exercise I created a different solution composed of multiple invocations of sed. It worked, and then I attempted to combine all the seds into one, and was unsuccessful. The purpose of this post is to ask "what did I do wrong?"

This is the InFile ...

Code:

bad line 1
bad line 2
bad line 3
bad line 4

/tmp
good line 1
good line 2
good line 3
String1
good line 4
good line 5
good line 6

bad line 5
bad line 6
bad line 7

bad line 1
bad line 2
bad line 3
bad line 4

/tmp
good line 11
good line 12
good line 13
String2
good line 14
good line 15
good line 16

bad line 15
bad line 16
bad line 17

... this is the desired OutFile...

Code:

/tmp
good line 11
good line 12
good line 13
String2
good line 14
good line 15
good line 16

This code works ...

Code:

echo; echo "Method #1 of LQ Member danielbmartin (using only sed)"
rm $OutFile
 sed ':a;N;$!ba;s/\n/~/g' $InFile  \
|sed 's/~\/tmp/\n\/tmp/g'          \
|sed '/String2/!d'                 \
|sed 's/\(.*\)\(~~.*\)/\1~/g'      \
|sed 's/~/\n/g'                    \
>$OutFile            
echo "OutFile ..."; cat $OutFile; echo "End Of File"

... and this derivative version works ...

Code:

echo; echo "Method #2 of LQ Member danielbmartin (using only sed)"
rm $OutFile
 sed ':a;N;$!ba;s/\n/~/g; s/~\/tmp/\n\/tmp/g' $InFile  \
|sed '/String2/!d; s/\(.*\)\(~~.*\)/\1~/g; s/~/\n/g' >$OutFile            
echo "OutFile ..."; cat $OutFile; echo "End Of File"

... but this derivative version fails!

Code:

echo; echo "Method #3 of LQ Member danielbmartin (using only sed)"
rm $OutFile
 sed ':a;N;$!ba;s/\n/~/g; s/~\/tmp/\n\/tmp/g; /String2/!d; s/\(.*\)\(~~.*\)/\1~/g; s/~/\n/g' $InFile >$OutFile            
echo "OutFile ..."; cat $OutFile; echo "End Of File"

Please explain...

Daniel B. Martin

millgates · 04-17-2013, 11:55 AM

Let's try a slightly minimized infile:

Code:

b
b

t
g
s
g
g

b
b

t
b
b

where "g" represents a "good line", "b" represents a "bad line", "t" is the "/tmp" and "s" is the string we're looking for.

Now, your first example, if I understand correctly:

Code:

 sed ':a;N;$!ba;s/\n/~/g' $InFile # this reads the entire file to pattern space
                                  # and replaces newlines with ~
          |        # produces a single line
          v
b~b~~t~g~s~g~g~~b~b~~t~b~b~
          |
          v
 sed 's/~t/\nt/g'         # this puts a newline before every t in the string
          |               # so now we have one line for every t block
          v
b~b~                      # these lines are read ONE BY ONE by the following
t~g~s~g~g~~b~b~           # sed !!!
t~b~b~
          |
          V
 sed '/s/!d'        # this deletes all lines that don't contain s !!!
          |
          v
t~g~s~g~g~~b~b~
          |
          v
 sed 's/\(.*\)\(~~.*\)/\1~/g' # this cuts off everything from the last ~~ on.
          |                   # please note that this will break if there is more than one
          |                   # block not starting with t after the matching block, because 
          v                   # the first .* is greedy
t~g~s~g~g~
          |
          v
 sed 's/~/\n/g'   # this will replace ~ with \n so we get the block we wanted 
          |       # in the first place:
          v
t
g
s
g
g

Ok, now your last example:

Code:

:a;N;$!ba;s/\n/~/g; s/~t/\nt/g  # same as in the first
     |                          # example, results in following
     v
b~b~
t~g~s~g~g~~b~b~
t~b~b~
     |
     v
 /s/!d;   # this deletes the pattern space if it does not contain s
     |    # but the entire file is now in the pattern space!!
     V    # and it does contain s! So nothing gets deleted
b~b~
t~g~s~g~g~~b~b~
t~b~b~
     |
     v
 s/\(.*\)\(~~.*\)/\1~/g # this removes everything after
     |                            # the LAST ~~
     v
b~b~
t~g~s~g~g~
     |
     v
s/~/\n/g # replace ~ with \n again
     |
     v
b
b

t
g
s
g
g

Not tested, so I might have overlooked something somewhere.
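The key point can be checked on its own with a minimal sketch. The three-line input and the variable names are made up for the illustration, and the N loop is the same as in `:a;N;$!ba`, just split into separate -e expressions for portability:

```shell
# Hypothetical input: only the middle line contains "s".
lines=$(printf 'x\ns\ny')

# Separate sed process: it reads line by line, so /s/!d drops x and y.
piped=$(printf '%s\n' "$lines" | sed '/s/!d')

# Same sed process as the N loop: the pattern space holds all three lines,
# /s/ matches it as a whole, and d deletes nothing.
merged=$(printf '%s\n' "$lines" | sed -e ':a' -e 'N' -e '$!ba' -e '/s/!d')

printf 'piped:\n%s\n' "$piped"
printf 'merged:\n%s\n' "$merged"
```

Here piped ends up holding only the "s" line, while merged still holds all three lines.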

danielbmartin · 04-18-2013, 06:37 AM

Quote:

Originally Posted by millgates

Code:

 /s/!d;   # this deletes the pattern space if it does not contain s
     |    # but the entire file is now in the pattern space!!
     V    # and it does contain s! So nothing gets deleted

Bingo! Thank you for the detailed analysis.
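For a single sed process, one option is to work block by block instead of slurping the whole file, along the lines of konsolebox's hold-space command from earlier in the thread. A sketch, with String2 as the search string; the shortened input text and the helper name extract_with_sed are made up for the illustration:

```shell
# Hypothetical, shortened stand-in for the InFile.
infile_text='bad

/tmp
good 1
String1
good 2

bad

/tmp
good 11
String2
good 12

bad'

# Hypothetical helper: collect each "/tmp ... empty line" block in the hold
# space and print it only if it contains String2.
extract_with_sed() {
    sed -n '
/\/tmp/,/^$/ {
    /\/tmp/ {
        h
        b
    }
    H
}
/^$/ {
    x
    /String2/p
}'
}

printf '%s\n' "$infile_text" | extract_with_sed
```

On this input it prints the second /tmp block followed by one blank line, because /String2/ is tested against one block at a time rather than against the whole file.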

Daniel B. Martin