LinuxQuestions.org - bash - awk, sed, grep, ... advice

https://www.linuxquestions.org/questions/programming-9/bash-awk-sed-grep-advice-664406/

[aix]
hi, i have a file like:

Code:

=====

hello

world

file-id:200

aller

spica

=====

l33t

file-id:500

hax0rz

=====

chun-li

akuma

gouki

file-id:42

ken

ryu

sakura

morrigan

dark-sakura

i'd like to print the lines in between the '====='s.
for example: for file-id:500, i want the output

Code:

=====

l33t

file-id:500

hax0rz

=====

there is a variable number of lines between the '====='s.

is there a simple way of doing it?

How big is the input file? Will it easily fit in memory?

not terribly big:

Code:

 wc -l file.in

    5321

 wc -c file.in

  192741

Code:

#!/bin/bash

awk 'BEGIN {RS="\n=====\n"} /'"$1"'/ {print}'

Call the script with file-id:xyz as the argument.

Dave

thanks, this doesn't quite work for me:

Code:

schneidz@lq> cat ilikejam.lst | ilikejam.ksh 500

file-id:500

any other suggestions? i'll man up on awk so i can figure out how to print everything between record separators, not just the line that matches.

regards,
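
One possible reading of the result above is that this awk honors only the first character of RS, so every line becomes its own record; that is an assumption about the AIX awk in use, not something the thread confirms. A minimal sketch that avoids RS altogether and collects each block by hand follows. The helper name print_block and the sample data are hypothetical, made up for illustration:

```shell
# print_block PATTERN FILE  (hypothetical helper name, not from the thread)
# Collects the lines of each '====='-delimited block and prints the whole
# block when any part of it matches PATTERN. RS is left at its default.
print_block() {
    awk -v pat="$1" '
        function emit() {            # print the saved block if it matches
            if (block ~ pat) printf "%s", block
            block = ""
        }
        /^=====$/ { emit(); block = $0 "\n"; next }   # separator starts a new block
        { block = block $0 "\n" }                      # accumulate body lines
        END { emit() }                                 # last block has no closing separator
    ' "$2"
}

# Sample data in the thread's format (temporary file; contents abbreviated).
sample=$(mktemp)
printf '%s\n' '=====' 'hello' 'file-id:200' '=====' 'l33t' 'file-id:500' \
    'hax0rz' '=====' 'akuma' 'file-id:42' 'ryu' > "$sample"

block_500=$(print_block 'file-id:500' "$sample")
printf '%s\n' "$block_500"
rm -f "$sample"
```

This prints the opening ===== together with the block body but not the closing separator, and the pattern is treated as a regex, so 'file-id:50' would also match file-id:500.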

Here is a quick and dirty solution (probably someone will find a better one), but it seems to work:

Code:

allez@home:~/tmp> cat file.in

=====

hello

world

file-id:200

aller

spica

=====

l33t

file-id:500

hax0rz

=====

chun-li

akuma

gouki

file-id:42

ken

ryu

sakura

morrigan

dark-sakura



allez@home:~/tmp> cat search.sh

#!/bin/sh



if [ "$#" != "2" ];

then

  echo "Usage: $0 <file> <string_to_search>"

  exit 1

fi



search_file="$1"

search_string="$2"

tr "\n" "%" < "$search_file" | sed 's/=%/\n/g' | grep "$search_string" | tr "%" "\n" | grep -v "^=*$"



allez@home:~/tmp> ./search.sh file.in "file-id:500"

l33t

file-id:500

hax0rz



allez@home:~/tmp> ./search.sh file.in "ryu"

chun-li

akuma

gouki

file-id:42

ken

ryu

sakura

morrigan

dark-sakura

Code:

awk 'BEGIN {RS="===="} /file-id:500/' file

Quote:

Originally Posted by ghostdog74 (Post 3255633)

Code:

awk 'BEGIN {RS="===="} /file-id:500/' file

i tested ghostdog's and it works quite nicely, thanks.

the only peculiarity i have is:

Code:

schneidz@lq> h=42

schneidz@lq> echo $h

42

schneidz@lq> awk -v h=$h 'BEGIN {RS="===="} /h/' ilikejam.lst



hello

world

file-id:200

aller

spica





l33t

file-id:500

hax0rz





chun-li

akuma

gouki

file-id:42

ken

ryu

sakura

morrigan

dark-sakura

still fine-tuning. i'll re-post with my progress.
regards,

although slightly more complex, a tweaked version of allez's example allows for variable substitution:

Code:

schneidz@lq> cat allez.ksh

#!/bin/sh



if [ "$#" != "2" ];

then

  echo "Usage: $0 <file> <string_to_search>"

  exit 1

fi



search_file="$1"

search_string="$2"

cat "$1" | tr "\n" "%" | tr "=%" "\n" | grep "$2" | tr "%" "\n" | grep -v "^=*$"





schneidz@lq> allez.ksh ilikejam.lst $h

chun-li

akuma

gouki

file-id:42

ken

ryu

sakura

morrigan

dark-sakura

thanks for everyone's help.
__________
edit:
and then maybe there's this:

Code:

grep -p=====  $h ilikejam.lst

chun-li

akuma

gouki

file-id:42

ken

ryu

sakura

morrigan

dark-sakura

Quote:

Originally Posted by schneidz (Post 3255994)

42
schneidz@lq> awk -v h=$h 'BEGIN {RS="===="} /h/' ilikejam.lst

/h/ means match a "h". It does not mean match the value of the variable h. In order to do that, change /h/ to

Code:

$0 ~ h
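
To see the difference side by side, here is a small sketch that runs both forms against abbreviated sample data. The temporary file and the variable names literal_hits and variable_hits are made up for illustration:

```shell
# Sample data in the thread's format (temporary file; contents abbreviated).
sample=$(mktemp)
printf '%s\n' '=====' 'hello' 'file-id:200' '=====' 'l33t' 'file-id:500' \
    'hax0rz' '=====' 'akuma' 'file-id:42' 'ryu' > "$sample"

h=42

# /h/ is a regex literal: it selects records containing the letter "h",
# whatever value was passed in with -v.
literal_hits=$(awk -v h="$h" 'BEGIN {RS="====="} /h/' "$sample")

# $0 ~ h uses the value of the awk variable h (here 42) as the regex.
variable_hits=$(awk -v h="$h" 'BEGIN {RS="====="} $0 ~ h' "$sample")

printf 'with /h/:\n%s\n' "$literal_hits"
printf 'with $0 ~ h:\n%s\n' "$variable_hits"
rm -f "$sample"
```

Because the value is used as a regex, h=42 also matches any record that merely contains "42"; passing the full "file-id:42" string is tighter.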

When faced with a problem like this, I immediately consider that awk was designed specifically for moderately complex file-processing jobs like this one, and that the Perl programming language was, in one sense, "built on top of awk." Therefore, I have two heavyweight programming tools at my disposal.

It's important to consider solutions that are descriptive of the problem being solved: not merely "something that 'works now.'" It needs to work well in the general case.

The Perl community has a saying: "TMTOWTDI" = "There's More Than One Way To Do It." And that's very true.

Sometimes a good solution to a problem can be discovered by re-stating it, thusly:

Quote:

"Print all of the lines in the source-file that are not equal to '====='."

If that is a valid definition of your problem, then grep could be used to "print all lines in this file which do not match the following regular-expression ('string pattern')."

Quote:

grep -v '^=====$' filename

(The regular-expression uses the "^" and "$" anchors, which mark "start of line" and "end of line" respectively, to match lines that consist of five equals-signs. The "-v" command-line option inverts the test to print all non-matching lines.)
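
A quick sketch of what the inverted match does on abbreviated sample data. The temporary file and the variable name no_separators are hypothetical; the pattern is single-quoted for the shell and written without slashes, since grep takes the regex as given:

```shell
# Sample data in the thread's format (temporary file; contents abbreviated).
sample=$(mktemp)
printf '%s\n' '=====' 'l33t' 'file-id:500' 'hax0rz' '=====' 'akuma' \
    'file-id:42' > "$sample"

# -v inverts the match; ^ and $ anchor the pattern to the whole line,
# so only lines consisting of exactly five equals signs are dropped.
no_separators=$(grep -v '^=====$' "$sample")

printf '%s\n' "$no_separators"
rm -f "$sample"
```

Note that this removes the separator lines from every block; it does not by itself pick out the block for one file-id.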

"TMTOWTDI!" "TMTOWTDI!" More than one 'solution' will 'work!' The one that you're looking for is the one that is easiest to implement and that solves the problem most-completely in the general case.

I saw this thread on the day it was posted and thought a solution using sed alone must exist. Of course, I am still learning awk and need to improve my skills in writing awk programs; awk may well be the better tool here.

But I have a solution using sed.

Code:

sed -n '/500/ {;H;b one;};/=====/ {;h;b;};H;b;:one;n;/=====/ {;g;p;b;};H;b one;' <inputfile>

I was not able to give time to this question until today. I hope it will be useful to the OP and others, of course :cool:.

The first line of the output will be =====.
To keep the code simple, I left it in the output.

Funky Perl:

Code:

perl -e '$/="====="; map { s/^$//; printf "=====%s\n", $_ if (/file-id:500/); } <>' in_file

Hi.

Windowing requirements seem to occur frequently. I often use cgrep for such tasks:

Code:

#!/bin/bash -



# @(#) s1      Demonstrate windowing feature of cgrep.

# http://www.bell-labs.com/project/wwexptools/cgrep/



echo

echo "(Versions displayed with local utility \"version\")"

version >/dev/null 2>&1 && version "=o" $(_eat $0 $1) cgrep

set -o nounset

echo



FILE=data1

pattern=${1-"file-id:500"}



echo " Results:"

cgrep -D -+w '^===' "$pattern" "$FILE"



exit 0

Producing:

Code:

$ ./s1



(Versions displayed with local utility "version")

Linux 2.6.11-x1

GNU bash 2.05b.0

cgrep (local) - no version provided for ~/executable/cgrep.



 Results:

=====

l33t

file-id:500

hax0rz

=====

See the URL for a source download. The cgrep utility has many more features than this, and, since you have the source, you can place it on other systems you use. On the one I used here, I have it installed in a private directory ... cheers, makyo