LinuxQuestions.org
Old 08-21-2008, 02:52 PM   #1
schneidz
Senior Member
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 4,213

Rep: Reputation: 643
bash - awk, sed, grep, ... advice


[aix]
hi, i have a file like:
Code:
=====
hello
world
file-id:200
aller
spica
=====
l33t
file-id:500
hax0rz
=====
chun-li
akuma
gouki
file-id:42
ken
ryu
sakura
morrigan
dark-sakura
i'd like to print the lines in between the '====='s.
for example: for file-id:500, i want the output
Code:
=====
l33t
file-id:500
hax0rz
=====
there is a variable number of lines between the '====='s.

is there a simple way of doing it?
 
Old 08-21-2008, 03:35 PM   #2
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 63
How big is the input file? Will it easily fit in memory?
 
Old 08-21-2008, 04:04 PM   #3
schneidz
Senior Member
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 4,213

Original Poster
Rep: Reputation: 643
not terribly big:
Code:
 wc -l file.in
    5321
 wc -c file.in
  192741
 
Old 08-21-2008, 04:41 PM   #4
ilikejam
Senior Member
 
Registered: Aug 2003
Location: Glasgow
Distribution: Fedora / Solaris
Posts: 3,109

Rep: Reputation: 96
Code:
#!/bin/bash
awk 'BEGIN {RS="\n=====\n"} /'$1'/ {print}'
Call the script with file-id:xyz as the argument.

Dave

Last edited by ilikejam; 08-21-2008 at 04:42 PM.
 
Old 08-21-2008, 06:12 PM   #5
schneidz
Senior Member
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 4,213

Original Poster
Rep: Reputation: 643
thanks, this doesn't quite work for me:
Code:
schneidz@lq> cat ilikejam.lst | ilikejam.ksh 500
file-id:500
any other suggestions? i'll man up on awk so i can figure out how to print everything between record separators, not just the line that matches.

regards,
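
For reference: POSIX leaves the behaviour of a multi-character RS unspecified, and many traditional awks use only its first character, which is consistent with the single line of output above. Below is a minimal sketch that avoids RS altogether by collecting lines into a buffer and printing the buffer when a block turns out to contain the search string. The function name print_block and its argument order are hypothetical, and the separator is assumed to be a line that is exactly =====.

```shell
#!/bin/sh
# print_block PATTERN FILE  (hypothetical helper, not from the thread)
# Prints every =====-delimited block that contains PATTERN as a fixed string.
print_block() {
  awk -v pat="$1" '
    $0 == "=====" {                  # a separator line closes the current block
      if (found) { printf "%s", buf; print }
      buf = ""; found = 0
    }
    { buf = buf $0 "\n" }            # collect every line, separator included
    index($0, pat) { found = 1 }     # plain substring test, not a regex
    END { if (found) printf "%s", buf }   # last block has no closing separator
  ' "$2"
}
```

Called as `print_block file-id:500 ilikejam.lst`, this should print the matching block with its leading and trailing ===== lines. A match in the final block is printed without a trailing separator, since the file does not contain one.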
 
Old 08-22-2008, 12:15 AM   #6
allez
Member
 
Registered: Jul 2008
Location: Russia/Siberia/Krasnoyarsk
Distribution: SuSE, CentOS, FreeBSD
Posts: 77

Rep: Reputation: 21
Here is a quick and dirty solution (probably someone will find a better one), but it seems to work:
Code:
allez@home:~/tmp> cat file.in
=====
hello
world
file-id:200
aller
spica
=====
l33t
file-id:500
hax0rz
=====
chun-li
akuma
gouki
file-id:42
ken
ryu
sakura
morrigan
dark-sakura

allez@home:~/tmp> cat search.sh
#!/bin/sh

if [ "$#" != "2" ];
then
  echo "Usage: $0 <file> <string_to_search>"
  exit 1
fi

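search_file="$1"
search_string="$2"
# Join lines with "%", split records at "=%" (the \n replacement needs GNU sed), then restore the lines.
tr "\n" "%" < "$search_file" | sed 's/=%/\n/g' | grep -- "$search_string" | tr "%" "\n" | grep -v "^=*$"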

allez@home:~/tmp> ./search.sh file.in "file-id:500"
l33t
file-id:500
hax0rz

allez@home:~/tmp> ./search.sh file.in "ryu"
chun-li
akuma
gouki
file-id:42
ken
ryu
sakura
morrigan
dark-sakura
 
Old 08-22-2008, 01:03 AM   #7
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 241
Code:
awk 'BEGIN {RS="===="} /file-id:500/' file
 
Old 08-22-2008, 10:02 AM   #8
schneidz
Senior Member
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 4,213

Original Poster
Rep: Reputation: 643
Quote:
Originally Posted by ghostdog74 View Post
Code:
awk 'BEGIN {RS="===="} /file-id:500/' file
i tested ghostdog's and it works quite nicely, thanks.

the only peculiarity i have is:
Code:
schneidz@lq> h=42
schneidz@lq> echo $h
42
schneidz@lq> awk -v h=$h 'BEGIN {RS="===="} /h/' ilikejam.lst

hello
world
file-id:200
aller
spica


l33t
file-id:500
hax0rz


chun-li
akuma
gouki
file-id:42
ken
ryu
sakura
morrigan
dark-sakura
still fine-tuning. i'll re-post with my progress.
regards,
 
Old 08-22-2008, 11:04 AM   #9
schneidz
Senior Member
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 4,213

Original Poster
Rep: Reputation: 643
although slightly more complex, a tweaked version of allez's example allows for variable substitution:
Code:
schneidz@lq> cat allez.ksh
#!/bin/sh

if [ "$#" != "2" ];
then
  echo "Usage: $0 <file> <string_to_search>"
  exit 1
fi

search_file="$1"
search_string="$2"
tr "\n" "%" < "$search_file" | tr "=" "\n" | grep -- "$search_string" | tr "%" "\n" | grep -v "^=*$"


schneidz@lq> allez.ksh ilikejam.lst $h
chun-li
akuma
gouki
file-id:42
ken
ryu
sakura
morrigan
dark-sakura
thanks for everyone's help.
__________
edit:
and then maybe there's this:
Code:
grep -p=====  $h ilikejam.lst
chun-li
akuma
gouki
file-id:42
ken
ryu
sakura
morrigan
dark-sakura

Last edited by schneidz; 08-22-2008 at 12:17 PM. Reason: my stupidity
 
Old 08-22-2008, 12:33 PM   #10
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 241
Quote:
Originally Posted by schneidz View Post
42
schneidz@lq> awk -v h=$h 'BEGIN {RS="===="} /h/' ilikejam.lst
/h/ means match a "h". It does not mean match the value of the variable h. In order to do that, change /h/ to
Code:
$0 ~ h
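
Put together, the full command might look like the sketch below. The wrapper name find_record is hypothetical. RS is set to a single "=" here, on the assumption that a one-character separator behaves the same across awk implementations, and the value of h is treated as a regular expression.

```shell
#!/bin/sh
# find_record REGEX FILE  (hypothetical wrapper, not from the thread)
find_record() {
  # -v copies the shell value into the awk variable h.
  # $0 ~ h matches each record against the contents of h,
  # whereas /h/ would only look for a literal letter "h".
  awk -v h="$1" 'BEGIN { RS = "=" } $0 ~ h' "$2"
}
```

For example, `h=42; find_record "$h" ilikejam.lst` should print only the block containing file-id:42, surrounded by blank lines left over from the separator.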
 
Old 08-22-2008, 01:04 PM   #11
sundialsvcs
Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 5,455

Rep: Reputation: 1172
When faced with a problem like this, I immediately consider that awk was designed specifically to handle moderately-complex file processing jobs like this one; and that the entire perl programming language was, in one sense, "built on top of 'awk.'" Therefore, I have two heavyweight programming tools at my disposal.

It's important to consider solutions that are descriptive of the problem being solved: not merely "something that 'works now.'" It needs to work well in the general case.

The Perl community has a saying: "TMTOWTDI" = "There's More Than One Way To Do It." And that's very true.

Sometimes a good solution to a problem can be discovered by re-stating it, thusly:
Quote:
"Print all of the lines in the source-file that are not equal to '====='."
If that is a valid definition of your problem, then grep could be used to "print all lines in this file which do not match the following regular-expression ('string pattern')."
Quote:
grep -v '^=====$' filename
(The regular-expression uses the "^" and "$" anchors, which mark "start of line" and "end of line" respectively, to match lines that consist of five equals-signs. The "-v" command-line option inverts the test to print all non-matching lines.)
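
A minimal sketch of that restatement, with the pattern single-quoted so the shell passes ^ and $ to grep unchanged. The function name strip_separators is hypothetical.

```shell
#!/bin/sh
# strip_separators FILE  (hypothetical helper, not from the thread)
# Prints every line of FILE except those consisting of exactly five "=".
strip_separators() {
  grep -v '^=====$' "$1"
}
```

Note that this prints the contents of every block, not only the block that contains a given file-id.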

"TMTOWTDI!" "TMTOWTDI!" More than one 'solution' will 'work!' The one that you're looking for is the one that is easiest to implement and that solves the problem most-completely in the general case.
 
Old 08-25-2008, 02:47 AM   #12
chakka.lokesh
Member
 
Registered: Mar 2008
Distribution: Ubuntu
Posts: 211

Rep: Reputation: 32
I saw this thread on the day it was posted and thought a solution should be possible with sed alone. Of course, I am still learning awk and need to improve my skills at writing awk programs; using awk may well be better.

But I have a solution using sed.

Code:
sed -n '/500/ {;H;b one;};/=====/ {;h;b;};H;b;:one;n;/=====/ {;g;p;b;};H;b one;' <inputfile>

I could not make time for this question earlier; today I did. I hope it will be useful to the OP and, of course, to others.

The first line of the output will be =====.
To optimize the code, I left it in the output.
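
The one-liner is easier to follow when spread over several lines. The sketch below is the same command sequence with comments added. The wrapper name print_block_sed is hypothetical and the /500/ pattern stays hard-coded. As written it assumes that the matching block is followed by another ===== line and that only one block matches.

```shell
#!/bin/sh
# print_block_sed FILE  (hypothetical wrapper around the sed script above)
print_block_sed() {
  sed -n '
    # a line containing 500: append it to the hold space, go to the collect loop
    /500/ {
      H
      b one
    }
    # a separator starts a new block: overwrite the hold space with it
    /=====/ {
      h
      b
    }
    # any other line: append to the hold space and start the next cycle
    H
    b
    :one
    # collect loop: read the next line
    n
    # closing separator: fetch the collected block and print it
    /=====/ {
      g
      p
      b
    }
    H
    b one
  ' "$1"
}
```

Run against the sample file, this should print the ===== line followed by l33t, file-id:500 and hax0rz.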
 
Old 08-25-2008, 09:00 AM   #13
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 63
Funky Perl:
Code:
perl -e '$/="====="; map { s/^$//; printf "=====%s\n", $_ if (/file-id:500/); } <>' in_file
 
Old 08-25-2008, 10:30 AM   #14
makyo
Member
 
Registered: Aug 2006
Location: Saint Paul, MN, USA
Distribution: {Free,Open}BSD, CentOS, Debian, Fedora, Solaris, SuSE
Posts: 719

Rep: Reputation: 72
Hi.

Windowing requirements seem to occur frequently. I often use cgrep for such tasks:
Code:
#!/bin/bash -

# @(#) s1       Demonstrate windowing feature of cgrep.
# http://www.bell-labs.com/project/wwexptools/cgrep/

echo
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version "=o" $(_eat $0 $1) cgrep
set -o nounset
echo

FILE=data1
pattern=${1-"file-id:500"}

echo " Results:"
cgrep -D -+w '^===' "$pattern" $FILE

exit 0
Producing:
Code:
$ ./s1

(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash 2.05b.0
cgrep (local) - no version provided for ~/executable/cgrep.

 Results:
=====
l33t
file-id:500
hax0rz
=====
See the URL for a source download. The cgrep utility has many more features than this, and, since you have the source, you can place it on other systems you use. On the one I used here, I have it installed in a private directory ... cheers, makyo
 
  

