Old 08-18-2007, 03:51 AM   #1
xaverius
LQ Newbie
 
Registered: Aug 2007
Posts: 8

Rep: Reputation: 0
Sed/Awk: print lines between n'th and (n+1)'th match of "foo"


I have a textfile, which may for example look like this:
Code:
blaaat
foo bar
some text here
-o-
more text
more foo's
and even more bar
-o-
.<only a dot>
-o-
...
....
...
...
...
-o-
There's a "^-o-$" between every 'record'.
The user enters a number (call it $1); my script should use sed and/or awk to print the $1-th record in the file.

This is what I've got so far, it's only input-checking actually:
Code:
ubound=`cat $file | grep -c -e -o-`
test $1 -lt 1 -o $1 -gt $ubound 2>/dev/null
if [ $? = 0 ]; then
        echo "Warning: n should be in the range [1,$ubound]."
        exit 1
fi
Can anyone help me further on the sed/awk part please?
 
Old 08-18-2007, 04:13 AM   #2
slakmagik
Senior Member
 
Registered: Feb 2003
Distribution: Slackware
Posts: 4,113

Rep: Reputation: Disabled
awk -v RS='-o-' 'NR == '$1' { print }' input_file

Last edited by slakmagik; 08-18-2007 at 04:15 AM. Reason: s/foo/input_file/; a little clearer that way
 
Old 08-18-2007, 04:44 AM   #3
slakmagik
Senior Member
 
Registered: Feb 2003
Distribution: Slackware
Posts: 4,113

Rep: Reputation: Disabled
Quote:
Originally Posted by xaverius View Post
Code:
ubound=`cat $file | grep -c -e -o-`
test $1 -lt 1 -o $1 -gt $ubound 2>/dev/null
if [ $? = 0 ]; then
        echo "Warning: n should be in the range [1,$ubound]."
        exit 1
fi
Incidentally, why not
Code:
ubound=$(grep -c '^-o-$' $file)
if [[ $1 -lt 1 || $1 -gt $ubound ]]; then
        echo "Warning: n should be in the range [1,$ubound]."
        exit 1
fi
 
Old 08-18-2007, 05:07 AM   #4
xaverius
LQ Newbie
 
Registered: Aug 2007
Posts: 8

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by digiot View Post
Incidentally, why not
Code:
ubound=$(grep -c '^-o-$' $file)
if [[ $1 -lt 1 || $1 -gt $ubound ]]; then
        echo "Warning: n should be in the range [1,$ubound]."
        exit 1
fi
Because of the stderr-redirection: users might not enter a real number, but rather something like "1q" or "foo"... doing it this way, the script will print a nice error message and just exit
edit2: I hadn't noticed the double [[/]] you used there... do they have the same effect?

edit: Why should I prefer your way of setting ubound above mine? (no offense)
I agree it looks better, but are there other advantages? I never actually learned about the $( )-construction, so...
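
A note on the stderr redirection discussed above: `test` exits with 0 when the expression is true, 1 when it is false, and a status greater than 1 when it hits an error such as a non-integer operand. Checking `$? = 0` therefore lets input like "1q" or "foo" pass without the warning, because the error status is not 0. A minimal sketch of a check that rejects such input, assuming a POSIX shell; the function name `valid_index` is hypothetical and not part of the original script:

```shell
# valid_index N UBOUND: succeed only if N is a whole number in [1, UBOUND].
# Hypothetical helper, shown for illustration only.
valid_index() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    esac
    [ "$1" -ge 1 ] && [ "$1" -le "$2" ]
}

# Usage sketch inside the script:
# valid_index "$1" "$ubound" || {
#     echo "Warning: n should be in the range [1,$ubound]." >&2
#     exit 1
# }
```

The `case` pattern does the "digits only" filtering, so the numeric comparisons never see a non-integer and no redirection of stderr is needed.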

Last edited by xaverius; 08-18-2007 at 05:12 AM.
 
Old 08-18-2007, 05:23 AM   #5
xaverius
LQ Newbie
 
Registered: Aug 2007
Posts: 8

Original Poster
Rep: Reputation: 0
Also, the solution you first advised does not seem to work correctly, here's a bash-session:
Code:
Macbook-2:~/ex_st xaverius$ cat getMessage 
#!/bin/bash

file=~/.forumpje.txt
EOF="-o-"

if [ $# != 1 ]; then
        echo "Syntax: $0 <nr>"
        exit 1
fi

ubound=`cat $file | grep -c -e -o-`
test $1 -lt 1 -o $1 -gt $ubound 2>/dev/null
if [ $? = 0 ]; then
        echo "Warning: the value of nr must lie in the interval [1,$ubound]."
        exit 1
fi

awk -v RS='-o-' 'NR == '$1' { print }' $file
Macbook-2:~/ex_st xaverius$ cat ~/.forumpje.txt 

me
niets
blaat nieuwe regel enzo done
Fri Aug 17 15:13:46 CEST 2007
-o-
ikke
veel over weinig
niets dus ;-)
Fri Aug 17 15:14:01 CEST 2007
-o-
Macbook-2:~/ex_st xaverius$ getMessage 1

me
niets
blaat nieuwe regel enzo done
Fri Aug 17 15:13:46 CEST 2007

Macbook-2:~/ex_st xaverius$ getMessage 2
o
Macbook-2:~/ex_st xaverius$
Check out the last command, doesn't seem correct imo...

Last edited by xaverius; 08-18-2007 at 05:25 AM.
 
Old 08-18-2007, 06:16 AM   #6
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

I just tried (copy->paste) the test session in your previous post (#5), and it works.

I did notice something else: Your prompt says: Macbook-2. Are you trying this on Apple's OS X?
If so, check which awk is actually being used and whether it is POSIX compliant. Maybe you can use nawk instead.

Hope this helps.
 
Old 08-18-2007, 07:01 AM   #7
xaverius
LQ Newbie
 
Registered: Aug 2007
Posts: 8

Original Poster
Rep: Reputation: 0
I have no idea which version of awk OSX is using, but I had access to another machine:
Code:
$ uname -a
SunOS <hostname> 5.8 Generic_108528-07 sun4u sparc SUNW,Ultra-4
The code runs smoothly there, and since this is the machine it's supposed to run on later, it's ok now
Thx!

Btw: still interested in a way to solve it using sed though
 
Old 08-18-2007, 03:55 PM   #8
slakmagik
Senior Member
 
Registered: Feb 2003
Distribution: Slackware
Posts: 4,113

Rep: Reputation: Disabled
Quote:
Originally Posted by xaverius View Post
I have no idea which version of awk OSX is using
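You can get gawk for Macs, but it may not be the default, and awks other than gawk may not accept a multi-character or regex RS (POSIX leaves the result unspecified when RS holds more than one character, and some implementations use only the first character).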
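
One way to sidestep the RS question is to leave RS alone and count separator lines instead. This is a sketch under the assumption that every record is terminated by a line consisting solely of `-o-`, as in the sample file; the function name `nth_record` and the awk variables `n` and `rec` are hypothetical:

```shell
# nth_record N FILE: print the N-th record of FILE, where each record is
# terminated by a line that is exactly "-o-".
# Hypothetical helper; it does not rely on a multi-character RS.
nth_record() {
    awk -v n="$1" '
        # Separator line: one more record is complete; stop once the N-th is done.
        $0 == "-o-" { if (++rec == n) exit; next }
        # Ordinary line: print it only while inside the N-th record.
        rec == n - 1
    ' "$2"
}
```

Unlike the RS-based one-liner, this prints the record's lines exactly as they appear in the file, without an extra leading or trailing blank line.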

Quote:
Originally Posted by xaverius View Post
Because of the stderr-redirection: users might not enter a real number, but rather something like "1q" or "foo"... doing it this way, the script will print a nice error message and just exit
Well, this is bash-specific, so maybe you're better off the way you had it. It's just that testing the return code with a test, when that's what test *does*, kind of bothers me.
Code:
if [[ ! $1 =~ ^[0-9]+$ || $1 -lt 1 || $1 -gt $ubound ]]; then
        echo "Warning: n should be in the range [1,$ubound]."
        exit 1
fi
This will require one or more digits (and only digits) for the argument. || and -o both mean 'or', but work differently (or fail to work) with the test/[ builtin and the [[ command, and I just find || more readable than -o in general. The $(...) syntax is about the same as the `...` syntax, except that it nests better and is regarded as the more 'modern' way (though I still type `...` in interactive shells where possible). This is just me, though. The only important things were the removal of a useless cat in the assignment to 'ubound' and the combining of two tests into one. I also tightened up the regex for grep by anchoring it (and I should have made it tighter for the (g)awk).
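
To illustrate the nesting point, here is a small sketch. The path is only an example string (neither `dirname` nor `basename` needs it to exist), and the variable names are hypothetical:

```shell
# Both forms capture a command's output; only the nesting differs.

# Backticks: the inner pair has to be escaped with backslashes.
outer_bt=`basename \`dirname /usr/local/bin/tool\``

# $(...): nests as-is, and each level can be quoted independently.
outer_dp=$(basename "$(dirname /usr/local/bin/tool)")

echo "$outer_bt $outer_dp"
```

Both assignments yield `bin`, so the last line prints `bin bin`.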

Quote:
Originally Posted by xaverius View Post
edit: Why should I prefer your way of setting ubound above mine? (no offense)
Anyway, no offense taken at all. My attitude is that, as long as it works, there's no reason to prefer one way over another. It's just that some things work in certain corner cases and fail in others, some things are microscopically more efficient than others (which can add up in certain scenarios), some things just look or feel better than others, etc. It's largely just a matter of taste, and that's why I asked 'why not' rather than said 'you should have'. Your point about input validation is a good one, since I wasn't initially thinking of even testing for non-integer input.

Quote:
Originally Posted by xaverius View Post
Btw: still interested in a way to solve it using sed though
It's probably doable, but (g)awk is much better designed for this sort of task and using sed to that extent makes my head hurt.
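
For what it's worth, one possible sed-based sketch lets grep locate the separator lines and has sed print the line range between two of them. It assumes N has already been validated as a positive integer; the function name `nth_record_sed` and its variables are hypothetical, and it reads the file more than once, so it is not an argument against the awk version:

```shell
# nth_record_sed N FILE: print the N-th "-o-"-terminated record of FILE.
# Hypothetical helper: grep finds the separator line numbers, sed prints the range.
nth_record_sed() {
    n=$1 file=$2
    # Line numbers of all separator lines, one per line.
    seps=$(grep -n '^-o-$' "$file" | cut -d: -f1)
    # Line number of the separator that ends record N.
    end=$(printf '%s\n' "$seps" | sed -n "${n}p")
    [ -n "$end" ] || return 1          # the file has fewer than N records
    # Line number of the separator that ends record N-1 (0 for the first record).
    if [ "$n" -gt 1 ]; then
        start=$(printf '%s\n' "$seps" | sed -n "$((n - 1))p")
    else
        start=0
    fi
    # An empty record has nothing between its separators.
    [ $((start + 1)) -le $((end - 1)) ] || return 0
    sed -n "$((start + 1)),$((end - 1))p" "$file"
}
```

The explicit empty-record check matters because sed, given a range whose end lies before its start, still prints the start line.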
 
Old 08-19-2007, 05:12 AM   #9
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244Reputation: 244Reputation: 244
Quote:
Originally Posted by xaverius View Post
Also, the solution you first advised does not seem to work correctly,
that's because the awk has a syntax error. The correct one should look something like this

Code:
# var=1
# awk -v input=$var 'BEGIN{RS="-o-"}NR==input{ print  }' file
blaaat
foo bar
some text here
 
Old 08-19-2007, 01:56 PM   #10
slakmagik
Senior Member
 
Registered: Feb 2003
Distribution: Slackware
Posts: 4,113

Rep: Reputation: Disabled
Quote:
Originally Posted by ghostdog74 View Post
that's because the awk has a syntax error. The correct one should look something like this

Code:
# var=1
# awk -v input=$var 'BEGIN{RS="-o-"}NR==input{ print  }' file
blaaat
foo bar
some text here
I've been completely brain-damaged lately but, still, if you're referring to this:
Code:
awk -v RS='-o-' 'NR == '$1' { print }' input_file
there is no syntax error and it works under both gawk and mawk (and surprisingly, to me, under original awk) and apparently under SunOS as well as Linux - dunno why it breaks on OSX, but it's OSX-specific (or 'whatever-OSX's-awk-is'-specific).
Code:
:cat foo.sh
ubound=$(grep -c '^-o-$' $2)
if [[ ! $1 =~ ^[0-9]+$ || $1 -lt 1 || $1 -gt $ubound ]]; then
        echo "Warning: n should be in the range [1,$ubound]."
        exit 1
fi
awk -v RS='-o-' 'NR == '$1' { print }' $2

:sh foo.sh 5q foo
Warning: n should be in the range [1,4].

:sh foo.sh 1 foo
blaaat
foo bar
some text here
 
Old 08-19-2007, 06:09 PM   #11
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244Reputation: 244Reputation: 244
Quote:
Originally Posted by digiot View Post
I've been completely brain-damaged lately but, still, if you're referring to this:
Code:
awk -v RS='-o-' 'NR == '$1' { print }' input_file
there is no syntax error and it works under both gawk and mawk ..
Code:
# more file
blaaat
foo bar
some text here
-o-
more text
more foo's
and even more bar
-o-
.<only a dot>
-o-
# awk -v RS='-o-' 'NR == '$1' { print }' file
awk: NR ==  { print }
awk:        ^ syntax error
# awk --version
GNU Awk 3.1.5
Copyright (C) 1989, 1991-2005 Free Software Foundation.
 
Old 08-19-2007, 06:19 PM   #12
slakmagik
Senior Member
 
Registered: Feb 2003
Distribution: Slackware
Posts: 4,113

Rep: Reputation: Disabled
So give $1 a value by passing in an argument.
 
Old 08-19-2007, 08:18 PM   #13
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244Reputation: 244Reputation: 244
Quote:
Originally Posted by digiot View Post
So give $1 a value by passing in an argument.
my bad. thought it was awk's $1
 
Old 08-19-2007, 09:15 PM   #14
slakmagik
Senior Member
 
Registered: Feb 2003
Distribution: Slackware
Posts: 4,113

Rep: Reputation: Disabled
Ah, yeah, that's easy to do. One of the most bothersome things about mixing shell and awk.
 
Old 08-20-2007, 04:41 AM   #15
xaverius
LQ Newbie
 
Registered: Aug 2007
Posts: 8

Original Poster
Rep: Reputation: 0
This is what the final script looks like:
Code:
#!/bin/bash

file=~/.forumpje.txt
EOF="-o-"

if [ $# != 1 ]; then
        echo "Syntax: $0 <nr>"
        exit 1
fi

ubound=`cat $file | grep -c -e -o-`
test $1 -lt 1 -o $1 -gt $ubound 2>/dev/null
if [ $? = 0 ]; then
        echo "Warning: the value of nr must lie in the interval [1,$ubound]."
        exit 1
fi

awk -v RS="$EOF" 'NR == '$1' { print }' $file
I think I don't really understand the '$1'-part, but it works, so that's okay

It's tested on GNU awk 3.0.4

Can someone explain the awk rule please?
If the current record number equals $1, then print that record, but what is the value of $1? Is it supposed to be the bash $1, or the awk $1? It's not really clear to me... The -v flag is used to set the record separator (RS), I guess...
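
On the `'$1'` part: in `'NR == '$1' { print }'` the first single quote closes just before `$1` and a new one opens right after it, so `$1` sits outside the quotes and the shell replaces it with the script's first argument before awk ever runs. awk never sees a `$1` of its own here; it sees a plain number. `NR` is awk's count of records read so far, and `-v RS=...` sets the record separator. A small sketch that makes the expansion visible; the `set -- 2` line only simulates calling the script with the argument 2, and the awk variable name `n` is hypothetical:

```shell
set -- 2    # simulate a script invoked with the argument 2

# The program text awk receives once the shell has expanded $1:
prog='NR == '$1' { print }'
echo "$prog"

# The same selection with the number handed over as an awk variable,
# which avoids the quote juggling. A single-character RS is used here
# so the example behaves the same in any awk.
printf 'a;b;c' | awk -v RS=';' -v n="$1" 'NR == n { print }'
```

The first command prints `NR == 2 { print }` and the second prints `b`, the second `;`-separated record.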
 
  


All times are GMT -5.
