LinuxQuestions.org


Telengard 07-23-2010 07:42 PM

awk regex /f[eo]{2}t/ does not work as advertised
 
Please refer to this article.

Quote:

Specific Repetition

If a regular expression is to be matched a particular number of times, curly brackets ({}) can be used. For example:

/f[eo]{2}t/

This matches "foot" or "feet".
Code:

$ awk '/f[eo]{2}t/'
feet
foot
$ ls -l $( which awk )
lrwxrwxrwx 1 root root 21 2009-04-02 11:12 /usr/bin/awk -> /etc/alternatives/awk
$ ls -l /etc/alternatives/awk
lrwxrwxrwx 1 root root 13 2009-04-02 11:12 /etc/alternatives/awk -> /usr/bin/mawk
$ lsb_release -dc
Description:    Ubuntu 8.04.4 LTS
Codename:      hardy

man awk gives the manpage for MAWK Version 1.2, and I cannot find any reference to the {n} construction therein.

What am I to make of this?

ghostdog74 07-23-2010 07:46 PM

if you use gawk, use the --re-interval or --posix option to enable interval expressions. (ie {}).

Telengard 07-23-2010 08:00 PM

Quote:

Originally Posted by ghostdog74 (Post 4043518)
if you use gawk, use the --re-interval or --posix option to enable interval expressions. (ie {}).

Code:

$ whereis gawk
gawk:

As explained in the original post, system came with mawk. I do not have gawk.

Should I just assume that the Awk language is broken on Ubuntu?

wje_lq 07-23-2010 08:05 PM

There are at least two things wrong with that article.

First, it does not recognize that there are several varieties of awk out there. You're running Kubuntu, right? My wife runs Xubuntu, and her awk is mawk, just like yours, and it behaves just like yours. There seems to be no provision therein for interval expressions, and the [KX]ubuntu man page reflects this.

On Slackware, when I ask for awk, I get GNU awk. Run in the normal manner, it doesn't have interval expressions either. But if you run it with the --posix option or the --re-interval option, interval expressions are allowed. Here's my experience with GNU awk:
Code:

evans:~$ awk '/f[eo]{2}t/'
feet
foot
evans:~$ awk --posix '/f[eo]{2}t/'
feet
feet
foot
foot
evans:~$

The second thing wrong with the article is that it seems to have a superficial understanding of regular expressions.
Code:

/f[eo]{2}t/
also matches these two lines:
Code:

feot
foet
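
A minimal sketch of a workaround that avoids interval expressions altogether, assuming only a POSIX shell and any awk (mawk included). Writing the repetition out as [eo][eo] means the same thing as [eo]{2}, and f(ee|oo)t is one way to accept only the two real words. The sample function is a hypothetical helper that just supplies the four strings discussed here.

```shell
# Hypothetical helper: prints the four test strings, one per line.
sample() { printf '%s\n' feet foot feot foet; }

# Same meaning as /f[eo]{2}t/, but with the repetition spelled out,
# so it does not depend on gawk's --posix or --re-interval options.
sample | awk '/f[eo][eo]t/'

# Stricter pattern: alternation accepts only "feet" and "foot".
sample | awk '/f(ee|oo)t/'
```

The first command prints all four lines; the second prints only feet and foot.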


wje_lq 07-23-2010 08:12 PM

getting gawk
 
You can install gawk on your Kubuntu system. Get the package from http://packages.debian.org/stable/gawk

Telengard 07-23-2010 08:36 PM

Thank you for explaining it so well, wje_lq. That leads me to a conclusion and another question or two.

Quote:

Originally Posted by wje_lq (Post 4043530)
First, it does not recognize that there are several varieties of awk out there. You're running Kubuntu, right? My wife runs Xubuntu, and her awk is mawk, just like yours, and it behaves just like yours.

This really irks me. It means Ubuntu users cannot share Awk scripts with users of other systems. Ubuntu is one of the most popular Linux distros (maybe the most popular AFAIK), and Awk is broken on it.

What should I do when writing my own scripts to share?

Quote:

Code:

/f[eo]{2}t/
also matches these two lines:
Code:

feot
foet


That is exactly what I thought when I was reading it. "[eo]" is a complete regular expression which matches either "e" or "o", and that regular expression is repeated twice. I wonder if the authors are really so naive or if they are just trying to keep it simple for beginners.

It seems I won't be able to proceed with the tutorial any further unless I install gawk. Maybe I can figure out how to make it work with the alternatives system so that it doesn't break anything which may depend upon idiosyncrasies of mawk.

:(
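
For scripts meant to be shared, one option is to probe the awk at hand before relying on {n}. This is only a sketch: supports_intervals is a hypothetical helper name, and it assumes that an awk without interval support treats the braces as literal characters, as the mawk and default gawk runs in this thread do.

```shell
# Hypothetical helper: succeeds if the given awk command understands {n}.
# The awk program exits 0 when "feet" matches /f[eo]{2}t/, and 1 otherwise.
supports_intervals() {
    printf 'feet\n' |
        "$@" 'BEGIN { rc = 1 } /f[eo]{2}t/ { rc = 0 } END { exit rc }' 2>/dev/null
}

if supports_intervals awk; then
    echo "awk: interval expressions work"
else
    echo "awk: no interval support, spell the repetition out as [eo][eo]"
fi
```

Extra options can be passed along with the command name, for example supports_intervals gawk --posix.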

ghostdog74 07-23-2010 08:44 PM

I am really sceptical about gawk not being present. Why don't you try a find on your system to see if it's really installed? The which command is not really foolproof, in that it searches $PATH only.

pr_deltoid 07-23-2010 09:45 PM

I was just using Kubuntu, and I know that when I ran "update-alternatives --all" awk was gawk by default. But anyways, all you have to do is install it if you don't have it. It has a higher priority and update-alternatives will set it as the default alternative automatically. If you ever have to change the alternatives, all you have to do is use "update-alternatives".
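
A rough sketch of how to check which implementation the awk command currently points to. It assumes a readlink that supports -f (GNU coreutils and BusyBox do) and guards the update-alternatives call, since that tool exists only on Debian-family systems.

```shell
# Resolve the whole symlink chain behind "awk" in one step
# (on the original poster's system this would end at /usr/bin/mawk).
awk_path=$(command -v awk)
real_awk=$(readlink -f "$awk_path")
echo "awk resolves to: $real_awk"

# On Debian-family systems, list the registered alternatives and priorities.
if command -v update-alternatives >/dev/null 2>&1; then
    update-alternatives --display awk || true
fi
```

To pick a different alternative by hand, sudo update-alternatives --config awk presents an interactive menu.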

Telengard 07-23-2010 10:59 PM

Quote:

Originally Posted by pr_deltoid
I was just using Kubuntu, and I know that when I ran "update-alternatives --all" awk was gawk by default. But anyways, all you have to do is install it if you don't have it. It has a higher priority and update-alternatives will set it as the default alternative automatically. If you ever have to change the alternatives, all you have to do is use "update-alternatives".

Hmm. What version of Kubuntu do you have there? I have not knowingly removed gawk, nor would I.

Code:

$ aptitude search gawk
p  gawk                            - GNU awk, a pattern scanning and processing
p  gawk-doc                        - Documentation for GNU awk

Good to know! Thank you very much for the helpful info, pr_deltoid. :)

Quote:

Originally Posted by ghostdog74 (Post 4043550)
Why don't you try a find on your system to see if it's really installed?

Code:

$ cd /
$ sudo updatedb
$ locate -ib gawk
/usr/share/doc/gettext-doc/examples/hello-gawk
/usr/share/locale-langpack/en_GB/LC_MESSAGES/gawk.mo
$ find bin/ lib/ opt/ sbin/ usr/ -type f -iname '*gawk*'
usr/share/locale-langpack/en_GB/LC_MESSAGES/gawk.mo
$ aptitude show gawk | grep -i state
State: not installed

Satisfied?

ghostdog74 07-23-2010 11:10 PM

Quote:

Originally Posted by Telengard (Post 4043603)
Satisfied?

The question is, are you satisfied? If you are not, then download gawk. For normal operations, gawk is just as fast.

Telengard 07-23-2010 11:30 PM

Quote:

Originally Posted by ghostdog74 (Post 4043608)
The question is, are you satisfied?

I was merely responding to your statement that you were skeptical about gawk not being present on my system. I was already convinced that it was not, but you were probably correct to suggest I not rely completely on whereis to determine such.

As to whether I am satisfied, I have mixed feelings about this. I don't understand why my Kubuntu system shipped with a crippled Awk interpreter whereas other people's Kubuntu systems apparently did not. Maybe I should just chalk it up to Ubuntu's fickleness and add it to my list of reasons to move to Slackware.

On the other hand I can't claim that my original post has not been satisfied, so I'll probably mark this one solved for now.

Telengard 07-24-2010 12:13 AM

Code:

$ gawk --posix '/f[eo]{2}t/'
feet
feet
foot
foot
foet
foet
feot
feot

Okay, this is definitely solved now. Thank you everyone for helping. :)

pr_deltoid 07-24-2010 12:27 AM

Quote:

Hmm. What version of Kubuntu do you have there?

I was using Kubuntu 10.04 ...

EDIT:
gawk might have been installed as a dependency when I was installing something else. I installed vim, build-essential, firefox, etc. Since it had a higher priority for update-alternatives, it could've just been made the default after it was installed as a dependency.

grail 07-24-2010 01:36 AM

I am running Ubuntu 10.04 and confirm that by default gawk is not available and mawk was the only option.

Upon getting more involved with [g]awk scripting I have since removed mawk as it had other limitations too, such as no gensub.
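
For simple substitutions, the missing gensub can be worked around without gawk. The sketch below assumes the goal is only to get a modified copy of the record; gensub's backreferences such as \\1 have no direct equivalent in gsub, which offers only & for the whole match.

```shell
# gawk only: gensub returns the modified string and leaves $0 untouched.
#   gawk '{ print gensub(/o/, "0", "g") }'
# Portable equivalent: copy the record, then let gsub edit the copy in place.
printf 'foot\n' | awk '{ s = $0; gsub(/o/, "0", s); print s }'
```

This prints f00t under any awk, mawk included.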

pr_deltoid 07-24-2010 01:43 AM

I know that when I was going through the LFS book, it said to make sure that gawk was used, because mawk could not do the things that had to be done. I was using Debian, and it was using mawk by default.


All times are GMT -5.