egrep question

SSlide · 11-26-2009, 01:05 PM

Hi,

I'm new to Linux and have been figuring out some of the basics lately, but I'm stuck on egrep now.

I need to extract all words from a dictionary list that have an odd number of consonants. The thing is, I can't figure out how to do that, since I only know how to count consecutive letters or numbers.

for instance:

egrep '[^aeiou]{5}' /usr/share/dict/dutch

So my question is: how can I adjust my regex so that it doesn't require those consonants to be consecutive?

Thanks!

Tinkster · 11-26-2009, 01:38 PM

Hi,

welcome to LQ!

And I don't think you can. You're going to have to use a two-step approach.

One (admittedly not very elegant) solution, shown here for words with exactly eight vowels:

Code:

sed 'p; s/[^aeiou]//g' /usr/share/dict/words | awk '{if(NR>1){if(NR%2==0){if(length($0) == 8 ){print previous}}};previous=$0}' 
anticoagulation
antiredeposition
artificialities
authoritarianism
autobiographical
autobiographies
autocorrelation
autofluorescence
autosuggestibility
availabilities
axiomatization
axiomatizations
beautifications
canonicalization
conceptualization
conceptualizations
counterintuitive
counterrevolution
disqualification
electroencephalogram
electroencephalograph
electroencephalography
familiarization
heterogeneousness
incompatibilities
industrialization
inevitabilities
inhomogeneities
initialization
initializations
institutionalize
institutionalized
institutionalizes
institutionalizing
intercommunication
miniaturization
nationalization
parameterization
parameterizations
rationalization
rationalizations
reproducibilities
revolutionaries
telecommunication
telecommunications
unidirectionality
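
The same two-step idea (strip the characters that don't count, then test the length of what is left) can be pointed at the original question. What follows is only a rough sketch, not a tested solution for every dictionary: the function name `odd_consonants` and the sample words are made up for illustration, and `y` is assumed to count as a consonant.

```shell
# Sketch: print the words that contain an odd number of consonants.
# odd_consonants is a hypothetical helper name; it reads one word per line on stdin.
odd_consonants() {
    awk '{
        stripped = tolower($0)
        # Delete every character that is not a consonant (y counts as a consonant here)
        gsub(/[^bcdfghjklmnpqrstvwxyz]/, "", stripped)
        # Keep the original line when the number of remaining characters is odd
        if (length(stripped) % 2 == 1) print
    }'
}

# Illustrative input: cat and tree have 2 consonants, audio has 1, linux has 3
printf '%s\n' cat tree audio linux | odd_consonants
```

Against a real word list it would be called as `odd_consonants < /usr/share/dict/words`. Doing everything in awk avoids the duplicate-line trick with `sed 'p; ...'`, at the cost of hard-coding the consonant set.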

Cheers,
Tink

SSlide · 11-27-2009, 08:24 AM

Thank you sir, I just needed to know if there was a way to put it in the regex. Now I know there's not, so I'll use another approach similar to yours.

Tinkster · 11-27-2009, 10:59 AM

Welcome ... but I didn't say it's impossible. I just said I don't
think so. Chances are there's a regex guru just around the corner
who knows a fabulously cunning way of doing it :}

Kenhelm · 11-28-2009, 03:08 AM

Try

Code:

vow="[aeiou'./-]"                   # A vowel etc. character
con='[bcdfghjklmnpqrstvwxyz]'       # A consonant character

egrep -i "^$vow*$con$vow*(($con$vow*){2})*$" /usr/share/dict/words

The extra ' . / - characters grouped with the vowels are there to match entries in the dictionary list such as:

contrib.
rock-'n'-roll
roll-on/roll-off
Y.M.C.A.
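
The pattern reads as: any vowels, one consonant, any vowels, then zero or more pairs of consonants (each followed by any vowels). One plus an even number is always odd, which is why it works. Here is a small sketch for trying it on a handful of words; the sample words and the variable name `odd_regex` are just for illustration, and `grep -E` is used as the equivalent of `egrep`.

```shell
vow="[aeiou'./-]"                   # vowel-like characters, as above
con='[bcdfghjklmnpqrstvwxyz]'       # consonants

# One consonant, then zero or more PAIRS of consonants: 1 + 2k is always odd.
# The pair is wrapped in its own group before the star is applied.
odd_regex="^$vow*$con$vow*(($con$vow*){2})*\$"

# cat has 2 consonants and is filtered out; the other entries have 1, 5, 3 and 7
printf '%s\n' cat audio contrib. Y.M.C.A. "rock-'n'-roll" | grep -Ei "$odd_regex"
```

Any line containing a character that is in neither class (a digit or an accented letter, say) will not match at all, so the two bracket lists may need extending for other dictionaries.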

SSlide · 11-28-2009, 04:36 AM

Damn, that changes my perspective.

Thanks a lot, I learned a lot from that little piece of code.