[SOLVED] Seven-letter words with only one vowel

danielbmartin · 09-14-2022, 09:39 AM

This is my solution.

Code:

echo; echo 'Find seven-letter words which have only one vowel.'
     Path=${0%%.*}
  OutFile=$Path"out.txt"
 WordList='/usr/share/dict/words'
   Sevens=$Path"sevens.txt"
 egrep "^.{7}$" $WordList       \
|grep -v '[^a-z]' >$Sevens
 sed 's/[^aeéiouy]//' $Sevens   \
|sed 's/[^aeéiouy]//'           \
|sed 's/[^aeéiouy]//'           \
|sed 's/[^aeéiouy]//'           \
|sed 's/[^aeéiouy]//'           \
|sed 's/[^aeéiouy]//'           \
|paste -d" " $Sevens -          \
|egrep "^.{9}$"                 \
>$OutFile

It works but seems like brute force. Is there a better, faster, cleaner method? Please advise.

Daniel B. Martin

danielbmartin · 09-14-2022, 09:55 AM

I searched my personal library and found this one.

Code:

 grep -E '^.{7}$' "$WordList"   \
|grep -v '[^a-z]' >"$Sevens"

# tr keeps only vowels and newlines; the set needs no surrounding brackets
tr -dc 'aeiouéèy\n' <"$Sevens"  \
|paste "$Sevens" -              \
|sed -nr '/^.{9}$/p'            \
>"$OutFile"

Daniel B. Martin

Turbocapitalist · 09-14-2022, 11:18 AM

There don't seem to be any modules or built-in character classes that cover a comprehensive set of vowels. However, Unicode::Normalize (available from CPAN) can decompose accented letters into a base letter plus combining marks, and then you can check with a pattern similar to one of the ones which you have used above.

Code:

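#!/usr/bin/perl

use Unicode::Normalize;
use strict;
use warnings;

while (my $s = <>) {
    # Keep only lines made of exactly seven word characters.
    if ($s =~ m/^\w{7}$/) {
        # Decompose accented letters into base letter plus combining marks.
        my $d = NFKD($s);
        # Require one vowel and reject any vowel that is followed by another.
        if ($d =~ m/[aeiou]/i && $d !~ m/[aeiou](?=.*[aeiou])/i) {
            print $s;
        }
    }
}

exit(0);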

That reads in a line and checks whether it is a seven-letter string. Then it decomposes any accented letters into base letters plus combining marks and checks for the presence of exactly one vowel. Letters such as y in English and w in Welsh are not counted as vowels.

I think that's about as short as it can be with the Unicode constraint. If you are sticking with ASCII then sed or grep would be enough.

Edit: The old version allowed up to one vowel but did not require one. The modification requires exactly one vowel.
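
For the ASCII-only case mentioned above, a single sed call might look like the sketch below. The function name one_vowel_sed is made up for illustration, the word list is assumed to arrive on stdin, and y is not counted as a vowel, as in the Perl version.

```shell
#!/bin/sh
# Hypothetical helper: reads words on stdin, prints seven-letter
# lowercase ASCII words that contain exactly one of a, e, i, o, u.
one_vowel_sed() {
    # First address: the line is exactly seven lowercase letters.
    # Nested address: non-vowels, one vowel, non-vowels, nothing else.
    LC_ALL=C sed -n '/^[a-z]\{7\}$/{/^[^aeiou]*[aeiou][^aeiou]*$/p;}'
}

# Small built-in sample so the sketch runs without a dictionary file.
printf '%s\n' schnaps strange rhythms lengths | one_vowel_sed
# prints schnaps and lengths
```

With a real dictionary it would be called as one_vowel_sed <"$WordList".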

grail · 09-14-2022, 10:56 PM

Code:

awk -F'[aeiou]' '/^\w{7}$/ && NF == 2' "$WordList"

pan64 · 09-15-2022, 12:18 AM

You can also try to
1. filter for seven-letter words,
2. remove all the non-vowels (like your sed),
3. check the length again, which should be 1,
but posts #3 and #4 are definitely much better. I would avoid using pipe chains if possible.
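
The three steps above can be sketched in bash without any pipe chain. This is only an illustration: it assumes bash (for [[ ]] and ${var//pattern/}), the function name one_vowel_words is made up, and the vowel set with y is borrowed from post #1 without the accented letters.

```shell
#!/bin/bash
# Hypothetical helper: reads words on stdin, one per line.
one_vowel_words() {
    local word vowels
    while IFS= read -r word; do
        # 1. keep seven-letter lowercase words only
        [[ $word =~ ^[a-z]{7}$ ]] || continue
        # 2. remove everything that is not a vowel (y counted as a vowel)
        vowels=${word//[^aeiouy]/}
        # 3. check the length of what is left, which should be 1
        (( ${#vowels} == 1 )) && printf '%s\n' "$word"
    done
    return 0
}

# Small built-in sample so the sketch runs without a dictionary file.
printf '%s\n' rhythms schnaps strange cat | one_vowel_words
# prints rhythms and schnaps
```

A shell loop like this is likely to be slower than awk or grep on a full dictionary, so it is more a readable restatement of the steps than a speed-up.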

astrogeek · 09-15-2022, 01:28 AM

"Holy Grail, Batman!"

I think I'll keep my inelegant solution to myself!

MadeInGermany · 09-15-2022, 07:12 AM

Code:

awk '
  { u0=toupper($0) }
  # deleting every non-consonant must remove exactly one character: the vowel
  length==7 && gsub(/[^BCDFGHJKLMNPQRSTVWXZ]/, "", u0)==1
' "$WordList"

If you want to keep+print the vowel:

Code:

awk '
  { u0=toupper($0) }
  # deleting the six consonants leaves only the vowel in u0
  length==7 && gsub(/[BCDFGHJKLMNPQRSTVWXZ]/, "", u0)==6 {
    print u0,$0
  }
' "$WordList"

sundialsvcs · 09-15-2022, 11:59 AM

Hint:

Filter for words which do match [aeéiouy] in a suitable words-file, then pipe the results to keep only those which do not match [aeéiouy].*[aeéiouy]. Finally, filter out all which are not seven-letter words. (Rearrange the order of piped operations as you like.)

A "shell one-liner," given a suitable "words file," can solve this. No programming is required.

Q.E.D.
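
One possible reading of that hint as a pipeline is sketched below. The function name one_vowel_pipeline is made up, the words are assumed to arrive on stdin, and é is left out of the bracket expressions here because matching it depends on the locale.

```shell
#!/bin/sh
# Hypothetical helper: three grep filters, one per step of the hint.
one_vowel_pipeline() {
    # keep words that contain at least one vowel
    grep '[aeiouy]' |
    # drop words that contain a second vowel anywhere after the first
    grep -v '[aeiouy].*[aeiouy]' |
    # keep only words of exactly seven lowercase ASCII letters
    LC_ALL=C grep -x '[a-z]\{7\}'
}

# Small built-in sample so the sketch runs without a dictionary file.
printf '%s\n' strings brought nth schnaps | one_vowel_pipeline
# prints strings and schnaps
```

Putting the seven-letter filter first would shrink the input for the other two greps, which is the kind of reordering the hint allows.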