This is a learning exercise done just for fun.
It is inspired by a NYTimes word puzzle called Spelling Bee,
written by Patrick Berry.
Have: a file of English words called WordList.
Have: a string of 7 characters called Hive.
Want:
Step 1...
Find words of length >4 letters which use ONLY the letters in
the string "hive" and MUST use the first letter in "hive".
Step 2...
Find words which meet the criteria in Step 1,
and use ALL of the letters in "hive".
This is my "brute force" solution.
Code:
#!/bin/bash
# Daniel B. Martin Apr20
# Step 1...
# Find words of length >4 letters which use ONLY the letters in
# the string "hive" and MUST use the first letter in "hive".
# Step 2...
# Find words which meet the criteria in Step 1,
# and use ALL of the letters in "hive".
# File identification
Path=${0%.*}   # script path without its extension
Only="${Path}only.txt"
All="${Path}all.txt"
WordList='/usr/share/dict/words'
hive='luenopt'
echo 'Words which use only the letters in "'$hive'"'
echo ' and contain the letter "'${hive:0:1}'".'
# Keep words of 5+ characters, mark every non-hive character with "~",
# drop the marked words, then require the first hive letter.
sed -n '/^.\{5\}/p' "$WordList" \
|tr -c "$hive\n" "~" \
|grep -v "~" \
|grep "${hive:0:1}" \
>"$Only"
cat "$Only"
echo; echo 'Words which use all of the letters in "'$hive'".'
# Require each of the seven hive letters in turn.
grep "${hive:0:1}" <"$Only" \
|grep "${hive:1:1}" \
|grep "${hive:2:1}" \
|grep "${hive:3:1}" \
|grep "${hive:4:1}" \
|grep "${hive:5:1}" \
|grep "${hive:6:1}" \
>"$All"
cat "$All"
echo; echo "Normal end of job."; echo; exit
It produces this result:
Code:
Words which use only the letters in "luenopt"
and contain the letter "l".
elope
letup
lotto
nettle
opulent
outlet
pellet
people
pollen
pollute
pullet
pullout
topple
tulle
tunnel
Words which use all of the letters in "luenopt".
opulent
Normal end of job.
I suspect there is a cleaner better faster way.
Ideas? Suggestions?
Note that the problem statement calls for words of length >4 letters which contain the first letter in "hive" but your solution produced words which begin with that letter.
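The difference comes down to anchoring: a pattern anchored with ^ selects words that begin with the letter, while an unanchored pattern selects words that contain it anywhere. A minimal illustration, using a handful of sample words picked only for this example:

```shell
#!/bin/bash
# Sample words, chosen only for illustration.
sample=$(printf '%s\n' letup elope people tunnel)

# Anchored: only words that BEGIN with "l".
begins=$(grep '^l' <<<"$sample")

# Unanchored: every word that CONTAINS "l" anywhere.
contains=$(grep 'l' <<<"$sample")

echo "begin with l: $(echo $begins)"
echo "contain l:    $(echo $contains)"
```

Only letup survives the anchored pattern; all four words survive the unanchored one.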
A one-liner would be an impressive solution. Perhaps you can rework yours.
Same as before. The output file contains lots of words containing letters which are not in the hive. This is a small part of the result to illustrate the problem...
Well, what's wrong with the Python script suggested by pan64 above? Sure, you could do it as a one-liner, but it would look just as ugly as five greps chained one after another:
Did you check the solution written in Python? There is a tricky function named sort_it inside.
I will help you rewrite this script in [pure] bash, if you wish. It is quite simple; the only exception is that function. I don't know of any ready-made tool that does the same, so it needs to be implemented (either this or something else to do the work).
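The Python script is not reproduced in this thread, so the following is only a guess at what a pure-bash stand-in for such a helper could look like, assuming its job is to reduce a word to its distinct letters in sorted order. The names sorted_letters and is_pangram are invented for this sketch, and words are assumed to be lowercase a-z.

```shell
#!/bin/bash
# Hypothetical helper: print the distinct letters of a word in alphabetical order.
# This is NOT the sort_it from the Python script, only a sketch of the idea.
sorted_letters() {
    local word=$1 out='' c
    for c in {a..z}; do
        # Append each letter of the alphabet that occurs in the word.
        [[ $word == *"$c"* ]] && out+=$c
    done
    printf '%s\n' "$out"
}

# A word uses exactly the hive letters (all of them, and no others)
# when the word and the hive reduce to the same string.
is_pangram() {
    [[ $(sorted_letters "$1") == "$(sorted_letters "$2")" ]]
}
```

With this, is_pangram opulent luenopt succeeds and is_pangram pollute luenopt fails, because pollute lacks the n.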
Thank you, all, for references to Python. I don't know that language and am still working toward mastery of Linux commands such as grep.
I wrote a solution to this "hive" problem in awk. I'll post that for review and comment after arriving at an optimal version of the solution shown in post #1 of this thread.
For the second step, you could also do something like this:
Code:
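#!/bin/bash
wordlist=/usr/share/dict/words
hivestring=luenopt
# Split the hive string into an array of single letters.
declare -a hive=( $(sed 's/./& /g' <<<"$hivestring") )
# Keep words of 5+ letters made only of hive letters...
grep -E "^[$hivestring]{5,}$" "$wordlist" |
while read -r word
do
    # ...and print a word only if every hive letter occurs in it.
    for letter in "${hive[@]}"
    do
        [[ $word =~ $letter ]] && continue 1 || continue 2
    done
    echo "$word"
done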
We are getting closer to an ideal solution!
This code ...
Code:
WordList='/usr/share/dict/words'
hive='luenopt'
echo 'Words which use only the letters in "'$hive'"'
echo ' and contain the letter "'${hive:0:1}'".'
grep "${hive:0:1}" "$WordList" \
|grep -E "^[$hive]{5,}$" >"$Only"
cat "$Only"
echo; echo 'Words which use all of the letters in "'$hive'".'
grep -v -P '(.).*\1' <"$Only" \
|sed -n '/^.\{7\}/p' \
>"$All"
cat "$All"
... produced this result ...
Code:
Words which use only the letters in "luenopt"
and contain the letter "l".
elope
letup
lotto
nettle
opulent
outlet
pellet
people
pollen
pollute
pullet
pullout
topple
tulle
tunnel
Words which use all of the letters in "luenopt".
opulent
To polish this apple even more,
- can the two grep commands in step 1 be combined?
- can the grep regex in step 2 be changed to produce
only words of >6 characters, and then eliminate the sed?
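One possible answer to both questions, offered as a sketch rather than a definitive solution: PCRE lookaheads (grep -P, already used above) let a single pattern say "contains this letter" and "uses only these letters" at once. The function names hive_only and hive_all and the word-list argument are made up for this example, and the hive is assumed to hold plain lowercase letters with no regex metacharacters. Unlike the grep -v -P '(.).*\1' filter, the step 2 pattern here also accepts words that use every hive letter and repeat some of them.

```shell
#!/bin/bash
# Sketch only: function names and arguments are illustrative.
# Requires a grep with PCRE support (-P), e.g. GNU grep.

# Step 1: words of 5+ letters drawn only from the hive ($1)
# that contain its first letter somewhere.
hive_only() {
    local hive=$1 wordlist=$2
    grep -P "^(?=.*${hive:0:1})[$hive]{5,}\$" "$wordlist"
}

# Step 2: words drawn only from the hive in which every hive
# letter appears at least once (one lookahead per letter).
hive_all() {
    local hive=$1 wordlist=$2 lookaheads='' i
    for (( i = 0; i < ${#hive}; i++ )); do
        lookaheads+="(?=.*${hive:i:1})"
    done
    grep -P "^${lookaheads}[$hive]+\$" "$wordlist"
}
```

For example, hive_all luenopt /usr/share/dict/words would list the step 2 words. Requiring all seven letters already implies a length of at least 7, so no separate length test (and no sed) is needed.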