key=deront
sed -rn "h; s/.*/&#$key/;:a s/(.)(.*#.*)\1/\2/;ta;/[^#]/!{g;p}" <$InFile
Now that is just elegant simplicity!
I use sed every day and think I am proficient, but I had to pull the book off the shelf and dust off the brain cell to follow this - and it isn't even obscure!
Code:
sed -rn
h;
s/.*/&#$key/;
:a
s/(.)(.*#.*)\1/\2/;
ta;
/[^#]/!{g;p}
My compliments sir! Simple sed well applied!
Last edited by astrogeek; 09-22-2015 at 04:14 PM.
Reason: Added indented code block
I concocted this problem as a learning exercise, and used a dictionary file as the InFile. Now, having a closer look at it, I realize it identifies all anagrams of the Key Word which are English words.
I integrated the superb solution posted by millgates and am pleased with the brevity and speed. For those who might like to play with it, here is my program in its entirety.
Code:
#!/bin/bash
# Daniel B. Martin Sep15
# To execute this program, launch a terminal session and enter:
# bash /home/daniel/Desktop/LQfiles/dbm1502.bin
#
# Find all anagrams of a user-specified Key Word which are English words.
# Keywords: anagram anagrams
# File identification
Path=${0%.*}
OutFile=$Path"out.txt"
# This European Scrabble word list was downloaded from:
# http://www.freescrabbledictionary.com/sowpods/download/sowpods.txt
WordList="/home/daniel/Desktop/LQfiles/sowpods.txt"
# Prompt for user input.
echo; echo -n "Enter a Key Word ==> "; read KW
# For debugging convenience: the default value of KW is "lotipac".
if [ "$KW" == "" ]; then KW='lotipac'; fi
# Method of LQ member millgates.
grep "^$(tr "a-z" "." <<<$KW)$" $WordList \
|sed -rn "h; s/.*/&#$KW/;:a s/(.)(.*#.*)\1/\2/;ta;/[^#]/!{g;p}" >$OutFile
echo "Anagrams of" $KW "are:"; cat $OutFile; echo "End Of File ("$(wc -l <$OutFile)" lines)"
echo; echo "Normal end of job."; echo; exit
Suggested improvements are welcomed.
This is a sample execution ...
Code:
daniel@daniel-desktop:~$ bash /home/daniel/Desktop/LQfiles/dbm1502.bin
Enter a Key Word ==> lotipac
Anagrams of lotipac are:
capitol
coalpit
optical
topical
End Of File (4 lines)
Normal end of job.
daniel@daniel-desktop:~$
The original problem statement specified 6-character words. This implementation is flexible in that respect. Here is a sample execution with a 5-character Key Word.
Code:
daniel@daniel-desktop:~$ bash /home/daniel/Desktop/LQfiles/dbm1502.bin
Enter a Key Word ==> redoc
Anagrams of redoc are:
coder
cored
credo
decor
End Of File (4 lines)
Normal end of job.
daniel@daniel-desktop:~$
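The flexibility comes from the grep pattern being built from the Key Word itself: tr turns every letter into a dot, so the pattern always has as many dots as the Key Word has letters. A small sketch of just that step (the function name length_pattern and the sample words are illustrative only, not part of the program above):

```shell
# Hypothetical helper: build the length filter that sits in front of sed.
# Every lowercase letter of the Key Word ($1) becomes a regex dot.
length_pattern() {
    printf '^%s$\n' "$(printf '%s' "$1" | tr 'a-z' '.')"
}

length_pattern redoc      # prints ^.....$
length_pattern lotipac    # prints ^.......$

# Only the five-letter candidates (coder, cored) survive the filter.
printf '%s\n' coder decoder cored code | grep "$(length_pattern redoc)"
```

The filter only checks length; deciding whether a survivor really is an anagram is still left to the sed step.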
key=deront
sed -rn "h; s/.*/&#$key/;:a s/(.)(.*#.*)\1/\2/;ta;/[^#]/!{g;p}" <$InFile
I hereby join the choir of praise. My hat is off.
Although I use sed (almost) daily, this is completely incomprehensible in its brilliance, and perhaps also the reason why so many geeks have long and unruly beards.
You've probably figured it out by now, but anyway... astrogeek nicely split the code into lines, so let's just add a few comments:
Code:
sed -rn
h; # store the original pattern in hold space; we will need it later
s/.*/&#$key/; # append a # and the key to the pattern
:a # start loop
s/(.)(.*#.*)\1/\2/; # find a pair of identical characters with a # between them,
# i.e. one in the pattern and the other in the key, and delete both
ta; # repeat while a substitution was made; the loop ends when no pair is left
/[^#]/!{g;p} # at this point, if the string still contains anything other than a #,
# the characters in the two parts (the key and the pattern) did
# not match up. Otherwise, copy the original pattern
# back from the hold space and print it.
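To watch the loop at work, here is a minimal sketch that prints the pattern space after every substitution. The function name trace and the test word rodent are my own illustration, not part of the posted solution, and it assumes a sed that accepts -E (GNU or BSD).

```shell
# Hypothetical helper: show each state of the pattern space for one word.
# $1 = candidate word, $2 = key. Same substitution as the one-liner, with
# the p flag added so every successful deletion is printed.
trace() {
    printf '%s\n' "$1" | sed -En "
        s/.*/&#$2/p
        :a
        s/(.)(.*#.*)\1/\2/p
        ta
    "
}

trace rodent deront
# rodent#deront
# odent#deont
# dent#dent
# ent#ent
# nt#nt
# t#t
# #
```

A word that is not an anagram, such as rodeos, stalls at os#nt, so the /[^#]/! test fails and nothing is printed.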
This post describes an exploration of performance enhancers.
Program dbm1503A is the excellent one-liner posted by millgates.
Program dbm1503B is the same sed preceded by a grep which eliminates all InFile lines which are not of the same length as the Key Word.
Program dbm1503C is the same as dbm1503B with code added to weed out InFile lines which contain letters not present in the Key Word. There ought to be a way to combine the tr and grep into a single command but I wasn't able to figure out the syntax. Suggestions are invited.
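One possible way to fold the tr and grep pair into a single command is a negated bracket expression: grep -v "[^$KW]" drops every line that contains a character not found in the Key Word. Going a step further, grep -Ex with a repeat count can apply the length test and the letter test together. This is only a sketch. It assumes the Key Word consists of plain lowercase letters (no ], ^ or -), and the function names are made up for the example.

```shell
# Drop lines that contain any character not present in the Key Word ($1).
# Intended as a replacement for the tr | grep -v "~" pair.
only_key_letters() {
    grep -v "[^$1]"
}

# Length filter and letter filter in one grep: exactly ${#1} characters,
# each drawn from the letters of the Key Word. -x anchors the whole line.
same_length_and_letters() {
    grep -Ex "[$1]{${#1}}"
}

printf '%s\n' capitol optical plastic cat | only_key_letters lotipac
printf '%s\n' capitol optical plastic cat | same_length_and_letters lotipac
```

Neither filter counts letters, so a word such as capital still gets through and the sed step is still needed to make the final decision.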
The time for a single execution is not perfectly repeatable so I tried to even things out by using a "do it 5 times" loop in each program.
The programs are ...
Code:
#!/bin/bash
# Daniel B. Martin Sep15 dbm1503A
Path=${0%.*}
OutFile=$Path"out.txt"
WordList="/home/daniel/Desktop/LQfiles/sowpods.txt"
KW='lotipac'
echo "Program dbm1503A... Method of LQ member millgates as originally posted."
COUNTER=0
until [ $COUNTER -eq 5 ]; do
sed -rn "h; s/.*/&#$KW/;:a s/(.)(.*#.*)\1/\2/;ta;/[^#]/!{g;p}" $WordList >$OutFile
let COUNTER++
done
echo "Normal end of job."; echo; exit

#!/bin/bash
# Daniel B. Martin Sep15 dbm1503B
Path=${0%.*}
OutFile=$Path"out.txt"
WordList="/home/daniel/Desktop/LQfiles/sowpods.txt"
KW='lotipac'
echo "Program dbm1503B... Method of LQ member millgates with one improvement."
COUNTER=0
until [ $COUNTER -eq 5 ]; do
grep "^$(tr "a-z" "." <<<$KW)$" $WordList \
|sed -rn "h; s/.*/&#$KW/;:a s/(.)(.*#.*)\1/\2/;ta;/[^#]/!{g;p}" >$OutFile
let COUNTER++
done
echo "Normal end of job."; echo; exit

#!/bin/bash
# Daniel B. Martin Sep15 dbm1503C
Path=${0%.*}
OutFile=$Path"out.txt"
WordList="/home/daniel/Desktop/LQfiles/sowpods.txt"
KW='lotipac'
echo "Program dbm1503C... Method of LQ member millgates with two improvements."
COUNTER=0
until [ $COUNTER -eq 5 ]; do
grep "^$(tr "a-z" "." <<<$KW)$" $WordList \
|tr "$(tr -d "$KW" <<<"abcdefghijklmnopqrstuvwxyz")" "~" \
|grep -v "~" \
|sed -rn "h; s/.*/&#$KW/;:a s/(.)(.*#.*)\1/\2/;ta;/[^#]/!{g;p}" >$OutFile
let COUNTER++
done
echo "Normal end of job."; echo; exit
These are the timings ...
Code:
daniel@daniel-desktop:~$ time bash /home/daniel/Desktop/LQfiles/dbm1503A.bin
Program dbm1503A... Method of LQ member millgates as originally posted.
Normal end of job.
real 1m22.372s
user 1m22.333s
sys 0m0.028s
daniel@daniel-desktop:~$ time bash /home/daniel/Desktop/LQfiles/dbm1503B.bin
Program dbm1503B... Method of LQ member millgates with one improvement.
Normal end of job.
real 0m7.382s
user 0m10.229s
sys 0m0.048s
daniel@daniel-desktop:~$ time bash /home/daniel/Desktop/LQfiles/dbm1503C.bin
Program dbm1503C... Method of LQ member millgates with two improvements.
Normal end of job.
real 0m1.358s
user 0m1.404s
sys 0m0.072s
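Since each program runs its pipeline five times, dividing the real times by five gives a rough per-run figure, and dividing A's time by the others gives the speed-up. A quick awk sketch of that arithmetic, using the real times above converted to seconds (the function names are just for this example):

```shell
# Per-run time: total seconds ($1) divided by the 5 loop iterations.
per_run() { awk -v t="$1" 'BEGIN { printf "%.2f\n", t / 5 }'; }

# Speed-up of a faster program ($2) relative to a slower one ($1).
speedup() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'; }

per_run 82.372          # dbm1503A: 16.47
per_run 7.382           # dbm1503B: 1.48
per_run 1.358           # dbm1503C: 0.27
speedup 82.372 7.382    # A vs B: 11.2
speedup 82.372 1.358    # A vs C: 60.7
```

The per-run figures include the cost of starting each process, so treat them as approximate.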
Quote:
In your original post you said you don't like loops but that solution has loops.
Post #1 said, "As a matter of personal coding style I strive to avoid explicit loops." That's true. Strive = Try, and I tried. I was unable to create a no-loop solution.
Quote:
Also you say "(so far)" but this thread is marked solved.
True. I (reluctantly) accepted the idea that there is no no-loop solution, and marked the thread SOLVED. However I will be delighted if a no-loop solution is posted.
Quote:
I didn't see the code but the perl version is listed as the fastest solution.
I don't know perl. I tried to run (and time) the posted perl solution but failed with a syntax error. Maybe I'll learn perl and python some day. At present I am still learning awk and the many powerful Linux commands.
Quote:
I do not see the wordlist??? Did I miss it?
This was given in post #17, in the code. To repeat it here ...
Code:
# This European Scrabble word list was downloaded from:
# http://www.freescrabbledictionary.com/sowpods/download/sowpods.txt
WordList="/home/daniel/Desktop/LQfiles/sowpods.txt"
Three variations based on the excellent sed solution posted by millgates, together with timings, are shown in post #22. Note that my timings used a "do it 5 times" loop. Keep this in mind if you make timings on your machine.
If you come up with something even better please post it here. We learn from each other!