This is recreational programming. Just for "funsies." I already have a working solution using awk but it is large and clumsy. Maybe there is a clean solution using sed with a clever RegEx.
The input file is a word list, one word per line, all lower case, already in sorted order. Every word has a trailing blank.
I invite the user to enter a pattern of the form .hi..
This means find all five-letter words which have "hi" in positions 2-3. If the pattern is in variable w1 this is easily done with
Code:
grep "^$w1 " "$InFile"
The result is a list of words such as ...
Code:
chick
chide
chief
child
All have "hi" in positions 2-3.
Now the interesting part: it is desired to remove all those "hi" strings to produce
Code:
cck
cde
cef
cld
Remember, we are using a pattern so cannot hard-code the positions 2-3. Is there a slick way to do this?
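One possible approach, sketched in bash rather than sed: walk the pattern and keep only the letters that sit under a dot. The function name strip_fixed is made up for illustration. The sketch assumes the pattern holds only dots and literal letters, and that the word (without its trailing blank) is as long as the pattern.

```shell
#!/usr/bin/env bash
# strip_fixed PATTERN WORD   (hypothetical helper name)
# Prints WORD with every position that holds a literal letter in PATTERN removed.
strip_fixed() {
    local pattern=$1 word=$2 out='' i
    for ((i = 0; i < ${#pattern}; i++)); do
        # A dot in the pattern marks a free position: keep that letter.
        if [[ ${pattern:i:1} == "." ]]; then
            out+=${word:i:1}
        fi
    done
    printf '%s\n' "$out"
}

w1='.hi..'
for word in chick chide chief child; do
    strip_fixed "$w1" "$word"
done
```

Because the positions come from the pattern itself, nothing is hard-coded, and a non-contiguous pattern such as .h.i. is handled the same way. When the words come from the file, a plain `while read -r word` loop drops the trailing blank before the function sees it.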
The next question is the inverse. After fiddling around with the interim result it is desired to restore the "hi" in positions 2-3, with the "hi" in upper case.
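A minimal sketch of that inverse step, assuming the same pattern is still available when restoring. The name restore_fixed is hypothetical, and the `^^` upper-case expansion needs bash 4 or later.

```shell
#!/usr/bin/env bash
# restore_fixed PATTERN STRIPPED   (hypothetical helper name)
# Re-inserts the literal letters of PATTERN, in upper case, into STRIPPED.
restore_fixed() {
    local pattern=$1 stripped=$2 out='' i j=0 ch
    for ((i = 0; i < ${#pattern}; i++)); do
        ch=${pattern:i:1}
        if [[ $ch == "." ]]; then
            # Free position: take the next letter of the stripped word.
            out+=${stripped:j:1}
            j=$((j + 1))
        else
            # Fixed position: put the pattern letter back, upper-cased.
            out+=${ch^^}
        fi
    done
    printf '%s\n' "$out"
}

restore_fixed '.hi..' cde
```

With the pattern .hi.. the stripped word cde comes back as cHIde.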
Is the input pattern restricted to be of the form .hi...?
Must the letters in the input pattern be contiguous, or must .h.i.. and similar cases be handled as well?
Would the restoration be a separate command working on a file to which the result of the first operation was written? If so, would the user be required to give the input pattern, possibly different, as a separate operation?
Alternatively, should the original operation generate the uppercase replacement pattern and produce the restored result as part of the same process?
Perhaps it will help to describe the overall application. This coding challenge is inspired by a series of published word puzzles called Split Decisions. Refer to this example: http://www.macnamarasband.com/split/sd010.html
My present (working) solution might be an example of "doing an easy thing the hard way."
Start with these two strings: _hi__ and _ra__
Choose all word pairs from the input file which have the same letters in the blank positions. The list might be ...
Code:
a="."
b="hi"
c=".. "
sed "s/^\($a\)\($b\)\($c\)$/\1\3/g" filename
The sed looks good but I am unable to make it work. Embedded in a bash script, it looks like this ...
Code:
echo; echo "Method of LQ Guru pan64."
a="."
b="hi"
c=".. "
sed "s/^\($a\)\($b\)\($c\)$/\1\3/g" $InFile >$Work7
echo "The number of lines in InFile is" $(wc -l <$InFile)
echo "The number of lines in Work7 is " $(wc -l <$Work7)
... and produces this result ...
Code:
Method of LQ Guru pan64.
The number of lines in InFile is 119557
The number of lines in Work7 is 119557
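A likely explanation for the identical line counts: sed prints every input line by default, whether or not the substitution matched, so the output always has as many lines as the input. A minimal sketch, assuming the goal is to keep only the matching words: add -n to suppress the automatic printing and use the p flag so that only lines where the substitution succeeded are printed. The four sample words below stand in for $InFile.

```shell
#!/usr/bin/env bash
a="."
b="hi"
c=".. "
# Sample input: one word per line, each followed by a trailing blank.
# -n turns off automatic printing; the p flag prints only substituted lines.
filtered=$(printf '%s \n' chick chide crane white |
    sed -n "s/^\($a\)\($b\)\($c\)$/\1\3/p")
printf '%s\n' "$filtered"
```

Here crane is dropped because it does not match, while chick, chide and white come out as cck, cde and wte, each still carrying the trailing blank captured by the third group.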
Well, we could go all bash (for something different):
Code:
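#!/usr/bin/env bash
# Group words that share the same letters outside the fixed middle part.
declare -A found
a=.
b='hi|ra'
c=..
regex="^($a)($b)($c)$"
space=' '
# read with the default IFS strips the trailing blank from each word
while read -r line
do
    if [[ "$line" =~ $regex ]]
    then
        # key: the letters outside the fixed part
        ind=${BASH_REMATCH[1],,}${BASH_REMATCH[3]}
        # the word itself, with the fixed part in upper case
        word=${BASH_REMATCH[1],,}${BASH_REMATCH[2]^^}${BASH_REMATCH[3]}
        [[ -n "${found[$ind]}" ]] && found[$ind]="${found[$ind]} $word" || found[$ind]="$word"
    fi
done < linuxwords
# print only the keys that collected more than one word
for entry in "${found[@]}"
do
    [[ "$entry" =~ $space ]] && echo "$entry"
done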
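The values of a, b and c above are typed in by hand. As a rough sketch, they could instead be derived from the two puzzle strings _hi__ and _ra__ using parameter expansion. The variable names clue1, clue2, prefix, suffix, mid1 and mid2 are made up here, and the sketch assumes both strings have their fixed letters in one contiguous run at the same positions, with at least one blank on each side.

```shell
#!/usr/bin/env bash
clue1='_hi__'
clue2='_ra__'
prefix=${clue1%%[!_]*}     # leading underscores, here "_"
suffix=${clue1##*[!_]}     # trailing underscores, here "__"
# Strip the blanks on both sides to leave the fixed letters.
mid1=${clue1#"$prefix"}; mid1=${mid1%"$suffix"}
mid2=${clue2#"$prefix"}; mid2=${mid2%"$suffix"}
# Underscores become dots; the two fixed parts become alternatives.
regex="^(${prefix//_/.})($mid1|$mid2)(${suffix//_/.})$"
printf '%s\n' "$regex"
```

For these two strings the result is the same regex the script builds from a, b and c.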