LinuxQuestions.org


arrenlex 02-28-2006 12:17 AM

Bash shell script - variable magic needed (no idea how to approach this)
 
I'm trying to write a shell script that would determine the rhyme scheme for a poem (ABAB, AABB, ABCD, whatever).

I've already written a shell script that determines whether two words rhyme and returns a proper exit code for me to work with.

I've already written a shell script that strips the last word out of each line of a given block of text and saves these last words as a new file, for easy working.

What I'm having problems with is checking rhymes against previous words to see if a line rhymes with something before it. Let me explain.

For example, a simple poem, rhyme scheme ABCB:

Code:

On a couch
There was a cat
It was black
And caught a rat

So, running my stripping script, I get:
Code:

couch
cat
black
rat

Saved to the file /tmp/lines

Now the ideal thing would be to pass "couch" through the analysis script, and the script would assign the word "couch" as the variable "A" (for A rhyme). It would then activate a variable that makes it check all subsequent words against previous words, for rhymes. So in essence, this is what happens:

1. "couch" is the first word. It is therefore A.
2. "cat" is the second word. Check for rhymes with any previous lines. "cat" does not rhyme with "couch" so it is not A. It is therefore B.
3. "black" is the third word. Check for rhymes with any previous lines. "black" does not rhyme with "couch", so it is therefore not A. "black" does not rhyme with "cat", so it is therefore not B. It is therefore C.
4. "rat" is the fourth word. Check for rhymes with any previous lines. "rat" does not rhyme with "couch". It is therefore not A. "rat" DOES rhyme with "cat". It is therefore B.

The script should be something like this:
Code:

#!/bin/sh
while read word; do
if [ "$begin" != "1" ] ; then #not yet begun, this is A
  echo "A $word"
  begin="1"
  A=${word}
else
  Determine what rhyme patterns have already been initialised.
  Check $word for rhyme with $[ABCDEFGHIJK....whatever it's up to now.]
  If rhymes with a previous line, assign whatever letter we're up to that line.
  If does not rhyme with a previous line, assign the next letter we're up to to that line.

fi
done

The output I'm looking to eventually end up with, if the cat poem is in the file "poem", is:
Code:

$./script poem
A On a couch
B There was a cat
C It was black
B And caught a rat

Can any of you supergenius gurus help me? Please keep it simple, too, because I'm rather new to shell scripting. Thanks! =)

dubya 02-28-2006 08:55 AM

I'm no genius, but my first thought was to create two arrays; one holding the word, and another holding the rhyme type (A, B, C...) so that the indices line up. That way, you have:
Code:

words[0]=couch   rhyme_name[0]=A
words[1]=cat     rhyme_name[1]=B
words[2]=black   rhyme_name[2]=C
words[3]=rat     rhyme_name[3]=B

When you read a new word, just loop through all the words and, if one rhymes, give the new word the same rhyme_name as the rhyming word; otherwise, assign it the next letter. You could probably use ASCII values and just increment and convert as you go.
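To make the two-array idea concrete, here is a minimal bash sketch. It only shows how the indices line up and how an ASCII code can be turned into a letter; the rhyme decisions are hard-coded, and ascii_to_char is a made-up helper name, not something from the scripts in this thread.

```shell
#!/bin/bash
# Illustrative only: two arrays whose indices line up, plus a helper that
# turns an ASCII code into its character (65 -> A, 66 -> B, ...).

# ascii_to_char is a hypothetical helper name.
ascii_to_char() {
    # Build an octal escape such as \101 and let printf expand it.
    printf "\\$(printf '%03o' "$1")"
}

words=(couch cat black rat)
rhyme_name=()

next_code=65                                   # ASCII code for 'A'
rhyme_name[0]=$(ascii_to_char "$next_code")    # couch gets A
next_code=$((next_code + 1))
rhyme_name[1]=$(ascii_to_char "$next_code")    # cat gets B
next_code=$((next_code + 1))
rhyme_name[2]=$(ascii_to_char "$next_code")    # black gets C
rhyme_name[3]=${rhyme_name[1]}                 # rat reuses cat's letter

# Walk both arrays by index and print them side by side.
for i in "${!words[@]}"; do
    echo "${rhyme_name[i]} ${words[i]}"
done
```

Incrementing the code and converting it is what "increment and convert as you go" amounts to; a lookup string of letters would work just as well.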

Maybe I misunderstood your question and am way off, but I hope this helps.

arrenlex 02-28-2006 12:17 PM

Quote:

Please keep it simple, too, because I'm rather new to shell scripting. Thanks! =)
Arrays, eh? Cycle through the words. Increment ASCII values. It's all Greek to me. xD Could you explain that a bit more clearly? Maybe with examples? Thank you for helping.

dubya 02-28-2006 02:18 PM

Oh, I assumed you have some programming experience since normally people who are writing scripts have some sort of programming background; most of the things I mentioned would have been covered in a first year programming course. I can't really explain all of the things I mentioned in great detail because that would take a while. Instead, it might be a good idea to invest some time in reading a bash scripting tutorial. Particularly, you'd want to focus on sections involving arrays and for loops. A Google search should return plenty of good ones.

I'll try to be more clear, but you may still not understand until you read up on arrays and for loops. Here's the algorithm of my general idea.
Code:

# i represents the current line starting at 0
# word is an array of words from the end of the lines
# word_rhyme is an array holding the rhyme type of that line, ie. 'A', 'B', etc.
# current_rhyme is simply used to keep track of what new rhyme type we are on

current_rhyme <= 'A'

for i <= 0 upto number_of_lines
start
    if word[i] rhymes with any other preceding word
    then
        word_rhyme[i] <= word_rhyme[rhyming_word_index]
    else
        word_rhyme[i] <= current_rhyme
        current_rhyme <= current_rhyme + 1  (e.g. 'A' becomes 'B')
    end if
end

Please, please note that this is NOT a bash script that I expect to run, just a general algorithm of my idea. And for a quick explanation on ASCII values, go here. They may not be necessary to know, but if you want to use letters and make your input poems big, you probably will need to know this. I hope all this helps.
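For anyone who wants to see the algorithm above actually run, here is a rough bash sketch. The big assumption is rhymes(): it is a crude stand-in that only compares the last two letters of each word, there purely so the example is self-contained. label_words and the variable names are also made up for illustration.

```shell
#!/bin/bash
# Sketch of the labelling loop. rhymes() is a crude stand-in (same last two
# letters) used only so the example runs; it is not a real rhyme test.
rhymes() {
    local a=$1 b=$2
    [ "${#a}" -ge 2 ] && [ "${#b}" -ge 2 ] && [ "${a: -2}" = "${b: -2}" ]
}

# label_words is a hypothetical function: one argument per line-ending word.
label_words() {
    local letters=ABCDEFGHIJKLMNOPQRSTUVWXYZ   # runs out after 26 rhymes
    local -a word word_rhyme
    local current_rhyme=0 i j w
    for w in "$@"; do
        i=${#word[@]}
        word[i]=$w
        word_rhyme[i]=
        # Compare against every preceding word; stop at the first rhyme.
        for ((j = 0; j < i; j++)); do
            if rhymes "${word[j]}" "$w"; then
                word_rhyme[i]=${word_rhyme[j]}
                break
            fi
        done
        # No earlier rhyme: take the next unused letter.
        if [ -z "${word_rhyme[i]}" ]; then
            word_rhyme[i]=${letters:current_rhyme:1}
            current_rhyme=$((current_rhyme + 1))
        fi
        echo "${word_rhyme[i]} $w"
    done
}

label_words couch cat black rat
```

Swapping rhymes() for a call to a real rhyme checker that signals a match with exit code 0 should leave the rest of the loop unchanged.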

arrenlex 03-01-2006 08:32 PM

Thank you for helping.

First year programming course? I'm only in high school.

Took half a year of beginner ("Hello World"-equivalent) Visual Basic programming two years ago, and after that I've been finding out by myself, leaning heavily on Google. Never read a single Bash tutorial at all; again, things I've found out from trial and error, some logic, and a lot of Google.

Quote:

if word[i] rhymes with any other preceding word
How would this part be achieved? Also, is "word[i]" actual working notation for this "array" doohickey or is it just something you've inserted to illustrate the point?

Thanks again.

dubya 03-01-2006 09:07 PM

That's cool that you're learning this early; I didn't start bash scripting until a few months ago. Anyway, word is the name of the array. This can be anything, whatever you declare it to be. i is just the index used to access that element of the array. Check this out on bash script arrays. Arrays and "for" loops are what you'll need to find out if a rhyme exists with a preceding word. Essentially, you count from the first line up to the current one and use that count to access elements of the array and compare them with the current word.
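A tiny bash sketch of just the array and loop mechanics described here, with no rhyme checking at all. The names word, current and i are arbitrary.

```shell
#!/bin/bash
# Bash array basics; the variable names are arbitrary.
word=(couch cat black rat)     # indices start at 0

echo "${word[1]}"              # one element, picked by index
echo "${#word[@]}"             # how many elements the array holds

# Count from the first line up to (but not including) the current one,
# using the counter to reach each earlier element.
current=3
for ((i = 0; i < current; i++)); do
    echo "compare ${word[current]} with ${word[i]}"
done
```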

I know this sounds really confusing, but it's too difficult for me to explain without going on forever explaining arrays and for loops.

chrism01 03-02-2006 12:30 AM

Here's a couple of good links to learn shell prog with:
http://rute.2038bug.com/index.html.gz
http://www.tldp.org/LDP/abs/html/index.html
Also prob worth buying the O'Reilly book if you are serious.

arrenlex 03-02-2006 12:34 AM

Thanks for those links, but I've never in my life actually sat down and RTFM. I've always glanced at the first two pages, started trying stuff therein, running into problems and googling for the solutions. That's what I did with dubya's link; found the word "array" in the page, glanced at the code example, and made a simple shell script to discover how arrays worked. Already got my shell script working in a rudimentary form; trying to squash an odd bug, now. I'm not a manual person, that's just not how I do things. Give me a heavily commented example, a temp directory, and Google, and I'll figure it out.

Thanks, though.

arrenlex 03-02-2006 05:41 PM

It's done! It works perfectly now. It just correctly transcribed Edgar Allan Poe's The Bells, the same as I did it by hand and about ten times as quickly!

I realise this code is probably very awkward, naïve and temperamental. It would be great if some of you told me how to clean it up a bit.

I don't suppose there's any chance it could be made into some kind of cool web java\php\whatever application you could use on the internet? =D *nudge nudge*

I do have one serious question, though. I had to use kdialog in one instance because it wouldn't let me use the read command. Why did it not let me use the read command? It just kept moving forward with the next line, instead of pausing, using that line as the "input" for the read command, which of course screws everything up. I realise not all of you have kdialog, so it is a problem. How can I get rid of it?

Anyway, it consists of four scripts. They must all be in the same directory and are run as "./wrapper <poemfile>", where poemfile is a plain-text poem that contains no lines you don't want scanned into the rhyme scheme, apart from blank lines (i.e. no lines marking out stanza IV and such). Two options are allowed: -r causes identical words to be treated as NON-RHYMING, and -c makes the indexing of the poem continuous rather than restarting at each stanza. To use both, combine them as -rc. --help or -h is also available.

To be saved as "linestrip": strips the last word out of a line and removes punctuation. I had problems integrating this into the process script because it didn't like the ` character that's in the first sed statement when used as "stripped=`echo $line | sed ..........so on`" I would love it if someone told me how to integrate it.
Code:

#!/bin/sh
# Delete every ASCII punctuation character (the four ranges cover them all), then print the last word
echo "$@" | sed 's/[!-/:-@[-`{-~]//g' | awk '{print $NF}'
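On the integration problem: inside `...` a backtick always ends the command substitution, even when it sits inside single quotes, which is why the sed pattern breaks there. The $(...) form does not have that problem. A minimal sketch, assuming a simplified punctuation list and a made-up function name last_word:

```shell
#!/bin/bash
# last_word is a hypothetical function standing in for ./linestrip.
last_word() {
    local stripped
    # The backtick inside the sed pattern is harmless within $(...).
    stripped=$(printf '%s\n' "$1" | sed 's/[`!?.,;:]//g' | awk '{print $NF}')
    printf '%s\n' "$stripped"
}

line='And caught a rat!'
word=$(last_word "$line")
echo "$word"
```

The same $(...) wrapping should work around the full pattern from linestrip, so the separate script would no longer be needed.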

To be saved as "rhyme": determines whether two words rhyme using rhyme.poetry.com
Code:

#!/bin/sh

mkdir /tmp/.rhyme 2> /dev/null

Word1=$1
Word2=$2

if [ "$Word1" = "$Word2" ] && [ "$IDENT" != "norhyme" ] ; then #Words are identical and -r (IDENT=norhyme) is not set. Rhyme.
        exit 0
fi

if [ ! -e /tmp/.rhyme/rough.$Word1 ] ; then #file does not yet exist for word 1; download it
        wget -q -O/tmp/.rhyme/rough.$Word1 "http://rhyme.poetry.com/r/rhyme.cgi?Word=${Word1}&typeofrhyme=perfect&org1=syl&org2=sl&cbr=pc"
fi

if grep "no perfect rhymes were found" /tmp/.rhyme/rough.$Word1 >/dev/null ; then # Word 1 was found, but has no rhymes
        exit 2
fi

if grep "not found in this dictionary" /tmp/.rhyme/rough.$Word1 >/dev/null ; then # Word 1 not found
        exit 2
fi

cat /tmp/.rhyme/rough.$Word1 | grep "r/rhyme.cgi?cbr=pc&Word=" | sed 's/<[^>]*>//g' | sed 's/,//g' | sed 's/ //g' > /tmp/.rhyme/rhyme.$Word1

if cat /tmp/.rhyme/rhyme.$Word1 | grep -ix "$Word2">/dev/null ; then # Rhymes
        exit 0
else
        if [ ! -e /tmp/.rhyme/rough.$Word2 ] ; then #file does not yet exist for word 2; download it
                wget -q -O/tmp/.rhyme/rough.$Word2 "http://rhyme.poetry.com/r/rhyme.cgi?Word=${Word2}&typeofrhyme=perfect&org1=syl&org2=sl&cbr=pc"
        fi

        if grep "no perfect rhymes were found" /tmp/.rhyme/rough.$Word2 >/dev/null ; then # Word 2 was found, but has no rhymes.
                exit 3
        fi

        if grep "not found in this dictionary" /tmp/.rhyme/rough.$Word2 >/dev/null ; then # Word 2 not found
                exit 3
        fi

# If we got to this point okay, then words both exist but do not rhyme
        exit 1
fi

To be saved as "process": does the grunt work. Uses all other scripts except wrapper.
Code:

#!/bin/bash
# (bash rather than sh: the arrays used below are not available in a plain POSIX sh)

#Set up array necessary for conversion between letters and numbers, so that letter[1]=A etc.
for subject in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 ; do
        number=`expr index ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz123456789 $subject`
        letter[number]=$subject
done

#Begin processing
while read line; do                        #while reading lines, each read as variable "$line", do:

        word=`./linestrip $line`        #strip the last word from the line and remove punctuation

if [ "$word" = "" ] ; then                # A blank line; skip it
        if [ "$MODE" = "continuous" ] ; then
                echo ""
        else
                echo ""
                begun=0
        fi
               
else                                        # Not a blank line; process it
        if [ "$begun" != "1" ] ; then                # analysis not yet begun, this line is index 1 (A)
                        echo "A $line"                #echo word with A in front of it
                              begun="1"                #initialise $begun variable so this block is not run again
                        index[1]="${word}"        #define $index as an array; value 1 is the word under A
                        index_last="1"                #last existing index letter is 1 at first line
                        current_position="1"        #position in poem of the word currently being looked at (is 1)

        else                            # Analysis has already begun
                current_position=`expr $current_position + 1`        # Move down a line to look at the next word
                match_found=0                                        # Reset the match_found variable so as not to confuse the until loop
                index_being_scanned=1                                # Define the word to be scanned against (initially) as the first word

                until [ "$index_being_scanned" -gt "$index_last" -o "$match_found" = "1" ] ; do                # until the scanning runs past the end of existing indices or a match is found, do:
                        ./rhyme ${index[index_being_scanned]} $word                                      # use rhyme script to determine if our current word rhymes with the word under the index being scanned
                        returncode=$?                                                                        # and get the return code of this operation
                        if [ "$returncode" = "0" ] ; then                                        # The word we are looking at rhymes with the index we are scanning. Use this index for our word.
                                index_letter=${letter[index_being_scanned]}                        # Convert numerical index to letter using array
                                echo "$index_letter $line"                                        # echo the index found to rhyme, followed by the word we found that rhymes with it
                                match_found=1                                                                # Trip the match found switch for the until loop to move on

                        elif [ "$returncode" = "1" ] ; then                                        # The word we are looking at does not rhyme with the index we are scanning. Scan the next index.
                                index_being_scanned=`expr $index_being_scanned + 1`                        # Increase index being scanned by 1 to move on to next index

                        elif [ "$returncode" = "2" ] ; then                                        # Word in index was not found; offer an alternate search
                                alternate=`kdialog --inputbox "${index[index_being_scanned]} was not found in the rhyming dictionary. Enter a word rhyming with ${index[index_being_scanned]} for an alternate search, or cancel button to assign next index."` # Use kdialog and ask user for an alternate search (why does read not work here??)
                                if [ "$alternate" = "" ] ; then                                        # User decided to assign next index
                                        index_last=`expr $index_last + 1`                                        # Increase the index of the last defined rhyming pattern by one
                                        index[index_last]="$word"                                                # The word under this new index is the word we found to be new
                                        index_letter=${letter[index_last]}                                        # Convert the index number into an index letter
                                        echo "$index_letter $line"                                                # Echo this new index followed by our word
                                        match_found=1                                                                # Pretend match was found to move onto the next line
                                else                                                                # User has entered a term
                                        index[index_being_scanned]="$alternate"                                # From now on, search with the alternate word under this index
                                        index_being_scanned=1                                                # Re-start search at first word
                                        match_found=0                                                        # Make sure the until loop runs again
                                fi
               
                        elif [ "$returncode" = "3" ] ; then                                        # Word we are looking at was not found; offer an alternate search
                                alternate=`kdialog --inputbox "${word} was not found in the rhyming dictionary. Enter a word rhyming with $word for an alternate search, or cancel button to assign next index."` # Use kdialog and ask user for an alternate search (why does read not work here??)
                                if [ "$alternate" = "" ] ; then                                        # User decided to assign next index
                                        index_last=`expr $index_last + 1`                                        # Increase the index of the last defined rhyming pattern by one
                                        index[index_last]="$word"                                                # The word under this new index is the word we found to be new
                                        index_letter=${letter[index_last]}                                        # Convert the index number into an index letter
                                        echo "$index_letter $line"                                                # Echo this new index followed by our word
                                        match_found=1                                                                # Pretend match was found to move onto the next line
                                else                                                                # User has entered a term
                                        word="$alternate"                                                # Search with the alternate word instead
                                        index_being_scanned=1                                                # Re-start search at first word
                                        match_found=0                                                        # Make sure the until loop runs again
                                fi       

                fi
                done

                if [ "$index_being_scanned" -gt "$index_last" ] ; then                                # End was reached without match being found; word rhymes with no previous index
                        index_last=`expr $index_last + 1`                                        # Increase the index of the last defined rhyming pattern by one
                        index[index_last]="$word"                                                # The word under this new index is the word we found to be new
                        index_letter=${letter[index_last]}                                        # Convert the index number into an index letter
                        echo "$index_letter $line"                                                # Echo this new index followed by our word
                fi
        fi
  fi
done
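A possible cleanup for the letter table at the top of "process": instead of filling an array with expr index in a loop, the label can be sliced straight out of the string. This is only a sketch; index_to_letter and labels are made-up names, and the label set is copied from the script above.

```shell
#!/bin/bash
# index_to_letter is a hypothetical helper; the label set mirrors "process".
labels=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz123456789

index_to_letter() {
    # Rhyme indices are 1-based, string offsets are 0-based.
    local offset=$(($1 - 1))
    echo "${labels:offset:1}"
}

index_to_letter 1     # first label
index_to_letter 27    # first lowercase label
```

With this, a line such as index_letter=${letter[index_last]} could become index_letter=$(index_to_letter "$index_last").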

To be saved as "wrapper": allows the analysis to be begun in the format "./wrapper $file" rather than "cat $file | ./process". Adds option capability.
Code:

#!/bin/sh

if [ "$1" = "-c" ] ; then
        export MODE=continuous
        file=$2

elif [ "$1" = "-r" ] ; then
        export IDENT=norhyme
        file=$2

elif [ "$1" = "-rc" -o "$1" = "-cr" ] ; then
        export MODE=continuous
        export IDENT=norhyme
        file=$2
       
elif [ "$1" = "--help" -o "$1" = "-h" ] ; then
        echo "Tool to determine rhyme scheme for a poem."
        echo "USAGE: ./wrapper [-c|-r|-rc] <poemfile>"
        echo "  where <poemfile> is a cleartext poem consisting only of lines you want included in the scheme."
        echo "  (i.e. no \"Stanza IV\" lines.)"
        echo ""
        echo "OPTIONS"
        echo "        -c                Continuous mode: don't reset index back to A at every stanza"
        echo "        -r                Do NOT treat repeated words as rhymes (so that a word does not rhyme with itself)"
        echo "        -rc or -cr        Activate both the -c and -r options; CANNOT be used as \"./wrapper -r -c [FILE]\""
        echo "        -h or --help        Display this text."
        echo ""
        echo "Written by Arren Lex on 2006-03-02; you are welcome to modify all portions under the terms of the GPL."
        echo ""
        exit 0
else
        file=$@
fi

cat $file | ./process
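One way to lift the "-r -c cannot be given separately" restriction is the shell's built-in getopts, which accepts -r -c, -rc and -cr alike. A minimal sketch: parse_options is a made-up function name, MODE and IDENT are the variables the wrapper already exports, and --help would still need its own check before the loop because getopts only understands single-letter options.

```shell
#!/bin/bash
# parse_options is a hypothetical function; it sets MODE, IDENT and file.
parse_options() {
    local opt OPTIND=1
    MODE= IDENT= file=
    while getopts "cr" opt; do
        case $opt in
            c) MODE=continuous ;;     # don't reset the index at each stanza
            r) IDENT=norhyme ;;       # identical words do not rhyme
            *) echo "usage: wrapper [-c] [-r] <poemfile>" >&2; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))             # drop the options, keep the file name
    file=$1
}

parse_options -r -c poem.txt
echo "MODE=$MODE IDENT=$IDENT file=$file"
```

After parsing, the wrapper would export MODE and IDENT and feed the file to ./process as before.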

Thanks again for your help, dubya! Your tip on the array thing was great. Again, any suggestions for kdialog?
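On the kdialog question, the likely cause: every command inside "while read line; do ... done" shares the loop's standard input, and here that input is the poem coming through the pipe. A plain read inside the loop therefore swallows the next line of the poem instead of waiting for the keyboard. One common fix is to feed the poem in on a different file descriptor so standard input stays free. A minimal sketch, with a made-up function name and prompt text:

```shell
#!/bin/bash
# ask_for_each_line is a hypothetical demo, not a drop-in patch for "process".
ask_for_each_line() {
    local poem_file=$1 line answer
    # The poem arrives on descriptor 3, leaving standard input free.
    while IFS= read -r line <&3; do
        printf 'Word rhyming with "%s"? ' "$line" >&2
        IFS= read -r answer            # reads standard input, not the poem
        echo "$line -> $answer"
    done 3< "$poem_file"
}
```

If the poem has to keep arriving on standard input, the other usual route is to point only the prompt at the terminal, as in: read answer < /dev/tty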


All times are GMT -5.