LinuxQuestions.org


rewtnull 10-29-2011 04:03 AM

Compare array index number against element in bash
 
Hi.

Consider this snippet of code in bash:

Code:

shopt -sq nocasematch
intern=( A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 . , ? )
input=($(echo "apple" | sed -e 's/[[:alnum:].,\?]/& /g' ))

count="0"

for element1 in ${input[@]}; do
    for element2 in ${intern[@]}; do
        if [[ ${element1} == ${element2} ]]; then
            echo "index: ${count} element: ${intern[${count}]} input: ${input[${count}]}"
            break
        fi
    done
    (( count++ ))
done

What I would like to do is to somehow save the index numbers of my matches on $input (which has variable element contents), and then compare those index numbers against the index numbers of $intern. In this example, I would like 'apple' to become '0 15 15 11 4'. The reason I want to match against index numbers is that later on I'd like to substitute the results against elements in a third array, which does not contain regular alphabetical characters.

Is this possible, and if so, how could I achieve this?

I hope my attempt at an explanation makes sense :)

Cheers!

grail 10-29-2011 04:57 AM

Quote:

I hope my attempt at an explanation makes sense
Not for me :(

The following makes no sense to me:
Quote:

In this example, I would like 'apple' to become '0 15 15 11 4'
I understand that those are the index numbers of the letters in apple as related to the intern array, but I do not understand what you mean by 'become'.

You then mention a third array, so maybe you could explain the bigger picture, as your current approach may not necessarily be the best.

Nominal Animal 10-29-2011 06:07 AM

How about
Code:

#!/bin/bash

INPUT="Apple"
OUTPUT=()
CIPHER=""

OUTSET="@%/zyxwvutsrqponmlkjihgfedcba"

echo "Input: ${INPUT}"

while [ ${#INPUT} -gt 0 ]; do
    case "${INPUT:0:1}" in
    A|a) OUTPUT=("${OUTPUT[@]}"  0) ;;
    B|b) OUTPUT=("${OUTPUT[@]}"  1) ;;
    C|c) OUTPUT=("${OUTPUT[@]}"  2) ;;
    D|d) OUTPUT=("${OUTPUT[@]}"  3) ;;
    E|e) OUTPUT=("${OUTPUT[@]}"  4) ;;
    F|f) OUTPUT=("${OUTPUT[@]}"  5) ;;
    G|g) OUTPUT=("${OUTPUT[@]}"  6) ;;
    H|h) OUTPUT=("${OUTPUT[@]}"  7) ;;
    I|i) OUTPUT=("${OUTPUT[@]}"  8) ;;
    J|j) OUTPUT=("${OUTPUT[@]}"  9) ;;
    K|k) OUTPUT=("${OUTPUT[@]}" 10) ;;
    L|l) OUTPUT=("${OUTPUT[@]}" 11) ;;
    M|m) OUTPUT=("${OUTPUT[@]}" 12) ;;
    N|n) OUTPUT=("${OUTPUT[@]}" 13) ;;
    O|o) OUTPUT=("${OUTPUT[@]}" 14) ;;
    P|p) OUTPUT=("${OUTPUT[@]}" 15) ;;
    Q|q) OUTPUT=("${OUTPUT[@]}" 16) ;;
    R|r) OUTPUT=("${OUTPUT[@]}" 17) ;;
    S|s) OUTPUT=("${OUTPUT[@]}" 18) ;;
    T|t) OUTPUT=("${OUTPUT[@]}" 19) ;;
    U|u) OUTPUT=("${OUTPUT[@]}" 20) ;;
    V|v) OUTPUT=("${OUTPUT[@]}" 21) ;;
    W|w) OUTPUT=("${OUTPUT[@]}" 22) ;;
    X|x) OUTPUT=("${OUTPUT[@]}" 23) ;;
    Y|y) OUTPUT=("${OUTPUT[@]}" 24) ;;
    Z|z) OUTPUT=("${OUTPUT[@]}" 25) ;;
    0)  OUTPUT=("${OUTPUT[@]}" 26) ;;
    1)  OUTPUT=("${OUTPUT[@]}" 27) ;;
    2)  OUTPUT=("${OUTPUT[@]}" 28) ;;
    3)  OUTPUT=("${OUTPUT[@]}" 29) ;;
    4)  OUTPUT=("${OUTPUT[@]}" 30) ;;
    5)  OUTPUT=("${OUTPUT[@]}" 31) ;;
    6)  OUTPUT=("${OUTPUT[@]}" 32) ;;
    7)  OUTPUT=("${OUTPUT[@]}" 33) ;;
    8)  OUTPUT=("${OUTPUT[@]}" 34) ;;
    9)  OUTPUT=("${OUTPUT[@]}" 35) ;;
    '.') OUTPUT=("${OUTPUT[@]}" 36) ;;
    ',') OUTPUT=("${OUTPUT[@]}" 37) ;;
    '?') OUTPUT=("${OUTPUT[@]}" 38) ;;
    esac
    INPUT="${INPUT:1}"
done

echo "Numeric: ${OUTPUT[@]}"

for C in "${OUTPUT[@]}" ; do
    CIPHER="${CIPHER}${OUTSET:C:1}"
done

echo "Cipher: ${CIPHER}"


Juako 10-29-2011 11:26 PM

Recursive version. To test it, run it and type away, or send it input via a pipe:

Code:

me@localhost:~$ echo "apple" | thescript
Code:

#!/bin/bash
intern="ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,?"

match_idxs() {
    # only do something if we have a non-empty argument
    [[ "$1" ]] && {
        # use parameter substitution (substring expansion) to cut the first character and all the rest of our argument
        # save the resulting strings in $first and $rest
        local first=${1:0:1} rest=${1:1}

        # use parameter substitution (remove matching suffix pattern) to get a copy of $intern truncated from the
        # point where it matches "$first followed by anything"
        # ("$first" is quoted so that '?' is matched literally instead of acting as a glob wildcard)
        local match=${intern%%"$first"*}

        # use parameter substitution (parameter length) to echo the length of $match (its length will equal the index of $first in $intern).
        # Add the rest of the indexes to the echo, using command substitution to call ourselves over $rest.
        # $FUNCNAME is set by bash to whatever the current function is named (could have used match_idxs with the same result, but this way is more flexible).
        echo "${#match} $($FUNCNAME "$rest")"
    }
}

# Loop over our standard input, line by line until EOF. On each iteration, $line will hold current line.
# On each iteration, print indexes for every character of this line.
# We call the match_idxs function passing to it the line converted to all caps (parameter expansion - case modification)
while IFS= read -r line; do
    match_idxs "${line^^}"
done

Note on how match_idxs completes and ends:

Eventually, an instance of match_idxs will obtain a $first containing the last character of $line, and $rest will be empty. It will echo the position and make a final call to itself with an empty argument.

The instance of match_idxs called with an empty argument will return without calling itself again, thus provoking the return of all previous instances of match_idxs in "chain reaction", up to the very first. At that point $line will be processed entirely and the original instance of match_idxs will return to the main "while" loop.
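
The suffix-removal trick that match_idxs relies on can be tried in isolation. The following is only a minimal sketch: index_of is a hypothetical helper name, and returning -1 for a character that is missing from $intern is an assumption added for illustration, not something the script above does.

```bash
#!/bin/bash
# Sketch: find the index of one character in a lookup string using the
# "remove longest matching suffix" expansion. index_of is a hypothetical helper.
intern="ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,?"

index_of() {
    local char=$1
    # Quoting "$char" makes '?' and '*' match literally instead of as globs.
    local prefix=${intern%%"$char"*}
    if (( ${#prefix} == ${#intern} )); then
        # Nothing was removed, so the character is not in $intern.
        echo -1
    else
        # Only the text before the first occurrence remains; its length is the index.
        echo "${#prefix}"
    fi
}

index_of P    # prints 15
index_of '?'  # prints 38
index_of '#'  # prints -1
```

Without the quotes around "$char", a '?' in the input would be treated as a wildcard matching any character, and the reported index would be 0.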

catkin 10-29-2011 11:44 PM

Ingeniously neat, Juako :)

grail 10-30-2011 03:04 AM

Not sure we had to use extra functions:
Code:

#!/bin/bash

intern="ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,?"
line=${1^^}

for (( i = 0; i < ${#line}; i++ ))
do
    # quote the character so that '?' is matched literally, not as a glob wildcard
    match=${intern%%"${line:i:1}"*}
    echo -n "${#match} "
done

echo


David the H. 10-30-2011 06:41 AM

A slightly different solution, using an associative array.

Code:

#!/bin/bash

# populate the associative array, with each element holding the next sequential integer

declare -A index
n=0

# '?' is quoted so it cannot be expanded as a filename glob
for i in {A..Z} {0..9} . , '?'; do
    index[$i]=$(( n++ ))        #must use "$i" here, as it outputs a string.
done

# break the input string into individual characters, and add to a regular array

for (( i = 0; i < ${#1}; i++ )) ; do
    input[i]=${1:$i:1}
done

# For each element in the input array, add its corresponding associative array value to the output array.
# ${var^^} forces the values to uppercase first.

for i in "${input[@]^^}"; do
    output+=( "${index[$i]}" )
done

# print the output array.  ${arr[*]} allows the use of IFS to define the delimiting character.
# e.g. set IFS=$'\n' first to output one entry per line.

echo "${output[*]}"

exit 0

It's a bit longer, but I think the principle behind it is more flexible.
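
As a rough illustration of that flexibility, the same lookup table can feed the substitution into a third array that the original question mentions. This is only a sketch: the third array and its "<n>" tokens are made-up placeholders, and translate is a hypothetical function name.

```bash
#!/bin/bash
# Sketch only: "third" is a hypothetical stand-in for the asker's third array.
declare -A index
n=0
for c in {A..Z} {0..9} . , '?'; do
    index[$c]=$(( n++ ))
done

# Placeholder third array: one made-up token per index, e.g. third[15]="<15>".
third=()
for (( k = 0; k < n; k++ )); do
    third[k]="<$k>"
done

translate() {
    local word=${1^^} out=() i idx
    for (( i = 0; i < ${#word}; i++ )); do
        idx=${index[${word:i:1}]}
        # Skip characters that have no entry in the index table.
        [[ -n $idx ]] && out+=( "${third[idx]}" )
    done
    echo "${out[*]}"
}

translate apple   # prints <0> <15> <15> <11> <4>
```

Replacing the placeholder tokens with real values only requires changing how third is filled; the lookup itself stays the same.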

Juako 10-30-2011 10:52 AM

Quote:

Originally Posted by grail (Post 4511662)
Not sure we had to use extra functions:

Of course you don't *have* to, unless you want to use recursion AND processing stdin in the same script :)

In all fairness, however compact and beautiful recursion makes the code (IMO), I have to say that in Bash it is more expensive than iteration, mostly because Bash lacks tail call optimization. That means overhead and a risk of stack overflow if the input is too large (i.e. the command substitution has to spawn an entire bash process PER CHARACTER you are examining!!).

So, I have to recommend a loop unless you know you can afford the extra cost. In that sense grail's snippet is preferable, definitely.
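
The per-character cost comes from the command substitution used for the recursive call. A small sketch can make the nesting visible by printing Bash's BASH_SUBSHELL variable at each level; depth_probe is a hypothetical helper written for this illustration only.

```bash
#!/bin/bash
# Sketch: show that recursion through $(...) nests one subshell per call.
# depth_probe is a hypothetical helper, not part of the scripts in this thread.
depth_probe() {
    if [[ -n $1 ]]; then
        # The recursive call runs inside a command substitution, so it
        # executes one subshell level deeper than the caller.
        echo "$BASH_SUBSHELL $(depth_probe "${1:1}")"
    fi
}

depth_probe "apple"
```

Run as a script, this should print the levels 0 through 4, one per character of the input. A loop-based version stays at level 0 throughout.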

rewtnull 10-30-2011 04:02 PM

Thank you very much for all your examples, guys! They all achieve exactly what I was looking for. I'll study your code and see how I can get some inspiration from the ideas provided here. Extra kudos to you, David the H., for the associative array example and for commenting your code properly, very helpful. :)

David the H. 11-01-2011 08:26 AM

Glad to help. It's a good idea to always clearly comment your code.

By the way, if it's possible for the input string to contain characters not defined in the assoc. array, you can tell it to output a default value with the pattern "${var:-default}".

To have it insert an asterisk, for example, for each unsupported value, simply modify the output loop thus:
Code:

for i in "${input[@]^^}"; do
    output+=( "${index[$i]:-*}" )
done
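
The default-value expansion can also be checked on its own. This is a minimal sketch with a deliberately tiny table; lookup is a hypothetical helper name.

```bash
#!/bin/bash
# Sketch: ${parameter:-word} substitutes word when the lookup is unset or empty.
declare -A index=( [A]=0 [B]=1 )

lookup() {
    # Keys without an entry fall back to a literal asterisk.
    echo "${index[$1]:-*}"
}

lookup A     # prints 0
lookup '#'   # prints *
```

Note that ":-" also triggers for entries that are set but empty, which does not matter here because every stored value is a number.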


Juako 11-01-2011 02:53 PM

@rewtnull & David the H.

I acknowledge your remarks on commenting code; after all, this is a site for learning. It won't be of much help if I post code (no matter what length) and assume everyone will read it like it's English. Sorry, I'll take it into account for future posts, and add the comments to my entry.


All times are GMT -5.