[SOLVED] Compare array index number against element in bash
Hi.
Consider this snippet of code in bash:
Code:
shopt -sq nocasematch
intern=( A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 . , ? )
input=($(echo "apple" | sed -e 's/[[:alnum:].,\?]/& /g' ))
count="0"
for element1 in ${input[@]}; do
for element2 in ${intern[@]}; do
if [[ ${element1} == ${element2} ]]; then
echo "index: ${count} element: ${intern[${count}]} input: ${input[${count}]}"
break
fi
done
(( count++ ))
done
What I would like to do is record, for each character in $input (whose contents vary), the index number of the matching element in $intern. In this example, I would like 'apple' to become '0 15 15 11 4'. The reason I want index numbers is that later on I'd like to use them to look up elements in a third array, which does not contain regular alphabetical characters.
Is this possible, and if so, how could I achieve this?
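For reference, one possible reading of the request as a minimal sketch: the counter in the snippet above follows the position in $input, whereas the wanted number is the position in $intern. Iterating over the keys of $intern with "${!intern[@]}" gives that position directly. The function name find_indexes is made up for this illustration, and the '?' element is quoted on the assumption that it should never be glob-expanded.

```bash
#!/bin/bash
# Sketch only: find_indexes is a hypothetical name, not from the thread.
intern=( A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 . , '?' )

find_indexes() {
    local word=${1^^} i j ch        # ${1^^} upper-cases the argument (bash 4+)
    local -a out=()
    for (( i = 0; i < ${#word}; i++ )); do
        ch=${word:i:1}              # current input character
        for j in "${!intern[@]}"; do    # j runs over the index numbers of intern
            if [[ $ch == "${intern[j]}" ]]; then
                out+=( "$j" )       # remember the index in intern, not in input
                break
            fi
        done
    done
    echo "${out[*]}"
}

find_indexes apple    # prints: 0 15 15 11 4
```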
Recursive version. To test it, run it and type away, or send it input via a pipe:
Code:
me@localhost:~$ echo "apple" | thescript
Code:
#!/bin/bash
intern="ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,?"
match_idxs() {
# only do something if we have a non-empty argument
[[ "$1" ]] && {
# use parameter substitution (substring expansion) to cut the first character and all the rest of our argument
# save the resulting strings in $first and $rest
local first=${1:0:1} rest=${1:1}
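# use parameter substitution (remove matching suffix pattern) to get a copy of $intern truncated from the
# point where it matches "$first followed by anything"
# ($first is quoted inside the pattern so that a character such as "?" is matched literally, not as a wildcard)
local match="${intern%%"$first"*}"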
# use parameter substitution (parameter length) to echo the length of $match (its length will equal the index of $first in $intern).
# Add the rest of the indexes to the echo, using command substitution to call ourselves over $rest.
# $FUNCNAME is set by bash to whatever the current function is named (could have used match_idxs with the same result, but this way is more flexible).
echo "${#match} $($FUNCNAME $rest)"
}
}
# Loop over our standard input, line by line until EOF. On each iteration, $line will hold current line.
# On each iteration, print indexes for every character of this line.
# We call the match_idxs function passing to it the line converted to all caps (parameter expansion - case modification)
while IFS= read -r line; do
match_idxs "${line^^}"
done
Note on how match_idxs completes and ends:
Eventually, an instance of match_idxs will obtain a $first containing the last character of $line, and $rest will be empty. It will echo the position and make a final call to itself with an empty argument.
The instance of match_idxs called with an empty argument will return without calling itself again, thus provoking the return of all previous instances of match_idxs in "chain reaction", up to the very first. At that point $line will be processed entirely and the original instance of match_idxs will return to the main "while" loop.
Last edited by Juako; 11-01-2011 at 03:42 PM.
Reason: added comments
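To make the central trick of the script above easier to follow, here is a small sketch that isolates it for a single character. The helper name index_of is an assumption made for this illustration. Note that a character missing from $intern leaves the string untouched, so the result is the full length of $intern (39), which can serve as a "not found" marker.

```bash
#!/bin/bash
intern="ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,?"

# index_of is a hypothetical helper, not part of the script above.
index_of() {
    local ch=$1 prefix
    # %% removes the longest suffix matching the pattern, i.e. everything
    # from the first occurrence of $ch onward. Quoting "$ch" keeps
    # characters like ? from acting as wildcards.
    prefix=${intern%%"$ch"*}
    # What remains is everything before $ch, so its length is the index.
    echo "${#prefix}"
}

index_of A      # prints: 0
index_of P      # prints: 15
index_of '?'    # prints: 38
index_of '#'    # prints: 39 (not in intern, nothing was removed)
```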
A slightly different solution, using an associative array.
Code:
#!/bin/bash
# populate the associative array, with each element holding the next sequential integer
declare -A index
n=0
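for i in {A..Z} {0..9} . , '?'; do # '?' is quoted so it cannot glob-match a one-character filename
index[$i]=$(( n++ )) # the subscript must be $i, not a bare i: associative array keys are strings, so a bare i would be the literal key "i"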
done
# break the input string into individual characters, and add to a regular array
for (( i = 0; i < ${#1}; i++ )) ; do
input[i]=${1:$i:1}
done
# For each element in the input array, add its corresponding associative array value to the output array.
# ${var^^} forces the values to uppercase first.
for i in "${input[@]^^}"; do
output+=( "${index[$i]}" )
done
# print the output array. ${arr[*]} allows the use of IFS to define the delimiting character.
# e.g. set IFS=$'\n' first to output one entry per line.
echo "${output[*]}"
exit 0
It's a bit longer, but I think the principle behind it is more flexible.
Last edited by David the H.; 10-30-2011 at 06:44 AM.
Reason: couple of tpyos
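The remark about IFS in the script's final comment can be tried out with a short sketch. The function name print_joined and the hard-coded sample array are assumptions for illustration only. "${output[*]}" joins the elements with the first character of IFS, and declaring IFS local keeps the change from leaking into the rest of the script.

```bash
#!/bin/bash
# Sample data: the indexes for "apple" from the question.
output=( 0 15 15 11 4 )

# print_joined is a hypothetical helper: print the array joined by $1.
print_joined() {
    local IFS=$1            # only the first character of IFS is used for joining
    echo "${output[*]}"
}

print_joined ,        # prints: 0,15,15,11,4
print_joined $'\n'    # prints one entry per line
```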
Of course you don't *have* to, unless you want to use recursion AND process stdin in the same script.
In all fairness, however compact and beautiful recursive code may be (IMO), I have to say that in Bash it is more expensive than iteration. Bash lacks tail call optimization, and each recursive call in my script goes through a command substitution, which runs in a subshell. That means overhead and a chance of stack overflows if the input is too large (i.e., you'll have to spawn an entire bash process PER CHARACTER you are examining!!).
So, I have to recommend a loop unless you know you can afford the extra cost. In that sense grail's snippet is preferable, definitely.
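As a rough illustration of the loop alternative (this is not grail's snippet, just a sketch), the same prefix-length trick can run inside a plain for loop with no command substitution per character. The name match_idxs_loop is made up here, and skipping characters that are not in $intern is an assumption about the desired behaviour.

```bash
#!/bin/bash
intern="ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.,?"

# match_idxs_loop is a hypothetical iterative counterpart to match_idxs.
match_idxs_loop() {
    local str=${1^^} i ch prefix
    local -a out=()
    for (( i = 0; i < ${#str}; i++ )); do
        ch=${str:i:1}
        prefix=${intern%%"$ch"*}    # everything before the first $ch
        # If nothing was cut off, $ch is not in intern: skip it (assumption).
        if (( ${#prefix} < ${#intern} )); then
            out+=( "${#prefix}" )
        fi
    done
    echo "${out[*]}"
}

match_idxs_loop apple    # prints: 0 15 15 11 4
```

Everything runs in the current shell, so the cost grows with the input length without forking a process per character.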
Thank you very much for all your examples, guys! They all achieve exactly what I was looking for. I'll study your code and see how I can get some inspiration from the ideas provided here. Extra kudos to you, David the H., for the associative array example and for commenting your code properly, very helpful.
Glad to help. It's a good idea to always clearly comment your code.
By the way, if it's possible for the input string to contain characters not defined in the associative array, you can tell it to output a default value with the pattern "${var:-default}".
To have it insert an asterisk, for example, for each unsupported value, simply modify the output loop thus:
Code:
for i in "${input[@]^^}"; do
output+=( "${index[$i]:-*}" )
done
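Putting the pieces together, the final step mentioned in the original question (looking the index numbers up in a third array) might be sketched as follows. The contents of the third array are unknown, so placeholder strings such as <0>, <1>, ... are used here, and the function name translate is likewise hypothetical. Unsupported characters get an asterisk, following the default-value idea above.

```bash
#!/bin/bash
# Build the character-to-index table as in the associative array script.
declare -A index
n=0
for c in {A..Z} {0..9} . , '?'; do
    index[$c]=$(( n++ ))
done

# Placeholder third array (assumption): element k is just the string "<k>".
third=()
for (( k = 0; k < n; k++ )); do
    third[k]="<$k>"
done

# translate is a hypothetical helper: map each character to its third-array entry.
translate() {
    local str=${1^^} i ch idx
    local -a out=()
    for (( i = 0; i < ${#str}; i++ )); do
        ch=${str:i:1}
        idx=${index[$ch]:-}              # empty if the character is unsupported
        if [[ -n $idx ]]; then
            out+=( "${third[idx]}" )     # look up by index number
        else
            out+=( '*' )                 # default marker for unsupported input
        fi
    done
    echo "${out[*]}"
}

translate 'ap!'    # prints: <0> <15> *
```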
I acknowledge your remarks on commenting code; after all, this is a site for learning. It won't be of much help if I post code (no matter what length) and assume everyone will read it like it's English. Sorry, I'll take it into account for future posts, and add the comments to my entry.