LinuxQuestions.org
Welcome to the most active Linux Forum on the web.
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - General
User Name
Password
Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.

Notices

Reply
 
Search this Thread
Old 11-02-2011, 07:58 AM   #1
nano2
Member
 
Registered: May 2007
Posts: 100

Rep: Reputation: 15
ordering an array based on another array


I have the following scenario where by i have array1 with elements in order of preference and array2 with unsorted arrays that contains some of array1 elements.

Code:
array1=( dog cat lion cow)
array2=(lion rat mouse  dog zebra pet cat bird cow calf sheep ........... )
I want to be able to use the contents of array2 but I need to have to the first few elements ordered.

My desire result would be
dog cat lion cow rat mouse zebra pet .. ... ....

What i have
Code:
cnt=${#array1[@]}
for ((i = 0; i < cnt ; i++ )) # array1.
do
  echo ${array1[i]}
done

Last edited by nano2; 11-02-2011 at 08:39 AM.
 
Old 11-02-2011, 09:59 AM   #2
Juako
Member
 
Registered: Mar 2010
Posts: 202

Rep: Reputation: 84
I thought at first of making a new array, but seems feasible to directly manipulate array2:

Code:
#!/bin/bash
array1=(dog cat lion cow)
array2=(lion rat mouse dog zebra pet cat bird cow calf sheep)

# swaps two elements of an array
# $1 = array, $2 = index, $3 = index
swap() {
    local ref1=$1[$2] ref2=$1[$3]
    local temp=${!ref1} 
    eval "$ref1=\"${!ref2}\" $ref2=\"$temp\""
} 

echo ${array1[@]}
echo ${array2[@]}

for (( i = 0, swap_idx = 0; i < ${#array2[@]}; i++ )); do
    [[ "${array1[@]//${array2[$i]}/}" != "${array1[@]}"  ]] && swap array2 $i $(( swap_idx++ ))
done

echo ${array2[@]}
The for loop:
The loop initializes $i and $swap_idx to 0, then iterates each member of array2 indexing it with $i, using parameter expansion (pattern substitution) to see if array1 contains the current examined element of array2 (the pattern in the left of the != removes the element of array1 by replacing it with "" if it matches, so the resulting array won't equal the original array1).

If the element is matched by the pattern, we swap two elements of array2: the current element, with the one indexed by swap_idx, which is at "the last position where an already sorted element has been inserted + 1" (it starts being zero, check the declare statement before the loops).

The swap() function:
This resorts to some trickery to do "indirect reference" to the arrays and elements based on its arguments: first we define two local variables containing the literal strings of some array[index] based on $1, $2 and $3.

Then we can use the ${!String} parameter substitution (names matching prefix), which expands to the real value of a variable named after String. We use that to save one of the elements in $temp, and then we make the swap. Why do I use eval? Well, I couldn't make ${!} work in the left side of the assignment, so I resorted to the literal values and made ALL THE SWAP OPERATION a string, then eval'ed it (had to use some escaping on the right side of the assignments).

edit: note that this will alter the order of elements in array2 that aren't in array1. If you want to keep those in order, you should change the for loops for these:
Code:
for (( i = 0, swap_idx = 0; i < ${#array2[@]}; i++ )); do
    [[ "${array1[@]//${array2[$i]}/}" != "${array1[@]}"  ]] && {
        for (( j = i; j > $swap_idx; j-- )); do
            swap array2 $j $j-1
        done
        (( swap_idx++ ))
    }
done
This is the same as before, but before incrementing swap_idx it "slides" the matching element towards swap_idx position, swapping it with the previous element until it's in the correct place. That way the order of non matching elements is preserved.

Let me know if something is confusing, english isn't my native language and sometimes I'm kludgy at it :/

Last edited by Juako; 11-02-2011 at 12:02 PM. Reason: changed loop level for param subst - substr replacement
 
Old 11-02-2011, 10:15 AM   #3
berbae
Member
 
Registered: Jul 2005
Location: France
Distribution: Arch Linux
Posts: 540

Rep: Reputation: Disabled
My solution:
Code:
#!/bin/bash
array1=(dog cat lion cow)
array2=(lion rat mouse dog zebra pet cat bird cow calf sheep)
list2="${array2[*]}"
cnt=${#array1[@]}
for ((i=0; i <cnt ; i++ )); do
  pref=${array1[i]}
  if grep -wq $pref <<< "$list2"; then
      preflist2+="$pref "
      list2="${list2/$pref }"
  fi
done
array3=($preflist2$list2)
echo array1:
echo "${array1[*]}"
echo array2:
echo "${array2[*]}"
echo array3:
echo "${array3[*]}"
I supposed there are no double entries.

Last edited by berbae; 11-03-2011 at 09:07 AM.
 
Old 11-02-2011, 11:40 AM   #4
nano2
Member
 
Registered: May 2007
Posts: 100

Original Poster
Rep: Reputation: 15
Thanks that works however i can't access the array3 elements one by

Code:
fn "${array3[@]}"
fn(){
shift
arr=( "$@" )
cnt=${#arr[@]}
echo "the count is " $cnt
for (( idx = 0 ; idx < cnt ; idx++ ))
do
  echo ${arr[idx]}
done

}
why is i can only see one element in the array3 ?
 
Old 11-02-2011, 01:21 PM   #5
Juako
Member
 
Registered: Mar 2010
Posts: 202

Rep: Reputation: 84
edit:
I've changed this version to avoid using eval. A few posts later in this thread you can find links with information on this command, many containing explanations of why it almost always pays to invest some more time and rethink code using this command.

This one rebuilds the sorted array every time there's a match (this is probably faster than the script I've posted earlier in this thread, that uses eval too).

Code:
#!/bin/bash
declare -a sortby=( $1 ) sorted=( $2 )
declare length=${#sorted[@]} i=${#sortby} matches

while (( i-- )); do
    [[ ! "${sorted[@]}" =~ ${sortby[$i]} ]] && continue
    sorted=( ${sorted[@]//$BASH_REMATCH/} )
    printf -v matches '%*s' $(( $length - ${#sorted[@]} ))
    sorted=( ${matches// /$BASH_REMATCH }${sorted[@]} )
done

echo ${sorted[@]}
me@localhost:~$ ./script "dog cat lion cow" "lion rat mouse dog zebra lion pet cat bird cow calf lion"
dog cat lion lion lion cow rat mouse zebra pet bird calf


Explanation:

1) The declare lines save the first series of words in the "sorter" array named $sortby, and the second series in $sorted, which is the array to be sorted. The variable $i is initialized to (length of $sortby - 1) (the last element -- array indices start with 0), and $length stores the length of $sorted. This is needed in the loop when $sorted is being rebuilt.

2) The loop uses $i to index $sortby from the end (as a for..in would return the ordered matches reversed). On each round, if $sorted doesn't contain the current inspected element of $sortby, just continues looping.

If there was a match, the continue won't be executed and in turn the next line in the loop executes. Note that the test '=~', in case the pattern on the right does match, stores that pattern in the shell variable $BASH_REMATCH (actually it is an array and the behaviour is a bit more flexible, but to our goal this is all we need to know).

3) The next line then, rebuilds $sorted removing all matches of the inspected element. So now we have to add back to the top of $sorted as many matches of that element as were found. This is done in two steps:

4) The printf line uses "-v matches" to print its output to the variable $matches, instead to standard output. The format string '%*s' is a string used by printf to format and make conversions on the data it will print. In this case '%' means "start reading instructions for conversions", '*' means "the first parameter will tell you the minimum width for printing the second parameter" and 's' means "the second parameter (what you have to print) is a string". The trick is to pass then ONLY the minimum-width parameter after this, and a null string-parameter (the arithmetic substitution that follows, which equals the number of matches of the inspected element, is thus the minimum width we require). Passing a null string (ie, not passing a second parameter) leaves only the spaces that printf uses to pad a string to the requested minimum width.

5) So the printf line stores in $matches a string of x spaces, where x is the number of matches of the inspected element. Now the $sorted array is rebuilt, adding to its front the replacement of each space in $matches by the inspected element, followed by $sorted itself, which already had this element removed. The result is that matched elements will successively "stack" at the front of $sorted, as they are deleted from the rest of it.

Since the "stacking at the front" necessarily puts newer matches before previous matches, we can't traverse $sortby from the first element to the last, so that's why $i is decremented from the last element of $sortby to the first.

Last edited by Juako; 11-15-2011 at 06:04 PM. Reason: refactored for evallessness
 
Old 11-03-2011, 09:04 AM   #6
berbae
Member
 
Registered: Jul 2005
Location: France
Distribution: Arch Linux
Posts: 540

Rep: Reputation: Disabled
Quote:
Originally Posted by nano2 View Post
why is i can only see one element in the array3 ?
your code for the fn function should be:
Code:
fn(){
arr=( $@ )
cnt=${#arr[@]}
echo "the count is " $cnt
for (( idx = 0 ; idx < cnt ; idx++ ))
do
  echo ${arr[idx]}
done
}
fn "${array3[@]}"
with

arr=( "$@" )

there will be only one element in the array arr.

with

arr=( $@ )

each word in $@ is a separate element.

I made the same error in my solution above, so I changed the line:

array3=("$preflist2$list2")

to

array3=($preflist2$list2)

to have separate words.

Last edited by berbae; 11-03-2011 at 09:13 AM.
 
Old 11-13-2011, 06:31 AM   #7
nano2
Member
 
Registered: May 2007
Posts: 100

Original Poster
Rep: Reputation: 15
Hi ,

Using the below solution
Code:
#!/bin/bash
fn2 ()
{
shift

array2=( $@ )
cnt=${#arr[@]}
echo "the count is " $cnt
for (( idx = 0 ; idx < cnt ; idx++ ))
do
  echo ${array2[idx]}
done

array1=(dog cat lion cow)
#######array2=(lion rat mouse dog zebra pet cat bird cow calf sheep)
list2="${array2[*]}"
cnt=${#array1[@]}
for ((i=0; i <cnt ; i++ )); do
  pref=${array1[i]}
  if grep -wq $pref <<< "$list2"; then
      preflist2+="$pref "
      list2="${list2/$pref }"
  fi
done
array3=($preflist2$list2)
echo array1:
echo "${array1[*]}"
echo array2:
echo "${array2[*]}"
echo array3:
echo "${array3[*]}"

}

fn2 "${array2[@]}"
array2=(lion rat mouse dog zebra pet cat bird cow calf sheep)

I am trying to pass array2 into the above fn2 but i am only getting one element in the array ...

anyone know whats going on ?

Thanks
 
Old 11-13-2011, 08:03 AM   #8
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946
Quote:
Originally Posted by berbae View Post
your code for the fn function should be:

with

arr=( "$@" )

there will be only one element in the array arr.

with

arr=( $@ )

each word in $@ is a separate element.
This is not true at all. $@ and ${array[@]} are designed so that when quoted, each element is treated as a separate quoted string.

Quoting them is generally a good idea, to avoid any word-splitting on spaces inside each element.

$* and ${array[*]} are the patterns that output the whole list as a single string, and will have the effect you mention if quoted.

http://mywiki.wooledge.org/BashGuide...Using_Arrays-1
 
Old 11-13-2011, 08:39 AM   #9
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946
From reading the OP (but not looking deeply at the rest), I believe this does what you're asking for:

Code:
#!/bin/bash

listitems() {

	#declares (local to the function) an associative array, and two regular output arrays
	local -A defarray
	local -a prearray mainarray

	#set each default element with the number it should be ordered in
	defarray=( [dog]=0 [cat]=1 [lion]=2 [cow]=3 )

	#loop through the function input
	#if an entry matches a default entry, then set into a prefix array
	#with that element number.  Else, add it to the main array

	for entry in "$@"; do

		if [[ "${defarray[$entry]}" ]]; then
			prefarray[${defarray[$entry]}]="$entry"
		else
			mainarray+=( "$entry" )
		fi

	done

	#echo the prefix array, followed by the main array.
	echo "List: ${prefarray[@]} ${mainarray[@]}"

}

# here is the input list, in yet another array
array2=( lion rat mouse dog zebra pet cat bird cow calf sheep )

# call the function with the input list
listitems "${array2[@]}"
It stores the default list in an associative array, making it easier to match and order individual elements.

***
Edit: By the way, one drawback of the above is that if there are multiple input entries that match a default, it will be output only once. But non-default entries will print multiple times. e.g.

input: sheep cow dog zebra dog zebra
output: dog cow sheep zebra zebra

If this is a problem, tell us what the true output needs to be and we'll try to work that out.
***

And nano2, if I can suggest a tip, try for a more effective use of formatting and whitespace in your script. Indent your loops and tests, line up sections of the same depth, and separate blocks of code with empty lines. A few comments would be useful too. It helps make things readable and debuggable.

http://mywiki.wooledge.org/BashGuide...es#Readability

Last edited by David the H.; 11-13-2011 at 09:00 AM. Reason: stated
 
Old 11-13-2011, 04:13 PM   #10
Juako
Member
 
Registered: Mar 2010
Posts: 202

Rep: Reputation: 84
@David the H. :

I believe the one I had posted here does the trick. Have you checked it?
 
Old 11-14-2011, 11:51 AM   #11
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946
As I said, I hadn't taken the time to closely inspect the previous posts. But yeah, looking at it now, it is indeed a pretty ingenious solution. I probably wouldn't have thought about removing elements myself. I guess I'm just too linear a thinker.

I'm not at all happy with the idea of using eval though. How about this instead?

Code:
#!/bin/bash

array1=( dog cat lion cow )
array2=( "$@" )
length=${#array2[@]}

for (( i = ${#array1[@]} ; i >= 0; i-- )); do

     array2=( ${array2[@]//${array1[$i]}/} )
     (( matches = length - ${#array2[@]} ))

     for (( n=1 ; n <= matches ; n++ )); do
          array3+=( "${array1[i]}" )
     done

     array2=( "${array3[@]}" "${array2[@]}" )
     unset array3

done

echo "${array2[@]}"

exit 0
I've also modified my previous code to give the same output:

Code:
#!/bin/bash

listitems() {

     local -A defarray sortarray
     local -a prefarray mainarray

     defarray=( [dog]=100 [cat]=200 [lion]=300 [cow]=400 [sheep]=500 )

     for entry in "$@"; do

          if [[ "${defarray[$entry]}" ]]; then

               (( defarray[$entry]++ ))
               prefarray[${defarray[$entry]}]="$entry"

          else
               if [[ "${mainarray[*]}" =~ "$entry" ]]; then
                    (( sortarray[$entry]++ ))
               else
                    (( sortarray[$entry] = n+=100 ))
               fi
               mainarray[${sortarray[$entry]}]="$entry"
          fi

     done

     echo "${prefarray[@]} ${mainarray[@]}"

}

array2=( "$@" )

listitems "${array2[@]}"

exit 0
Each default entry is now separated by a big chunk of digits, to allow plenty of room to increment the index numbers for duplicates. And the mainarray now has its own assoc. array for tracking duplicates too, working in a similar manner.
 
1 members found this post helpful.
Old 11-14-2011, 05:09 PM   #12
Juako
Member
 
Registered: Mar 2010
Posts: 202

Rep: Reputation: 84
Thumbs up

Quote:
Originally Posted by David the H. View Post
I'm not at all happy with the idea of using eval though. How about this instead?
I should perhaps go to eval-rehab , btw: my first version uses eval to implement a swap() array elements function, which allows for more fun like in-place sorting and "sliding" array elements. But yeah, even though I hate to admit, eval is hard to trust in or know in advance about its outcomes. I have a feeling though, that my take 2 can get the effect of the eval with some substitution, which should keep avoiding a third array. I'll play a bit to find out.
 
Old 11-15-2011, 02:31 AM   #13
nano2
Member
 
Registered: May 2007
Posts: 100

Original Poster
Rep: Reputation: 15
Post

David H.

I am running bash -versionGNU bash, version 3.2.25 doesn't support declaration arrays !

Code:
     
local -A defarray sortarray

local -a prefarray mainarray
 
Old 11-15-2011, 11:22 AM   #14
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946Reputation: 1946
Sorry. Associative arrays were added in version 4. So it's the -A option throwing the error, and my solution won't work for you.

http://wiki.bash-hackers.org/scripting/bashchanges

4.0 was released almost 3 years ago. Now it's at 4.2 and even more new goodies have been added. Perhaps it's time to consider upgrading.

Juako's code should run ok on version 3. With my alterations of course.


@Juako: I look forward to seeing what you can come up with. You always post interesting solutions.

Actually, I'm not really quite as fanatical as I seem about eval. I just dislike seeing such things used unnecessarily, particularly if there are security issues. In the right hands and the right circumstances they're ok. But tools like this that have risks attached should not be treated lightly; rather you should try to exhaust your other options first.

Finally, It dawned on me that my script could be simplified quite a bit. The default list can act as the tracking list for everything, so we really only need two arrays (one still needs to be associative, though).

Code:
#!/bin/bash

listitems() {

	local -A sortarray
	local -a mainarray

	sortarray=( [dog]=100 [cat]=200 [lion]=300 [cow]=400 )
	n=400

	for entry in "$@"; do

  		if [[ "${sortarray[$entry]}" ]]; then
			(( sortarray[$entry]++ ))
		else
			(( sortarray[$entry] = n+=100 ))
		fi

		mainarray[${sortarray[$entry]}]="$entry"

	done

	echo "${mainarray[@]}"

}

array2=( "$@" )

listitems "${array2[@]}"

exit 0
 
Old 11-15-2011, 05:58 PM   #15
Juako
Member
 
Registered: Mar 2010
Posts: 202

Rep: Reputation: 84
Quote:
Originally Posted by David the H. View Post
Juako's code should run ok on version 3. With my alterations of course.

@Juako: I look forward to seeing what you can come up with. You always post interesting solutions.
An honour

Allright. I've updated the post to avoid eval, hope I retained the functionality, specifically I didn't test it with arrays that contain elements with spaces, but in case that's required, a new revision can be made.

I'd like to add a few links on eval, FWIW:

http://en.wikipedia.org/wiki/Eval
http://mywiki.wooledge.org/BashFAQ/048
http://wiki.bash-hackers.org/rcwatson
http://wiki.bash-hackers.org/scripting/style#eval
Chapter on various forms of subshelling in "Beginning Portable Shell" (seems a nice book, haven't heard of it).

(I promess to read them, hehehe)
 
  


Reply

Tags
bash scripting


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
[SOLVED] Bash: Calculating in array and storing result in array ghantauke Linux - Newbie 6 12-02-2010 12:28 PM
Server array (mysql, php) -> gboolean array in c client kalleanka Programming 1 07-27-2010 06:50 AM
[SOLVED] shell script help: copying directory list into an array and then accessing the array richman1234 Linux - Newbie 6 07-25-2010 11:19 PM
[bash] indirect array reference to array with values containing spaces Meson Linux - Software 9 06-04-2010 09:38 PM
bash: use file as input into array, parse out other variables from array using awk beeblequix Linux - General 2 11-20-2009 10:07 AM


All times are GMT -5. The time now is 12:42 PM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration