[SOLVED] How to convert the rows of data into the columns?

w1k0 · 09-17-2012, 01:25 AM

In the shell script I have some data stored in a few variables:

VAR1="ABCD"
VAR2="EFGH"
VAR3="IJKL"

These variables displayed with the command:

echo -e "$VAR1\n$VAR2\n$VAR3"

give the result:

Code:

ABCD
EFGH
IJKL

I’d like to display the above strings as follows:

Code:

AEI
BFJ
CGK
DHL

So I’d like to change the rows of the characters into the columns.

It’s relatively easy to convert such a file:

Code:

A B C D
E F G H
I J K L

into the file:

Code:

AEI
BFJ
CGK
DHL

using the following script:

Code:

    awk '{
          if (max_nf < NF)
               max_nf = NF
          max_nr = NR
          for (x = 1; x <= NF; x++)
               vector[x, NR] = $x
     }
     END {
          for (x = 1; x <= max_nf; x++) {
               for (y = 1; y <= max_nr; y++)
                    printf("%s", vector[x, y])
               printf("\n")
          }
     }'

Unfortunately the above solution assumes that the data is stored in a file.

I don’t want to use a temporary file, because the data stored in these variables changes frequently, and continuously writing and reading a temporary file increases CPU usage.

I believe it’s easy to achieve that result with awk. Unfortunately I don’t know it well enough to implement such a function in the shell script. Any assistance will be welcome.

H_TeXMeX_H · 09-17-2012, 02:21 AM

Well, why not pipe the data, as you display it, to the awk command?

Code:

bash-4.1$  echo -e "$VAR1\n$VAR2\n$VAR3" | ./script
AEI
BFJ
CGK
DHL
bash-4.1$ cat script
#!/bin/sh
awk '
	BEGIN {
		FS=""
	}
	{
          if (max_nf < NF)
               max_nf = NF
          max_nr = NR
          for (x = 1; x <= NF; x++)
               vector[x, NR] = $x
     }
     END {
          for (x = 1; x <= max_nf; x++) {
               for (y = 1; y <= max_nr; y++)
                    printf("%s", vector[x, y])
               printf("\n")
          }
     }'
exit 0

Note that I set FS="" so that each character is taken as a field.
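Splitting on an empty FS is a common extension (gawk and mawk treat each character as a field), but POSIX does not specify it. A minimal sketch of a variant that avoids it by walking each line with substr(), fed with printf '%s\n' instead of echo -e. The function name transpose is hypothetical, not from the thread:

```shell
#!/bin/sh
# transpose: hypothetical helper (not from the thread).
# Reads lines on stdin and prints column i of the input as output line i.
transpose() {
    awk '{
        n = length($0)
        if (n > max) max = n
        # Append the i-th character of this line to the i-th output row.
        for (i = 1; i <= n; i++)
            col[i] = col[i] substr($0, i, 1)
    }
    END {
        for (i = 1; i <= max; i++)
            print col[i]
    }'
}

VAR1="ABCD"
VAR2="EFGH"
VAR3="IJKL"

# printf prints one argument per line without relying on echo -e.
printf '%s\n' "$VAR1" "$VAR2" "$VAR3" | transpose
```

With the three variables above this prints AEI, BFJ, CGK and DHL on separate lines, and no temporary file is involved.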

David the H. · 09-17-2012, 06:32 AM

How about this?

Code:

for (( i=0 ; i<${#VAR1} ; i++ )); do

	printf '%s%s%s\n' "${VAR1:i:1}" "${VAR2:i:1}" "${VAR3:i:1}"

done

danielbmartin · 09-17-2012, 07:01 AM

Quote:

Originally Posted by w1k0

Have:

Code:

VAR1="ABCD"
VAR2="EFGH"
VAR3="IJKL"

Want:

Code:

AEI
BFJ
CGK
DHL

Consider this:

Code:

echo -e "$VAR1\n$VAR2\n$VAR3"  \
|awk -F "" '{for (i=1; i<=NF; i++) a[i]=a[i]$i}
  END {i=1; while (i in a) {print a[i]; i++;}}'

Daniel B. Martin

w1k0 · 09-17-2012, 02:50 PM

@H_TeXMeX_H,

Thank you for that tip. As I mentioned, I don’t know awk. I found the above script on the Internet. After the change you made, it’s possible to pass the variables through the script without using a temporary file. It’s a great improvement.

David the H. and danielbmartin,

I like both of your solutions very much. The method by David the H. using the printf command requires every variable to be listed explicitly in the command. That could be tiresome when there are a lot of such variables. Fortunately I need to convert just six variables. The method by danielbmartin using awk is more universal and very elegant in comparison to the script that I found on the Internet.

The following script:

Code:

#!/bin/bash

VAR1="0111"
VAR2="1100"
VAR3="1010"

rotate1() {
    for (( i=0 ; i<${#VAR1} ; i++ )); do
        printf '%s%s%s\n' "${VAR1:i:1}" "${VAR2:i:1}" "${VAR3:i:1}"
    done
}

rotate2() {
    awk -F "" '{for (i=1; i<=NF; i++) a[i]=a[i]$i}
        END {i=1; while (i in a) {print a[i]; i++;}}'
}

echo
echo -e "$VAR1\n$VAR2\n$VAR3"

echo
echo -e "$VAR1\n$VAR2\n$VAR3" | rotate1

echo
echo -e "$VAR1\n$VAR2\n$VAR3" | rotate2

produces the expected results:

Code:

0111
1100
1010

011
110
101
100

011
110
101
100

I decided to compare the two methods, so I prepared two scripts: the first uses the rotate1 function (printf) and the other uses the rotate2 function (awk). Both scripts perform 25,000 iterations and send the output to /dev/null, so that only the performance of the functions themselves is measured.

The script using the rotate1 function (printf) ran for a significantly shorter time:

Code:

real    1m5.928s
user    0m38.734s
sys     0m42.828s

and used from 30.4% to 47.7% of CPU power (the median was 45.5%).

The script using the rotate2 function (awk) ran significantly longer:

Code:

real    3m27.683s
user    2m14.988s
sys     1m11.728s

and used from 36.7% to 43.4% of CPU power (the median was 41.0%).

For my purposes the above differences aren’t crucial, because my final script will run once a second.

***

Now I have the next question.

The output of the script is:

Code:

011
110
101
100

but I need to present the data in a different way. Instead of “0” I need “A” and instead of “1” I need “B”.

Now I pass the string of the variables through the rotate function and then pass the result through sed:

echo -e "$VAR1\n$VAR2\n$VAR3" | rotate_1_or_2 | sed -E 's/0/A/g;s/1/B/g'

I believe it’s possible to add to rotate1 or rotate2 functions the procedure that performs such a substitution. I’ll be grateful for some advice.
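One possible way to fold the substitution into a rotate1-style function is bash's pattern substitution, ${var//pattern/replacement}. The sketch below is an assumption rather than something posted in the thread: the name rotate_ab is made up, and the function takes the strings as arguments instead of reading VAR1, VAR2 and VAR3 directly.

```shell
#!/bin/bash
# rotate_ab: hypothetical name (not from the thread).
# Rotates its arguments (rows become columns) and maps 0 -> A, 1 -> B.
rotate_ab() {
    local i j max=0 out=
    # Find the longest argument so that no column is cut short.
    for j in "$@"; do
        (( ${#j} > max )) && max=${#j}
    done
    # Build the rotated text in a variable instead of printing it piecemeal.
    for (( i = 0; i < max; i++ )); do
        for j in "$@"; do
            out+=${j:i:1}
        done
        out+=$'\n'
    done
    # Replace every 0 with A, then every 1 with B, inside the shell.
    out=${out//0/A}
    out=${out//1/B}
    printf '%s' "$out"
}

VAR1="0111"
VAR2="1100"
VAR3="1010"
rotate_ab "$VAR1" "$VAR2" "$VAR3"
```

For the three values above this prints ABB, BBA, BAB and BAA on separate lines without starting sed.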

danielbmartin · 09-17-2012, 04:19 PM

Quote:

Originally Posted by w1k0

Now I have the next question.

The output of the script is:

Code:

011
110
101
100

but I need to present the data in a different way. Instead of “0” I need “A” and instead of “1” I need “B”.

Now I pass the string of the variables through the rotate function and then pass the result through sed:

echo -e "$VAR1\n$VAR2\n$VAR3" | rotate_1_or_2 | sed -E 's/0/A/g;s/1/B/g'

I believe it’s possible to add to rotate1 or rotate2 functions the procedure that performs such a substitution. I’ll be grateful for some advice.

Words are good; words with examples are better. Please post "Have" and "Want" examples, similar to what I put in a previous post.

Daniel B. Martin

w1k0 · 09-17-2012, 05:00 PM

Here’s my script (binary clock):

Code:

#!/bin/bash

H=`date +"%H"`
M=`date +"%M"`
S=`date +"%S"`

Hl=`echo $H | sed -E 's/(.)./\1/'`
Hr=`echo $H | sed -E 's/.(.)/\1/'`
Ml=`echo $M | sed -E 's/(.)./\1/'`
Mr=`echo $M | sed -E 's/.(.)/\1/'`
Sl=`echo $S | sed -E 's/(.)./\1/'`
Sr=`echo $S | sed -E 's/.(.)/\1/'`

dectobin() {
    echo $1 | perl -e "printf(\"%04b\n\", <STDIN>)"
}

rotate1() {
    for (( i=0 ; i<${#Hl} ; i++ )); do
        printf '%s%s%s%s%s%s\n' "${Hl:i:1}" "${Hr:i:1}" "${Ml:i:1}" "${Mr:i:1}" "${Sl:i:1}" "${Sr:i:1}"
    done
}

rotate2() {
    awk -F "" '{for (i=1; i<=NF; i++) a[i]=a[i]$i}
        END {i=1; while (i in a) {print a[i]; i++;}}'
}

Hl=`dectobin $Hl`
Hr=`dectobin $Hr`
Ml=`dectobin $Ml`
Mr=`dectobin $Mr`
Sl=`dectobin $Sl`
Sr=`dectobin $Sr`

echo "HH MM SS"

echo -e "$Hl\n$Hr\n$Ml\n$Mr\n$Sl\n$Sr" | rotate1 | sed -E 's/0/ /g;s/1/*/g;s/(..)(..)(..)/\1|\2|\3|/'

The above script gets the hours (H), minutes (M), and seconds (S), splits them into left and right digits (Hl, Hr, Ml, Mr, Sl, and Sr), converts these decimal digits into binary ones using the dectobin function, and rotates the rows into columns using the rotate1 or rotate2 function. Finally it filters the output through sed.
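The digit splitting and the decimal-to-binary step can also be done inside bash, which avoids starting sed and perl for every digit. This is only a sketch under assumptions: the name dectobin4 and its calling convention (variable name first, digit second) are hypothetical, and a single date call is used so that H, M and S come from the same instant.

```shell
#!/bin/bash
# dectobin4: hypothetical helper (not from the thread).
# Usage: dectobin4 VARNAME DIGIT
# Stores DIGIT (0-9) as a four-character binary string in VARNAME.
dectobin4() {
    local _n=$2 _i _bits=
    # Test bits 3..0, most significant first.
    for (( _i = 3; _i >= 0; _i-- )); do
        _bits+=$(( (_n >> _i) & 1 ))
    done
    printf -v "$1" '%s' "$_bits"
}

# One date call instead of three.
read -r H M S <<< "$(date '+%H %M %S')"

# ${H:0:1} is the left digit and ${H:1:1} the right digit.
dectobin4 Hl "${H:0:1}"
dectobin4 Hr "${H:1:1}"
dectobin4 Ml "${M:0:1}"
dectobin4 Mr "${M:1:1}"
dectobin4 Sl "${S:0:1}"
dectobin4 Sr "${S:1:1}"

printf '%s\n' "$Hl" "$Hr" "$Ml" "$Mr" "$Sl" "$Sr"
```

Because each value passed in is a single digit, bash's octal reading of numbers with a leading zero (such as 08) does not come into play.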

The data after rotating the rows into the columns but before filtering through sed looks like:

Code:

000101
000000
111000
011111

The same data filtered through sed and with the label looks like:

Code:

HH MM SS
  | *| *|
  |  |  |
**|* |  |
 *|**|**|

It’s easy to add the vertical lines to the output modifying printf command from:

Code:

printf '%s%s%s%s%s%s\n'

to:

Code:

printf '%s%s|%s%s|%s%s|\n'

The harder part is to convert all “0s” to spaces and all “1s” to asterisks. As you can see now I filter the output through the following command:

Code:

sed -E 's/0/ /g;s/1/*/g;s/(..)(..)(..)/\1|\2|\3|/'

But I wonder if it’s possible to perform the above substitutions using printf from rotate1 function or awk from rotate2 function in order to avoid the additional filtering through sed.
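For the awk route, one possible approach is to add the bar while the rows are collected and to run gsub() on each finished row in the END block. This is a sketch under assumptions, not a posted solution: rotate2_clock is a made-up name, and substr() is used in place of -F "" so that it does not depend on the empty-FS extension.

```shell
#!/bin/sh
# rotate2_clock: hypothetical name (not from the thread).
# Reads six 4-bit rows on stdin and prints the clock face.
rotate2_clock() {
    awk '{
        n = length($0)
        for (i = 1; i <= n; i++) {
            a[i] = a[i] substr($0, i, 1)
            # Every second input row closes a two-digit group.
            if (NR % 2 == 0) a[i] = a[i] "|"
        }
    }
    END {
        for (i = 1; (i in a); i++) {
            gsub(/0/, " ", a[i])
            gsub(/1/, "*", a[i])
            print a[i]
        }
    }'
}

# The six rows for 23:39:19 (Hl Hr Ml Mr Sl Sr).
printf '%s\n' 0010 0011 0011 1001 0001 1001 | rotate2_clock
```

For this input the function prints the same four lines as the sed-filtered output shown above, so the extra sed process is not needed.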

PTrenholme · 09-17-2012, 11:22 PM

If you pass some arguments to your rotate function, it might handle more general cases.

Here's a slight re-write of the code you posted:

Code:

#!/bin/bash

H=$(date +%H)
M=$(date +%M)
S=$(date +%S)

Hl=$(echo $H | sed -E 's/(.)./\1/')
Hr=$(echo $H | sed -E 's/.(.)/\1/')
Ml=$(echo $M | sed -E 's/(.)./\1/')
Mr=$(echo $M | sed -E 's/.(.)/\1/')
Sl=$(echo $S | sed -E 's/(.)./\1/')
Sr=$(echo $S | sed -E 's/.(.)/\1/')

dectobin() {
  printf -v $1 "%s" $(echo "${!1}" | perl -e "printf(\"%04b\n\", <STDIN>)")
}

rotate1()
{
  local i j max
  max=0
  for i in $*
  do
    [ ${max} -lt ${#i} ] && max=${#i}
  done
  for (( i=0 ; i<${max} ; i++ ))
  do
    for j in $*
    do
      printf '%s' "${j:${i}:1}"
    done
    printf '\n'
  done
}
      
rotate2() {
  awk -F "" '{for (i=1; i<=NF; i++) a[i]=a[i]$i}END {i=1; while (i in a) {print a[i]; i++;}}'
}

dectobin Hl
dectobin Hr
dectobin Ml
dectobin Mr
dectobin Sl
dectobin Sr

echo "HH MM SS"

rotate1 $Hl $Hr $Ml $Mr $Sl $Sr | sed -E 's/0/ /g;s/1/*/g;s/(..)(..)(..)/\1|\2|\3|/'

which produces this:

Code:

$ sh rotate.sh 
HH MM SS
  |  |  |
  | *|  |
* | *| *|
 *|**|  |

H_TeXMeX_H · 09-18-2012, 02:01 AM

Interesting script.

w1k0 · 09-18-2012, 12:03 PM

PTrenholme,

I compared your rotate1 function to the rotate1 function suggested by David the H.

Benchmark for David the H. function:

Code:

#!/bin/bash

VAR1="0111"
VAR2="1100"
VAR3="1010"

rotate1() {
    for (( i=0 ; i<${#VAR1} ; i++ )); do
        printf '%s%s%s\n' "${VAR1:i:1}" "${VAR2:i:1}" "${VAR3:i:1}"
    done
}

time for n in `seq 1 25000`
do
    echo -e "$VAR1\n$VAR2\n$VAR3" | rotate1 > /dev/null
done

Benchmark for PTrenholme function:

Code:

#!/bin/bash

VAR1="0111"
VAR2="1100"
VAR3="1010"

rotate1()
{
  local i j max
  max=0
  for i in $*
  do
    [ ${max} -lt ${#i} ] && max=${#i}
  done
  for (( i=0 ; i<${max} ; i++ ))
  do
    for j in $*
    do
      printf '%s' "${j:${i}:1}"
    done
    printf '\n'
  done
}

time for n in `seq 1 25000`
do
    rotate1 $VAR1 $VAR2 $VAR3 > /dev/null
done

I ran the earlier benchmarks (post #5) after closing all applications and the Internet connection, because I wanted to compare CPU usage as well as execution time. I ran the present benchmarks on a system with various applications running and the Internet connection in use, measuring execution time only. That explains the difference between the earlier and the present results for David the H.’s function: previously, on the idle system, the real time was 1m5.928s; now, on the busy system, it is 1m55.536s.

The result for David the H. function:

Code:

real    1m55.536s
user    0m38.895s
sys     0m46.289s

The result for PTrenholme function:

Code:

real    0m28.595s
user    0m21.478s
sys     0m0.655s

So your rotate1 function runs about four times faster than David the H.’s function. It’s an incredible improvement.

Of course your function is also more general, so one doesn’t have to hard-code all the variables in it.

I’m really impressed by your work.
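The two benchmarks differ in more than the function body: David the H.'s rotate1 is called at the end of a pipeline, which makes bash fork on every iteration, while PTrenholme's version is called directly. That rotate1 never reads its standard input, so the pipe adds work without changing the result. The much larger sys time in the first run is consistent with this, though the sketch below is only a way to check it, with an assumed iteration count N:

```shell
#!/bin/bash
VAR1="0111"
VAR2="1100"
VAR3="1010"

# David the H.'s function as benchmarked above (i made local here).
rotate1() {
    local i
    for (( i = 0; i < ${#VAR1}; i++ )); do
        printf '%s%s%s\n' "${VAR1:i:1}" "${VAR2:i:1}" "${VAR3:i:1}"
    done
}

# N is an assumption for a quick run; the thread used 25000.
N=${N:-1000}

echo "with pipeline:" >&2
time for (( n = 0; n < N; n++ )); do
    echo -e "$VAR1\n$VAR2\n$VAR3" | rotate1 > /dev/null
done

echo "without pipeline:" >&2
time for (( n = 0; n < N; n++ )); do
    rotate1 > /dev/null
done
```

If the second loop comes out far faster, much of the measured gap belongs to the pipeline rather than to the printf calls inside the function.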

H_TeXMeX_H,

Quote:

Originally Posted by H_TeXMeX_H

Interesting script.

I’m not sure whether that’s a serious opinion or an ironic one.

I put together in my script shell commands, sed, Perl, and AWK. My script is a melting pot of different programming techniques. I’m aware it’s possible to get the same result in a more elegant form using Perl or AWK only. Unfortunately I don’t know either of them well enough to code that.

So if it was irony it was justified.

***

My question from post #7 is still open: is it possible to avoid the substitution performed with sed:

Code:

sed -E 's/0/ /g;s/1/*/g;s/(..)(..)(..)/\1|\2|\3|/'

by implementing such a substitution into rotate1 function in order to change the rotated data such as:

Code:

000100
001001
110001
011111

into:

Code:

  | *|  |
  |* | *|
**|  | *|
 *|**|**|

(Both above outputs show the time 23:59:17.)
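A sketch of how a rotate1-style function could do the substitution itself, assuming the argument-passing form from PTrenholme's post. The name rotate_clock is hypothetical, and the case statement and bar counter are illustrative choices rather than anything posted in the thread:

```shell
#!/bin/bash
# rotate_clock: hypothetical name (not from the thread).
# Rotates its arguments, maps 0 -> space and 1 -> *, and appends "|"
# after every second column (one two-digit group: HH, MM or SS).
rotate_clock() {
    local i j n ch line max=0
    for j in "$@"; do
        (( ${#j} > max )) && max=${#j}
    done
    for (( i = 0; i < max; i++ )); do
        line=
        n=0
        for j in "$@"; do
            ch=${j:i:1}
            case $ch in
                0) ch=' ' ;;
                1) ch='*' ;;
            esac
            line+=$ch
            # Close the group after every second digit.
            (( ++n % 2 == 0 )) && line+='|'
        done
        printf '%s\n' "$line"
    done
}

# The six binary digits of 23:59:17 (Hl Hr Ml Mr Sl Sr).
rotate_clock 0010 0011 0101 1001 0001 0111
```

For these arguments the function prints the four-line clock face shown above for 23:59:17, with no sed in the pipeline.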

w1k0 · 09-18-2012, 12:20 PM

@H_TeXMeX_H,

I just realized you found my post including the binary clock script helpful. So your statement wasn't ironic.

H_TeXMeX_H · 09-19-2012, 03:10 AM

Well, I mean it is interesting as a binary clock written in bash, and PTrenholme's Frankenstein script is written in many different languages and yet performs well. I meant no irony.

w1k0 · 09-19-2012, 09:46 AM

I prepared the binary clock, among other scripts, for the next release of wminfo (see: http://dockapps.windowmaker.org, or http://freecode.com, or http://slackbuilds.org). The wminfo program is a dockable application for Window Maker that displays different information using plugins. The binary clock is such a plugin. The performance of the plugins becomes crucial when you run a dozen or so instances of wminfo using different plugins. One poorly designed plugin isn’t a problem. Twelve poorly designed plugins can consume a lot of system resources. So I spend a lot of time testing and optimizing the plugins or writing alternative versions of the most useful and the most demanding plugins.

grail · 09-19-2012, 10:50 AM

Not sure if it had to be done in bash ... but thought it was a nice little challenge in ruby

Code:

#!/usr/bin/env ruby

cnt = 0
array = Array.new(6) { Array.new(4) }
t = Time.now.strftime("%H%M%S")

puts "HH MM SS"

t.each_char do |c|
  0.upto(3) do |n|
    array[cnt][n] = c.to_i[n]
  end

  array[cnt].reverse!
  cnt += 1
end

transpose_array = array.transpose

transpose_array.each { |x| x.map! { |y| y == 0 ? ' ' : '*' } }

transpose_array.each do |x|
  6.step(2, -2) { |n| x.insert(n, '|') }
  print "#{x.join}\n"
end

This actually covers the whole process.

ntubski · 09-19-2012, 11:44 AM

GNU awk (uses the nonstandard bitwise and() function); avoids both the substitution and the rotation:

Code:

#!/bin/sh
echo "HH MM SS"
date '+%H|%M|%S' | gawk -vFS= '
{for (bit = 8; bit >= 1; bit = bit/2) {
    for (i=1; i <= NF; i++)
        printf($i=="|"?$i: and($i,bit)?"*":" ");
    print("")}}'