[SOLVED] How to convert the rows of data into the columns?
Distribution: Slackware (personalized Window Maker), Mint (customized MATE)
Posts: 1,309
How to convert the rows of data into the columns?
In a shell script I have some data stored in a few variables:
VAR1="ABCD"
VAR2="EFGH"
VAR3="IJKL"
These variables displayed with the command:
echo -e "$VAR1\n$VAR2\n$VAR3"
give the result:
Code:
ABCD
EFGH
IJKL
I’d like to display the above strings as follows:
Code:
AEI
BFJ
CGK
DHL
In other words, I’d like to turn the rows of characters into columns.
It’s relatively easy to convert a file like this:
Code:
A B C D
E F G H
I J K L
into the file:
Code:
AEI
BFJ
CGK
DHL
using the following script:
Code:
awk '{
    if (max_nf < NF)
        max_nf = NF
    max_nr = NR
    for (x = 1; x <= NF; x++)
        vector[x, NR] = $x
}
END {
    for (x = 1; x <= max_nf; x++) {
        for (y = 1; y <= max_nr; y++)
            printf("%s", vector[x, y])
        printf("\n")
    }
}'
Unfortunately, the above solution assumes that the data is stored in a file.
I don’t want to use a temporary file, because the data stored in these variables changes frequently and continuously writing and reading a temporary file increases the CPU usage.
I believe it’s easy to achieve that result with awk. Unfortunately, I don’t know awk well enough to implement such a function in a shell script. Any assistance will be welcome.
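A minimal sketch of one way to do this without a temporary file, assuming bash and any POSIX awk: printf writes one variable per line into a pipe, and awk appends the i-th character of every line to the i-th output row. The function name transpose is made up for this illustration. It uses substr() rather than an empty field separator, because splitting on "" is an extension that not every awk supports.

```shell
#!/bin/bash
VAR1="ABCD"
VAR2="EFGH"
VAR3="IJKL"

# transpose (hypothetical name): read lines on stdin, print columns as rows.
transpose() {
    awk '{
        # remember the longest line so every column gets printed
        if (length($0) > max) max = length($0)
        # append character i of this line to output row i
        for (i = 1; i <= length($0); i++)
            col[i] = col[i] substr($0, i, 1)
    }
    END {
        for (i = 1; i <= max; i++)
            print col[i]
    }'
}

# One variable per line goes straight into the pipe; no file is created.
printf '%s\n' "$VAR1" "$VAR2" "$VAR3" | transpose
```

With the sample values above this should print AEI, BFJ, CGK and DHL on separate lines.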
Original Poster
@H_TeXMeX_H,
Thank you for that tip. As I mentioned, I don’t know awk; I found the above script on the Internet. After the change you made, it’s possible to pass the variables through the script without using a temporary file. It’s a great improvement.
David the H. and danielbmartin,
I like both of your solutions very much. The method by David the H. using the printf command requires every variable to be listed in the command itself. That could be tiresome when there are a lot of variables. Fortunately, I need to convert just six. The method by danielbmartin using awk is more general and very elegant compared to the script that I found on the Internet.
The following script:
Code:
#!/bin/bash
VAR1="0111"
VAR2="1100"
VAR3="1010"
rotate1() {
    for (( i=0 ; i<${#VAR1} ; i++ )); do
        printf '%s%s%s\n' "${VAR1:i:1}" "${VAR2:i:1}" "${VAR3:i:1}"
    done
}
rotate2() {
    awk -F "" '{for (i=1; i<=NF; i++) a[i]=a[i]$i}
        END {i=1; while (i in a) {print a[i]; i++;}}'
}
echo
echo -e "$VAR1\n$VAR2\n$VAR3"
echo
echo -e "$VAR1\n$VAR2\n$VAR3" | rotate1
echo
echo -e "$VAR1\n$VAR2\n$VAR3" | rotate2
produces the expected results:
Code:
0111
1100
1010

011
110
101
100

011
110
101
100
I decided to compare the two methods, so I prepared two scripts: the first uses the rotate1 function (printf) and the other uses the rotate2 function (awk). Each script performs 25,000 iterations and sends its output to /dev/null, so that only the performance of the functions themselves is measured.
The script using the rotate1 function (printf) ran significantly faster:
Code:
real 1m5.928s
user 0m38.734s
sys 0m42.828s
and used between 30.4% and 47.7% of the CPU (the median was 45.5%).
The script using the rotate2 function (awk) ran significantly slower:
Code:
real 3m27.683s
user 2m14.988s
sys 1m11.728s
and used between 36.7% and 43.4% of the CPU (the median was 41.0%).
For my purposes these differences aren’t crucial, because my final script will run only once a second.
***
Now I have the next question.
The output of the script is:
Code:
011
110
101
100
but I need to present the data differently. Instead of “0” I need “A” and instead of “1” I need “B”.
At the moment I pipe the variables through the rotate function and then pass the result through sed:
echo -e "$VAR1\n$VAR2\n$VAR3" | rotate_1_or_2 | sed -E 's/0/A/g;s/1/B/g'
I believe it’s possible to add a step to the rotate1 or rotate2 function that performs this substitution. I’ll be grateful for some advice.
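One possible way to fold the substitution into the awk-based function is to call gsub() on each finished row in the END block. This is a sketch only: rotate_ab is a hypothetical name, and it uses substr() instead of -F "" so that it does not depend on the empty-field-separator extension.

```shell
#!/bin/bash
# rotate_ab (hypothetical name): transpose stdin and map 0 -> A, 1 -> B.
rotate_ab() {
    awk '{
        if (length($0) > n) n = length($0)
        for (i = 1; i <= length($0); i++)
            a[i] = a[i] substr($0, i, 1)
    }
    END {
        for (i = 1; i <= n; i++) {
            gsub(/0/, "A", a[i])   # replace every 0 in the finished row
            gsub(/1/, "B", a[i])   # replace every 1 in the finished row
            print a[i]
        }
    }'
}

printf '%s\n' 0111 1100 1010 | rotate_ab
```

For the sample input this should print ABB, BBA, BAB and BAA, which is what the sed filter produces from 011, 110, 101 and 100.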
Original Poster
Here’s my script (binary clock):
Code:
#!/bin/bash
H=`date +"%H"`
M=`date +"%M"`
S=`date +"%S"`
Hl=`echo $H | sed -E 's/(.)./\1/'`
Hr=`echo $H | sed -E 's/.(.)/\1/'`
Ml=`echo $M | sed -E 's/(.)./\1/'`
Mr=`echo $M | sed -E 's/.(.)/\1/'`
Sl=`echo $S | sed -E 's/(.)./\1/'`
Sr=`echo $S | sed -E 's/.(.)/\1/'`
dectobin() {
    echo $1 | perl -e "printf(\"%04b\n\", <STDIN>)"
}
rotate1() {
    for (( i=0 ; i<${#Hl} ; i++ )); do
        printf '%s%s%s%s%s%s\n' "${Hl:i:1}" "${Hr:i:1}" "${Ml:i:1}" "${Mr:i:1}" "${Sl:i:1}" "${Sr:i:1}"
    done
}
rotate2() {
    awk -F "" '{for (i=1; i<=NF; i++) a[i]=a[i]$i}
        END {i=1; while (i in a) {print a[i]; i++;}}'
}
Hl=`dectobin $Hl`
Hr=`dectobin $Hr`
Ml=`dectobin $Ml`
Mr=`dectobin $Mr`
Sl=`dectobin $Sl`
Sr=`dectobin $Sr`
echo "HH MM SS"
echo -e "$Hl\n$Hr\n$Ml\n$Mr\n$Sl\n$Sr" | rotate1 | sed -E 's/0/ /g;s/1/*/g;s/(..)(..)(..)/\1|\2|\3|/'
The above script gets the hours (H), minutes (M), and seconds (S), splits each into a left and a right digit (Hl, Hr, Ml, Mr, Sl, and Sr), converts those decimal digits into 4-bit binary strings using the dectobin function, and rotates the rows into columns using the rotate1 or rotate2 function. Finally, it filters the output through sed.
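The digit splitting and binary conversion can also be done inside bash itself, which avoids starting sed and perl processes on every run. The sketch below is an illustration and not part of the original script: dectobin4 is a hypothetical name, and T holds a fixed sample value where the real script would use the output of date.

```shell
#!/bin/bash
# dectobin4 (hypothetical name): convert one decimal digit (0-9) to a
# 4-bit string using shell arithmetic only.
dectobin4() {
    local n=$1 bits="" mask
    for mask in 8 4 2 1; do
        # test each bit from the most significant to the least significant
        bits+=$(( (n & mask) ? 1 : 0 ))
    done
    printf '%s\n' "$bits"
}

T="23"        # sample two-digit field; assumed to come from date in practice
Tl=${T:0:1}   # left digit, taken by substring expansion instead of sed
Tr=${T:1:1}   # right digit

dectobin4 "$Tl"   # prints 0010
dectobin4 "$Tr"   # prints 0011
```

Splitting into single digits before the arithmetic also sidesteps the problem that bash would treat a value such as 08 as an invalid octal number.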
The data after rotating the rows into the columns but before filtering through sed looks like:
Code:
000101
000000
111000
011111
The same data filtered through sed and with the label looks like:
Code:
HH MM SS
  | *| *|
  |  |  |
**|* |  |
 *|**|**|
It’s easy to add the vertical lines to the output by changing the printf command from:
Code:
printf '%s%s%s%s%s%s\n'
to:
Code:
printf '%s%s|%s%s|%s%s|\n'
The harder part is converting every “0” to a space and every “1” to an asterisk. As you can see, I currently filter the output through the following command:
Code:
sed -E 's/0/ /g;s/1/*/g;s/(..)(..)(..)/\1|\2|\3|/'
But I wonder if it’s possible to perform the above substitutions using printf from rotate1 function or awk from rotate2 function in order to avoid the additional filtering through sed.
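One way this might be done in the printf-based function is to build each row into a variable with printf -v and then apply bash pattern substitution before printing. This is a sketch under assumptions: clock_rows is a hypothetical name, it takes the six 4-bit strings as arguments instead of reading global variables, and the sample arguments correspond to the time 23:39:19.

```shell
#!/bin/bash
# clock_rows (hypothetical name): rotate six 4-bit strings into four rows,
# add the column separators, and turn 0 into a space and 1 into an asterisk.
# Arguments: Hl Hr Ml Mr Sl Sr
clock_rows() {
    local i row
    for (( i = 0; i < 4; i++ )); do
        # take bit i of every argument and store the row in a variable
        printf -v row '%s%s|%s%s|%s%s|' \
            "${1:i:1}" "${2:i:1}" "${3:i:1}" "${4:i:1}" "${5:i:1}" "${6:i:1}"
        row=${row//0/ }    # every 0 becomes a space
        row=${row//1/*}    # every 1 becomes an asterisk
        printf '%s\n' "$row"
    done
}

echo "HH MM SS"
clock_rows 0010 0011 0011 1001 0001 1001
```

Because the substitution is done by the shell, no sed process is needed, and the separators come from the printf format string.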
Original Poster
PTrenholme,
I compared your rotate1 function with the rotate1 function suggested by David the H.
Benchmark for David the H.’s function:
Code:
#!/bin/bash
VAR1="0111"
VAR2="1100"
VAR3="1010"
rotate1() {
    for (( i=0 ; i<${#VAR1} ; i++ )); do
        printf '%s%s%s\n' "${VAR1:i:1}" "${VAR2:i:1}" "${VAR3:i:1}"
    done
}
time for n in `seq 1 25000`
do
    echo -e "$VAR1\n$VAR2\n$VAR3" | rotate1 > /dev/null
done
Benchmark for PTrenholme’s function:
Code:
#!/bin/bash
VAR1="0111"
VAR2="1100"
VAR3="1010"
rotate1()
{
    local i j max
    max=0
    for i in $*
    do
        [ ${max} -lt ${#i} ] && max=${#i}
    done
    for (( i=0 ; i<${max} ; i++ ))
    do
        for j in $*
        do
            printf '%s' "${j:${i}:1}"
        done
        printf '\n'
    done
}
time for n in `seq 1 25000`
do
    rotate1 $VAR1 $VAR2 $VAR3 > /dev/null
done
I ran the earlier benchmarks, reported in post #5, after closing all applications and the Internet connection, because I wanted to compare CPU usage as well as execution time. I ran the present benchmarks on a system with various applications running and the Internet connection active, and measured execution time only. That explains the difference between the earlier and the present results for David the H.’s function: on the idle system the real time was 1m5.928s, while on the busy system it is 1m55.536s.
The result for David the H.’s function:
Code:
real 1m55.536s
user 0m38.895s
sys 0m46.289s
The result for PTrenholme’s function:
Code:
real 0m28.595s
user 0m21.478s
sys 0m0.655s
So your rotate1 function runs about four times faster than David the H.’s function (0m28.595s versus 1m55.536s real time). It’s an incredible improvement.
Your function is also more general, because the variables don’t have to be hard-coded inside it.
I’m really impressed by your work.
H_TeXMeX_H,
Quote:
Originally Posted by H_TeXMeX_H
Interesting script.
I’m not sure whether that’s a serious opinion or an ironic one.
My script combines shell commands, sed, Perl, and AWK. It’s a melting pot of different programming techniques. I’m aware it’s possible to get the same result in a more elegant form using Perl or AWK alone. Unfortunately, I don’t know either of them well enough to code that.
So if it was irony, it was justified.
***
My question from post #7 is still open: is it possible to avoid the substitution performed with sed:
Code:
sed -E 's/0/ /g;s/1/*/g;s/(..)(..)(..)/\1|\2|\3|/'
by implementing such a substitution into rotate1 function in order to change the rotated data such as:
Well, I mean it is interesting in terms of being a binary clock written in bash, and PTrenholme's frankenstein script written in many different languages, and yet has good performance. I meant no irony.
Original Poster
I prepared the binary clock, among other scripts, for the next release of wminfo (see: http://dockapps.windowmaker.org, or http://freecode.com, or http://slackbuilds.org). The wminfo program is a dockable application for Window Maker that displays various information using plugins. The binary clock is one such plugin. The performance of the plugins becomes crucial when you run a dozen or so instances of wminfo, each with a different plugin. One poorly designed plugin isn’t a problem. Twelve poorly designed plugins can consume a lot of system resources. So I spend a lot of time testing and optimizing the plugins, or writing alternative versions of the most useful and most demanding ones.