LinuxQuestions.org


Ambalo 03-17-2016 08:52 AM

Translate an array from bash to awk
 
Hi


I've got a script in bash. Unfortunately, I have to multiply every entry of an array (which has 4000 entries) by a number <0. bash can't do that. So I want to translate the bash array into awk. Maybe it would be better to use awk for the whole script, but I have no experience with awk.

Could anyone help?

Thanks in advance :-)
Sebastian

pan64 03-17-2016 09:06 AM

looks like something like this works:
Code:

B=( $(echo ${A[@]} | awk ' BEGIN { RS=" " } { printf -0.25*$0 " " } ') )
(if I understand it well)

rtmistler 03-17-2016 09:07 AM

There are plenty of people here who, I'm sure, have great expertise with both BASH and AWK and could do it.

You're asking for a handout. Instead, post what you have in BASH and at least try to start with AWK.

Personally I'd suggest using a program. I don't see AWK as a scripting language, but rather as a command.

My other suggestion is that since you've learned that AWK supports the capability to multiply negative numbers, why not stage one calculation in AWK as your first try and then look to expand from that?

I'm sure reviewers would be very pleased to see something like:
  1. Here's my BASH script (use [code] tags by the way)
  2. Here's one calculation that I converted to AWK
  3. Any suggestions how I would apply that concept to an entire array? Do I just step through the list, or use array indexes?
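
A single calculation of the kind asked for in point 2 could look like the sketch below. It is only an illustration: the variable names step and dt are assumptions, and -v is used to hand the shell values to awk instead of pasting them into the program text.

```shell
# One floating-point multiplication done by awk on behalf of the shell.
step=100        # example iteration number (assumed value)
dt=3.17e-7      # time step per iteration, as given later in the thread

# LC_ALL=C forces "." as the decimal separator regardless of the user's locale.
result=$(LC_ALL=C awk -v a="$step" -v b="$dt" 'BEGIN { printf "%.10f\n", a * b }')
echo "$result"   # prints 0.0000317000
```

Once one value works, the same awk call can be placed inside a loop over the array, or the whole array can be piped through a single awk process.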

Ambalo 03-17-2016 09:18 AM

Sorry. Here's my bash-script:

Code:

#!/bin/bash
ZS=($(find . -name "*.plt"))
i=1

Solutiontime=$((3.17/10000000))

for Dataname in ${ZS[@]}
do

  Cut=${Dataname:13}
  Timesteps=${Cut:0:${#Cut}-4}

Array[i]=$Timesteps

i=$((i++))
p=$((i++))
done

Amount=$p

for ((i=1;i<10;i++))
do

Array[i]=${i}00

done


for ((i=1;i<$Amount+1;i++))
do

Array[i]=$((${Array[i]}*$Solutiontime))

done


Array[0]=Solutiontime

I marked the problems. Then I tried this:

Code:

Variable=$(awk "BEGIN{print 10 ** -7}")

Solutiontime=$(awk "BEGIN{print 3.17 * $Variable}")

for ((i=1;i<$Amount+1;i++))
{

Array[i]=$(awk "BEGIN{print ${Array[i]} * $Solutiontime}")

}

But that doesn't work. Either I translate the array into awk, multiply, and translate it back to bash, or I write the whole program in awk.


I would be obliged if someone can help me.
Sebastian

pan64 03-17-2016 09:51 AM

I gave you a tip to go thru the array with a single awk. That will be much quicker.

you ought to use bc instead of awk:
Code:

Solutiontime=$(echo "3.17 * $Variable" | bc )

Ambalo 03-17-2016 10:26 AM

Sorry for my "low skill questions", but I have only been using bash since 10 am.

Just to clarify what you are saying: you advise me to use
Code:

Variable=$(awk "BEGIN{print 10 ** -7}")
Solutiontime=$(echo "3.17 * $Variable" | bc )

in place of
Code:

Variable=$(awk "BEGIN{print 10 ** -7}")

Solutiontime=$(awk "BEGIN{print 3.17 * $Variable}")

Furthermore you tell me to do
Code:

B=( $(echo ${A[@]} | awk ' BEGIN { RS=" " } { printf -0.25*$0 " " } ') )
in place of
Code:

for ((i=1;i<$Amount+1;i++))
do

Array[i]=$((${Array[i]}*$Solutiontime))

done

If I do that, the terminal gives me this error:
Code:

(standard_in) 1: parse error
-bash: 100*: syntax error: operand expected (error token is "*")

Please help me again.

Edit: I don't want to multiply by a number <0. I want to multiply by a number 0 < x < 1.
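
A likely reading of the two error messages: bc does not understand exponent notation such as 1e-07, which is what awk prints for 10 ** -7, so Solutiontime ends up empty. The $(( )) expression then sees "100*" with nothing after it, and bash arithmetic is integer-only in any case. A minimal sketch that sidesteps both problems, assuming awk does all the floating-point work and prints plain decimals (the variable names entry and scaled are made up for the example):

```shell
# Print the factor in fixed notation so no tool has to parse "1e-07".
Variable=$(LC_ALL=C awk 'BEGIN { printf "%.7f", 1e-7 }')
Solutiontime=$(LC_ALL=C awk -v v="$Variable" 'BEGIN { printf "%.10f", 3.17 * v }')

# Multiply one array entry by the factor; awk replaces $(( )) here.
entry=100
scaled=$(LC_ALL=C awk -v e="$entry" -v s="$Solutiontime" 'BEGIN { printf "%.7f", e * s }')
echo "$Variable $Solutiontime $scaled"
```

With Variable in this plain decimal form, the bc line from the earlier post should also be able to parse its input.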

grail 03-17-2016 12:06 PM

So I am not 100% on what the actual issue is, however, I see glaring problems in the current bash solution that do not include the use of decimal numbers.
Also, I am not really following the logic either :(
Code:

# below finds all files from current folder and below that have files of any type (including directories)
# that end in '.plt' ... so first issue I see is no limiting to files of type file only
ZS=($(find . -name "*.plt"))
# set 'i' variable to 1, but later you comfortably use standard for loop notation
i=1

# obviously the below will not work and either the bc or awk option can be used for correct calculation
Solutiontime=$((3.17/10000000))

# no double quotes around array value can cause word splitting in unwanted ways
for Dataname in ${ZS[@]}
do
  # remove the first 13 characters of the data name ... would be nice to know what that translates to
  Cut=${Dataname:13}
  # Cut is never used again, why not simply use 'Dataname' variable
  # Timesteps=${Dataname:13:${#Dataname}-17}
  Timesteps=${Cut:0:${#Cut}-4}

  # Set i'th element of array to 'Timesteps' value
  Array[i]=$Timesteps

  # Here it gets tricky.  Set 'i' equal to the value of 'i' and then increase by 1 (so on first loop it will still be 1)
  i=$((i++))
  # Now assign current value of 'i' (still 1) to 'p' and increase 'i' ... now 'i' is 2 on first loop
  p=$((i++))
  # Even though 'i' is at the correct value by this point, you can see a redundant step
done

# Set 'Amount' to final value of 'p'
Amount=$p

# The loop below now overwrites elements 1 through 9 of 'Array'
# Again, not following the logic here
for ((i=1;i<10;i++))
do

  Array[i]=${i}00

done

# This array loop has the ability to step off the end of the array ... not sure why we would want to do that
# It also (again) overwrites at least the first up to 10 values which will now have been saved into 'Array' 3 times
for ((i=1;i<$Amount+1;i++))
do

  # Again use bc or awk for non-integer math
  Array[i]=$((${Array[i]}*$Solutiontime))

done

# Not sure why zeroth element needs to be this value
Array[0]=Solutiontime

As you can see, I have many more questions than answers at this point in time :(
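
The increment behaviour described in the comments above can be checked in isolation. The sketch below also shows one common alternative, appending with += and reading the count from ${#Array[@]}, which avoids the separate counters. The sample values are made up, and note that this alternative starts the array at index 0 rather than 1.

```shell
i=1
i=$((i++))    # i++ yields the old value 1, which is assigned back: i stays 1
p=$((i++))    # p gets 1, and i becomes 2 as a side effect

# Alternative without manual counters (values are placeholders):
Array=()
for Timesteps in 0100 0200 0300; do
  Array+=("$Timesteps")       # append at the next free index, starting at 0
done
Amount=${#Array[@]}           # number of stored elements
echo "i=$i p=$p Amount=$Amount"   # prints i=2 p=1 Amount=3
```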

rtmistler 03-17-2016 12:15 PM

Quote:

Originally Posted by grail (Post 5517060)
So I am not 100% on what the actual issue is, however, I see glaring problems in the current bash solution that do not include the use of decimal numbers.
Also, I am not really following the logic either :(

I concur and my recommendation is that the OP consider working through the form of a single calculation they wish to attain, because of the following additional points they've made.
Quote:

Originally Posted by Ambalo (Post 5517010)
Sorry for my "low skill questions", but I have only been using bash since 10 am.

Quote:

Originally Posted by Ambalo (Post 5516942)
Maybe it would be better to use awk for the whole script, but I have no experience with awk.

You wish to multiply one or more negative numbers. Are they integers or floating point numbers? I think you wish to multiply floating point numbers, and you say one of them could be negative. I'd solve the general problem of multiplying two floating point numbers of any sign first. Once you understand how to do that in whatever script or language you end up using, look to extend it to all elements of an array.

I think AWK can better deal with the floating point calculations and BASH can better deal with the array manipulations. So I'd learn how to do the calculations using AWK and write a BASH script to manage the processing of each data point.

Ambalo 03-17-2016 12:19 PM

Let me explain:

I have 4000 data files that were exported from a fluid dynamics program. The file names are "TecN210500-xxxxx.plt", where "xxxxx" is the iteration number. I need the iteration number to calculate the solution time of my simulation. The first loop manipulates the file name so that only the "xxxxx" part is written into the array. The first 9 file names look like "TecN210500-0100.plt", "TecN210500-0200.plt", ... Furthermore, the variable "p" is needed because after every pass through the loop the variable i equals 1.

The second loop deletes the leading zero, because bash can't calculate with numbers that have a leading zero. The variable "Amount" counts the data files for the last loop.

The last loop multiplies every entry of the array by the "Time Step" of the simulation. After that, I want to write this information into the data file "TecN210500-xxxxx.plt". The first line in the data file looks like "Static Pressure", so I write "Solutiontime" into the first entry of the array.

Hope it's clearer now :) and sorry for my bad English.

Ambalo 03-17-2016 12:26 PM

I don't want to multiply negative numbers. Sorry for the misunderstanding.

The math operations are:
100 * 3.17*10^-7
200 * 3.17*10^-7
300 * 3.17*10^-7
........

Input data example (4000 data files overall):
TecN210500-0100.plt

TecN210500-0200.plt

TecN210500-0300.plt

TecN210500-0400.plt

......

0100 is the iteration number. The first loop extracts the "0100" from the file name. The second loop "deletes" the leading zero -> "100". The last loop multiplies by the time step -> "100 * 3.17*10^-7".
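
The three steps described here can be collapsed for a single file name. A sketch, assuming the names always follow the TecN210500-xxxxx.plt pattern; the 10# prefix tells bash to read the digits as base 10, so the leading zero no longer matters and the second loop becomes unnecessary. The variable Base is an addition for the example.

```shell
Dataname="./TecN210500-0100.plt"   # example path as find would print it

Base=${Dataname##*/}               # strip the directory part
Timesteps=${Base#*-}               # drop everything up to the first "-"
Timesteps=${Timesteps%.plt}        # drop the extension  -> "0100"
Timesteps=$((10#$Timesteps))       # force base 10       -> 100

Solutiontime=$(LC_ALL=C awk -v n="$Timesteps" 'BEGIN { printf "%.7f", n * 3.17e-7 }')
echo "$Timesteps $Solutiontime"    # prints 100 0.0000317
```

Unlike the fixed offsets in ${Dataname:13}, the pattern-based expansions keep working if the directory prefix changes.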

grail 03-17-2016 03:37 PM

All this extra information is more helpful :)

I agree with rtmistler that you should solve for a single solution first as the looping part is a no-brainer.

I see now the part of the filename you wish to extract and whilst there are other ways to do this I still see issues with the logic of how you have gone about some things.
I was going to say that the second loop does not remove the leading zero from your data, but I now see that your solution to do this was to overwrite the first 9 elements.
This method obviously works, as long as the format never changes in the future.

I am slightly perplexed on a point though, if all the file names are uniform and are always in amounts of 100 up to 4000, why not just perform the task in a for loop from 100 to 4000 step 100?
Unless I have missed that these numbers may change I am not seeing why you need to go through all the trouble of stripping the file names and so on.

If the above is correct, this becomes a trivial loop where awk could then be used to perform the task:
Code:

awk 'BEGIN{for(i = 100; i <= 4000; i+=100)array[j++] = i * (3.17 * 10**-7)}'
This simply creates the array so whatever it is you want to do with it is up to you.
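
As posted, the one-liner fills an array inside awk and then exits without printing anything. To get the values back into bash, awk has to print them. A sketch, assuming the number of export files is held in a shell variable (Amount, as in the original script; Solution is a made-up name) and writing the factor as 3.17e-7, since the ** operator is not part of POSIX awk:

```shell
Amount=5    # number of export files (placeholder value)

# awk prints one solution time per line; the shell splits that into an array.
Solution=( $(LC_ALL=C awk -v n="$Amount" 'BEGIN {
  for (i = 1; i <= n; i++) printf "%.7f\n", i * 100 * 3.17e-7
}') )

echo "${#Solution[@]} values, last: ${Solution[4]}"   # prints 5 values, last: 0.0001585
```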

Ambalo 03-18-2016 06:30 AM

Thank you for the response.

Quote:

I was going to say that the second loop does not remove the leading zero from your data, but I now see that your solution to do this was to overwrite the first 9 elements.
This method obviously works, as long as the format never changes in the future.
The first elements are always the same.

Quote:

I am slightly perplexed on a point though, if all the file names are uniform and are always in amounts of 100 up to 4000, why not just perform the task in a for loop from 100 to 4000 step 100?
Not every simulation has 4000 steps. That's why I count the number of data files.

Quote:

Unless I have missed that these numbers may change I am not seeing why you need to go through all the trouble of stripping the file names and so on.
The problem is the export from the fluid dynamics program. The solution time is not in the exported data file, so I have to take the long way round via the iteration steps.

The time steps do not go from 100 to 4000 in steps of 100. There are 100 * 4000 iterations: every export covers 100 iterations, and 4000 export files are created. Thanks for the awk array. It's actually really simple. I will try your posted code. :)

MadeInGermany 03-18-2016 03:09 PM

As pan64 already suggested, the RS=" " in awk is elegant, efficient, and has no limits.
Code:

# Array[0..9] := 100..1000
Array=( {1..10}00 )
echo ${Array[@]} | awk '{print $1*3.17E-7}' RS=" "

Assigning this to another array: you must enclose it with ` ` or $( ) in order to get a string/list, then enclose it by ( ) to feed the list into an array.
Code:

Array=( `echo ${Array[@]} | awk '{print $1*3.17E-7}' RS=" "` )
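
A variant of the same idea, sketched with printf '%s\n' so that awk receives one element per line and no RS setting is needed, and with an explicit output format so that small values are not printed in exponent form. The "%.7f" precision is an assumption based on the time step used in this thread.

```shell
Array=( {1..10}00 )     # 100 200 ... 1000, as in the post above

# One awk process handles the whole array; $( ) plus ( ) turns the lines back into elements.
Array=( $(printf '%s\n' "${Array[@]}" |
          LC_ALL=C awk '{ printf "%.7f\n", $1 * 3.17e-7 }') )

echo "${Array[0]} ${Array[9]}"   # prints 0.0000317 0.0003170
```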

Ambalo 03-21-2016 08:51 AM

Thank you all for the good advice.

Another problem appeared:

The first solution times are:
Code:

3,17e-05
6,34e-05
9,51e-05

with following command:

Code:

awk "BEGIN {print $Timestep * 3.17 * 10**-7 }"
I want to remove the "e-0x".

If I do this:
Code:

awk "BEGIN {print $Timestep * 3.17 * 10**-7 }" | awk '{ print sprintf("%.9f", $1)}'
I get this output:
Code:

0,000031700
0,000063400
0,000095100

But I've got a problem with the results.

The following results were produced:
Code:

Timestep  Solutiontime  Exact Solutiontime
399000    0,126483      0,126483
399100    0,126515      0,1265147
399200    0,126546      0,1265464
399300    0,126578      0,1265781
399400    0,12661       0,1266098
399500    0,126641      0,1266415
399600    0,126673      0,1266732
399700    0,126705      0,1267049
399800    0,126737      0,1267366
399900    0,126768      0,1267683
400000    0,1268        0,1268

(Exact Solutiontime = Timestep * 3.17 * 10^-7)

Why does awk round the values? And how can I turn this off?

pan64 03-21-2016 08:57 AM

probably this helps:
https://www.gnu.org/software/gawk/ma...ogramming.html
https://www.gnu.org/software/gawk/ma...act-Arithmetic
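
The rounding is most likely awk's default output format rather than a calculation error: print converts non-integer numbers using OFMT, which defaults to "%.6g" (six significant digits). In the two-stage pipeline above, the first awk has already shortened the value before the second one formats it. A sketch of doing it in one step with printf; LC_ALL=C is only there to get "." instead of the locale's ",", and the names short and exact are made up:

```shell
Timestep=399100

# print uses OFMT (%.6g by default): six significant digits
short=$(LC_ALL=C awk -v t="$Timestep" 'BEGIN { print t * 3.17e-7 }')

# printf with an explicit format keeps the digits that are needed
exact=$(LC_ALL=C awk -v t="$Timestep" 'BEGIN { printf "%.7f\n", t * 3.17e-7 }')

echo "$short $exact"   # prints 0.126515 0.1265147
```

Setting OFMT to something like "%.9g" in a BEGIN block is another option if print is preferred over printf.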

