[SOLVED] processing 9 scripts sequentially or combining them into one

atjurhs · 05-05-2022, 10:28 AM

and i got it with the filtering, YEA!!! this 1st step of the project is done

thanks so much to all who helped!!!

Tabby

atjurhs · 05-05-2022, 11:11 AM

well i guess i'm not done with this just yet

i didn't think (and i should have) that there are several csv inputFiles in my directory (all with unique names of course) so the csv outputFiles also need to have unique names (the script that works doesn't do that), and it would be best if the outputFile's name was automated as part of the script (not changed each time i run the script) based on the prefix of the inputFile's name, for example:

Code:

awk_script 12.3_input.csv > 12.3_output.csv
awk_script 45.6_input.csv > 45.6_output.csv
awk_script 78.9_input.csv > 78.9_output.csv
.
.
.

i'll give credit to boughtonp for the awk script (up above) that he did almost all of the work on, so i'm willing to do the leg work on this one if someone can point me in the right direction...

eventually i plan to "batch" process all the input files in the directory...

Tabby

computersavvy · 05-05-2022, 07:37 PM

Code:

for INFILE in *.csv
do
   OUTFILE=OUT"$INFILE"
   # Placeholder: run your script(s) here, reading from "$INFILE" and writing to "$OUTFILE"
   your_command "$INFILE" > "$OUTFILE"
done

This will process all the files in the current directory and manage both reading each file and renaming the output.

Once you have one script (or group of scripts) that gives the result you want simply enclose them in a "for" loop like above to 'batch' process all the files.

atjurhs · 05-06-2022, 10:14 AM

well i tried this a couple ways

Code:

for INFILE in *.csv
do
   OUTFILE=OUT"$INFILE"
   awk script from above 
done

and

Code:

for file in *.csv;
do
    awk script from above
done

both gave syntax error near unexpected tokens and commands not found

however i did get the script to run as an awk command line

Tabby

michaelk · 05-06-2022, 10:31 AM

What was posted was not code to use verbatim...
If a file is named 12.3_input.csv, output file is 12.3_output.csv

Code:

#!/bin/bash
for file in *.csv
do
  outfile=${file/input/output}
  awk_script $file > $outfile
done

atjurhs · 05-06-2022, 11:04 AM

yikes, michaelk, what you shared didn't create an outfile and it deleted the data in the input file.

before trying to implement a for loop, the code currently stands as

Code:

#! /bin/bash

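BEGIN {FS=OFS=","}
# Header row: print the new column names, then skip to the next record
NR==1 {print $3, $4, "ABC", "DEF", "GHI", $9, $10, $11, $13; next}
{
  SplitCol = 5
  # Split column 5 on "_"; split() returns the number of pieces
  ColOffset = split($SplitCol, ColArray, "_") - 1
  # Shift the columns after SplitCol to the right to make room for the pieces
  for (i = NF; i > SplitCol; --i)
    $(i+ColOffset) = $i
  # Write the pieces into columns SplitCol .. SplitCol+ColOffset
  for (i in ColArray)
    $(i+SplitCol-1) = ColArray[i]
  if ($9 == 435.0 && $17 == 100.0)
    print $3, $4, $6, $8, $9, $13, $14, $15, $17
}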

i execute this using

Code:

awk -f script.awk inputFile.csv > outputFile.csv

but don't get me wrong, i do appreciate your helping me!

Tabby

pan64 · 05-06-2022, 11:25 AM

It would be nice if you learned awk and tried to adjust that script to your needs instead of instructing us to do that job.
On the other hand, in your last post you did not ask anything, so I don't really know whether that is OK for you or whether you still need help.

michaelk · 05-06-2022, 11:38 AM

Sorry. Now is a good time to back up your data if you don't already have a backup. If you want to use posted code verbatim, you need to post the exact input file names and the desired output. It took many posts to get you to post more or less exact data and the exact desired output, and doing that up front avoids repeating the same mistakes throughout this thread.

You also might want to consider saving the files to another directory to keep the input files separate from the output files.

There are many many ways to modify strings to produce any naming scheme you desire.
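
A few of those ways, sketched with bash parameter expansion. The file name 12.3_input.csv is the example from earlier in the thread; the variable names and the out/ directory are made up for illustration only:

```shell
#!/bin/bash
# Illustrative only: derive several output names from one input name.
file=12.3_input.csv

swapped=${file/input/output}     # replace the first "input" with "output"
stem=${file%.csv}                # strip a trailing ".csv"
prefix=${file%%_*}               # keep everything before the first "_"

echo "$swapped"                  # 12.3_output.csv
echo "$stem.out"                 # 12.3_input.out
echo "out/${prefix}_output.csv"  # out/12.3_output.csv
```

Note that ${file/input/output} returns the name unchanged when it does not contain "input", so the result should not be used as a redirection target without checking it first.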

Here is a separate script that runs your awk script:

Code:

#!/bin/bash
for infile in *.csv
do
  #outfile=${file/input/output} 
  awk -f script.awk $inFile.csv > /path/to/output_directory/$outFile.csv
done

Just an FYI since you are running your script via awk (awk -f script.awk ...) the #!/bin/bash is treated as a comment.

MadeInGermany · 05-06-2022, 11:43 AM

Your script (say its name is script.awk) is a pure awk script but has the wrong shebang (the #! in the first line).
It should be
#! /usr/bin/awk -f

But if you call it explicitly with the awk interpreter then the shebang is not considered (becomes a comment).
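
To illustrate that point: when a script is passed to awk with -f, a first line starting with #! is just an awk comment. A minimal sketch, assuming a throwaway file named demo.awk and a one-line CSV (both made up for this demonstration):

```shell
#!/bin/bash
# Sketch: awk -f ignores a "#!" first line, even one that names bash.
tmpdir=$(mktemp -d)                 # scratch directory for the demo files
cat > "$tmpdir/demo.awk" <<'EOF'
#! /bin/bash
BEGIN { FS = OFS = "," }
{ print $2, $1 }
EOF
printf 'a,b\n' > "$tmpdir/in.csv"

# The shebang line is skipped as a comment; the two fields are swapped.
result=$(awk -f "$tmpdir/demo.awk" "$tmpdir/in.csv")
echo "$result"                      # b,a
rm -r "$tmpdir"
```

The shebang only matters when the file is made executable and run directly (for example ./script.awk inputFile.csv); in that case it has to name awk, as shown above.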

The other script is a shell script

Code:

#!/bin/bash
matchstr=input
newstr=output
for file in *$matchstr*.csv
do
  outfile=${file/$matchstr/$newstr}
  awk -f script.awk "$file" > "$outfile"
done

As you can see, the bash script loops thru the files and for each file runs the script.awk
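
The same loop can also write into a separate directory, as suggested earlier in the thread. A rough sketch, assuming a hypothetical helper named batch_convert that takes the awk program file, the input directory and the output directory as arguments:

```shell
#!/bin/bash
# Sketch: batch loop that keeps the outputs apart from the inputs.
# batch_convert is a hypothetical helper:
#   $1 = awk program file, $2 = input directory, $3 = output directory
batch_convert() {
  local prog=$1 indir=$2 outdir=$3 file name
  mkdir -p "$outdir"
  for file in "$indir"/*input*.csv; do
    [ -e "$file" ] || continue        # the glob matched nothing
    name=${file##*/}                  # strip the directory part
    awk -f "$prog" "$file" > "$outdir/${name/input/output}"
  done
}
```

With the names used in this thread it might be called as batch_convert script.awk . ./results, which would leave 12.3_input.csv untouched and write ./results/12.3_output.csv.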

atjurhs · 05-06-2022, 11:45 AM

hi pan64, i guess i'm learning awk as i go, doing my best to search through various resources i find on the net and applying the info i learn as best i can. i'm not a coder by any means and i don't have access to a coder to help me, so i struggle along doing my best. i've never seen an on-line class for awk, or i'd take it (other programming languages look too intimidating), and really all i ever need to do is data manipulation on csv files and i think awk and sed are well suited for this. hopefully this answers your question?

atjurhs · 05-06-2022, 12:42 PM

michaelk your script almost works

when i remove the .csv from $infile.csv your script correctly processes the last file (none of the preceding files) and renames that one outfile as outfile.csv.csv

MadeInGermany your script does not create any outfiles.csv and it deletes all the data in the infiles.csv ooops

atjurhs · 05-06-2022, 01:05 PM

tinkering around with what you guys showed me, i wrapped the script.awk with a for loop like this

Code:

#!/bin/bash
for file in *.csv;
 do
   awk 'BEGIN... the rest of the awk script
       }' "$file" > "$(basename "file").csv"
done

and it worked

only the output files have a double csv extensions, so they look like blahblablah.csv.csv so i'm really really close. how can i get rid of the double csv extension? or even better would be to rename the outputfile (i'm working on that one). i'm thinking of maybe a call to sed before the done?

thanks soooo much!!!

Tabby

computersavvy · 05-06-2022, 07:46 PM

Try changing this

Code:

 awk 'BEGIN... the rest of the awk script
       }' "$file" > "$(basename "file").csv"

to this

Code:

 awk 'BEGIN... the rest of the awk script
       }' "$file" > "out$(basename -s .csv "$file").csv"

That will allow the basename command to strip the original .csv from $file and add "out" at the beginning of the original file name for the output file.
You could also simplify that with

Code:

 awk 'BEGIN... the rest of the awk script
       }' "$file" > out"$file"

so all that is added is the prefix out to the original file name.

There are lots of variations possible so you have to decide what works best for your needs.
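
For comparison, here is what each variation produces for a single name. This is only a sketch: blahblablah.csv is the example name from the post above, and the block uses the POSIX form basename "$file" .csv, which gives the same result as basename -s .csv "$file":

```shell
#!/bin/bash
# Sketch: three ways to build the output name for one example input.
file=blahblablah.csv

doubled="$(basename "$file").csv"          # basename keeps .csv, so it doubles
stripped="out$(basename "$file" .csv).csv" # suffix stripped first, then re-added
prefixed="out$file"                        # just prefix the original name

printf '%s\n' "$doubled" "$stripped" "$prefixed"
# blahblablah.csv.csv
# outblahblablah.csv
# outblahblablah.csv
```

The last two give the same name, which is why the shorter form is enough when only a prefix is wanted.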

MadeInGermany · 05-07-2022, 07:10 AM

I have edited my post, so it won't write back to the input file if nothing matched.
--

Code:

#!/bin/bash
for file in *.csv;
 do
   awk 'BEGIN... the rest of the awk script
       }' "$file" > "$(basename "file").csv"
done

Even if it were "$file" I do not see the point of using basename here.
But you must write to a new file, writing back to the original file will trash it.
The following strips .csv and adds .out

Code:

#!/bin/bash
for file in *.csv;
 do
   awk 'BEGIN... the rest of the awk script
       }' "$file" > "${file%.csv}.out"
done
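
The reason writing back trashes the file is that the shell opens and truncates the file named after > before awk starts, so awk finds an empty input. A minimal sketch of a guard against that, assuming a hypothetical helper named run_one that takes the awk program file, the input and the output as arguments:

```shell
#!/bin/bash
# Sketch: refuse to run when the output name equals the input name.
# run_one is a hypothetical helper:
#   $1 = awk program file, $2 = input file, $3 = output file
run_one() {
  if [ "$2" = "$3" ]; then
    echo "skipping $2: output would overwrite input" >&2
    return 1
  fi
  awk -f "$1" "$2" > "$3"
}

# Possible use inside the loop from this thread:
# for file in *.csv; do
#   run_one script.awk "$file" "${file%.csv}.out"
# done
```

The check is a plain string comparison, so it only catches identical names; it would not notice two different paths that point to the same file.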

atjurhs · 05-09-2022, 09:48 AM

Thanks guys, you've all been a big help!!!

Tabby