LinuxQuestions.org

altrob 09-27-2006 01:13 PM

Shell script help with function and output format
 
Greetings,

I'm new to writing shell scripts, but after searching through this forum I've gotten most of what I want. However, I'm still having an issue. I am attempting to backload data from hundreds of large CSVs into RRD. Essentially I want to parse info out of hundreds of columns in hundreds of files into the 4 columns of data I want, in one file: the date in Unix time, and the min, max, and average of a reading (say CPU) for the day, with all the days' data compiled into one file. The problem is I can't get my head around how to get the data to return on the same line.

data:

file1.csv
4 lines of headers I don't need
1-JUL-2006 17:41:00.00,111.42,xxx,yyy,zzz,etc..
1-JUL-2006 17:42:00.00,69.87,xxx,yyy,zzz,etc..
1-JUL-2006 17:43:00.00,101.10,xxx,yyy,zzz,etc..
etc...

file2.csv
4 lines of headers I don't need
2-JUL-2006 17:41:00.00,111.42,xxx,yyy,zzz,etc..
2-JUL-2006 17:42:00.00,69.87,xxx,yyy,zzz,etc..
2-JUL-2006 17:43:00.00,101.10,xxx,yyy,zzz,etc..

etc..

What I have is:
Code:

#!/bin/bash
destdir=/home/t4data/working
for f in $destdir/*.csv
        do
                tail -1 $f |cut -c1-12| while read line # gives me the date for the file
                        do
                                date -d "$line" +%s        # convert to unix time
                        done
                tail -n+5 $f |awk -F, ' { print $7 } ' |awk -f sum.awk # min,max,ave
        done

where sum.awk is

Code:

$ cat sum.awk
{
    if (NR == 1) {
        sum=min=max=$1;
    } else {
        sum += $1;
        min = (min < $1) ? min : $1;
        max = (max > $1) ? max : $1;
    }
}
END {
#    print "The average is: " sum/NR " min: " min " max: " max;
    print " min: " min " max: " max " average: "sum/NR;
}

Output:

1153022400
min: 44.45 max: 666.22 average: 136.055
1153108800
min: 51.28 max: 1298.38 average: 374.275
etc...

I realise why the data returns on two separate lines, and I thought it would be simple to correct, but no matter what I try I can't seem to get to my desired format of:

1153022400 min: 44.45 max: 666.22 average: 136.055
1153108800 min: 51.28 max: 1298.38 average: 374.275
etc...

Thoughts?

acid_kewpie 09-27-2006 02:05 PM

bit lost here, are you saying you're only looking at the very last line of each file? if so why is there a do loop there?

now presumably you simply want to get a timestamp into a variable, then get the rrd data into a variable, and then print them both?
Code:

#!/bin/bash
destdir=/home/t4data/working
for f in $destdir/*.csv
        do
                lastline=$(tail -1 $f |cut -c1-12)
                rrddate=$(date -f $lastline +%s)
                rrddata=$(tail -n+5 $f |awk -F, ' { print $7 } ' | awk -f sum.awk )
                echo $rrddate $rrddata
        done

i'd be thinking those tails and whatnot could do with looking at, but i'm not in a position to suss that out without breaking it somewhere... but is that not the kind of result you're after?
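
The reason capturing into variables puts everything on one line is that command substitution, $(...), strips trailing newlines from whatever it captures. A minimal sketch with stand-in values (the printf commands below only imitate the output of date and awk; they are not part of the original script):

```shell
#!/bin/bash
# Stand-in commands: each prints one line ending in a newline,
# much like `date +%s` and `awk -f sum.awk` do in the script above.
rrddate=$(printf '1153022400\n')
rrddata=$(printf 'min: 44.45 max: 666.22 average: 136.055\n')

# $(...) drops the trailing newline of each captured output,
# so a single echo prints both values on one line.
echo "$rrddate $rrddata"
```

Quoting the variables in the echo keeps any internal spacing intact; unquoted, the shell would collapse runs of whitespace.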

druuna 09-27-2006 02:10 PM

Hi,

Just looking at what you asked:

Change this:

date -d "$line" +%s # convert to unix time

Into:

thisDate=`date -d "$line" +%s` # convert to unix time
echo -n $thisDate


Make it a two step process.
1) put date in a variable,
2) output this variable using echo -n (or a printf construct if you like that better).

The echo -n makes sure that there is no newline after it prints the date.

Hope this helps.
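
For the printf construct mentioned in step 2, a minimal sketch might look like the following. The function name print_date_prefix is made up for illustration, and the epoch value is hard-coded here rather than taken from date:

```shell
#!/bin/bash
# Hypothetical helper: print a value followed by one space and no newline,
# so whatever is printed next continues on the same line.
print_date_prefix() {
    printf '%s ' "$1"
}

thisDate=1153022400          # stand-in for `date -d "$line" +%s`
print_date_prefix "$thisDate"
echo "min: 44.45 max: 666.22 average: 136.055"   # stand-in for the awk output
```

printf behaves the same across shells, whereas echo -n is not handled identically by every echo implementation, which is the usual reason to prefer it.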

altrob 09-27-2006 02:39 PM

Quote:

Originally Posted by acid_kewpie
bit lost here, are you saying you're only looking at the very last line of each file? if so why is there a do loop there?

I was only looking at the last line of each file for the date parameter. Each file has hundreds of lines of data, but every line in a particular file has the same date. The idea is to parse each file down to one line of date, min, max, and average for the day, move on to the next file, and repeat.
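
As an aside, the per-file summary described here could also be produced in a single awk pass instead of tail piped through two awk calls. This is only a sketch of that alternative, not the script from this thread; the function name summarise_column and its three arguments are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: summarise one column of a CSV file in a single awk pass.
#   $1 = file, $2 = column number, $3 = number of header lines to skip
summarise_column() {
    awk -F, -v col="$2" -v skip="$3" '
        NR <= skip { next }                 # ignore the header lines
        {
            v = $col + 0                    # force numeric comparison
            if (n == 0) { min = max = v }   # first data row seeds min and max
            if (v < min) min = v
            if (v > max) max = v
            sum += v
            n++
        }
        END {
            if (n > 0)
                printf "min: %s max: %s average: %s\n", min, max, sum / n
        }
    ' "$1"
}
```

With the layout described in the first post, a call such as `summarise_column "$f" 7 4` would cover the same column and header count as the original pipeline.
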
Code:

#!/bin/bash
destdir=/home/t4data/working
for f in $destdir/*.csv
        do
                lastline=$(tail -1 $f |cut -c1-12)
                rrddate=$(date -f $lastline +%s)
                rrddata=$(tail -n+5 $f |awk -F, ' { print $7 } ' | awk -f sum.awk )
                echo $rrddate $rrddata
        done

I had previously tried various forms of what you suggested (I only posted my most recent attempt), but each one, including your code, resulted in a newline, i.e.

1153022400
min: 44.45 max: 666.22 average: 136.055
1153108800
min: 51.28 max: 1298.38 average: 374.275

Quote:

Originally Posted by druuna

thisDate=`date -d "$line" +%s` # convert to unix time
echo -n $thisDate


Make it a two step process.
1) put date in a variable,
2) output this variable using echo -n (or a printf construct if you like that better).

The echo -n makes sure that there is no newline after it prints the date.

Hope this helps.

This is exactly what I needed. I had tried working with echo -n without success, but I wasn't getting the variable set properly with the backticks (though I thought I'd tried that).

It works perfectly! Thanks very much to both of you for your help, especially druuna for providing the key that I was missing.

cheers!

altrob 09-27-2006 02:51 PM

Quote:

Originally Posted by altrob
Code:

#!/bin/bash
destdir=/home/t4data/working
for f in $destdir/*.csv
        do
                lastline=$(tail -1 $f |cut -c1-12)
                rrddate=$(date -f $lastline +%s)
                rrddata=$(tail -n+5 $f |awk -F, ' { print $7 } ' | awk -f sum.awk )
                echo $rrddate $rrddata
        done

I had previously tried various forms of what you suggested (I only posted my most recent attempt), but each one, including your code, resulted in a newline

Let me correct that. That does indeed work, acid_kewpie, if you change the date -f to date -d. The error from date -f introduced a newline when I quickly tried it. So you both gave me a working solution, and I had been dancing around both without getting either exactly right. Thanks to both of you for such quick responses!
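
For reference, GNU date's -f option reads dates from a file, one per line, while -d takes the date string itself, which is why the swap matters. A minimal sketch of the conversion step, assuming GNU date and a timestamp whose date and time are separated by a space as in the sample data; the function name to_epoch is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: convert the date part of a line such as
# "16-JUL-2006 17:41:00.00,111.42,..." to Unix time (midnight, local time zone).
# Assumes GNU date, which accepts dates like 16-JUL-2006 with -d.
to_epoch() {
    local stamp=${1%% *}        # keep everything before the first space
    date -d "$stamp" +%s        # -d parses the string; quotes keep it one argument
}
```

Inside the loop this could be called as `to_epoch "$(tail -n 1 "$f")"`. Taking the text before the first space, rather than a fixed 12 characters, means the result does not depend on whether the day has one or two digits.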


All times are GMT -5.