LinuxQuestions.org

rakhighosh 01-09-2018 10:36 PM

calculation of average from a csv file using shell script
 
I have a csv file containing values of four different parameters. There are many values for each day, and I want to find the average of those values using a shell script.
My csv file looks like this:

year month date time rainfall(cm)
2012 12 5 10.00 12
2012 12 5 12.00 14
2013 04 3 10.00 16
2013 04 3 12.00 17
I want to calculate the average rainfall for a single day. How do I write a shell script for this?

grail 01-09-2018 11:45 PM

Welcome to LQ :)

Please show what you have attempted so far.

astrogeek 01-10-2018 12:22 AM

Agreed, it would be nice to see your efforts to this point.

I think a simple awk script would be an easy solution, and it would also provide decimal average values, which integer-only shell arithmetic will not do.
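
To illustrate the point about decimal averages: arithmetic expansion in a shell such as bash works on integers only, so the fractional part of an average is dropped, while awk does its arithmetic in floating point. A minimal Python sketch of the difference, using the two 2013 readings from the sample data (the variable names are only for illustration):

```python
# Rainfall readings for 2013 04 3, taken from the sample data in the first post
readings = [16, 17]
total = sum(readings)
count = len(readings)

# Integer division, comparable to shell $(( total / count )): the fraction is dropped
integer_average = total // count

# Floating-point division keeps the fractional part, as awk does
decimal_average = total / count

print(integer_average)  # 16
print(decimal_average)  # 16.5
```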

By way of encouragement, I wrote a quick and dirty awk script that implements the following pseudo-code (it assumes dates are not interleaved and prints the average for each date):

Code:

BEGIN {
      #...initialize date match, count and total vars
}
Same date{
      #Increment count, add amount to total
      next
}
Count > 0{
      #Change of date so print last date and average, reset date match, count and total vars
      next
}
{
      #First line, set initial date match, increment count, set total to amount
}
END{
      #Last line, print last date and average
}

When run with your example data it produces this...

Code:

awk -f rain.awk rain_data
Average: 2012 12 5 13
Average: 2013 04 3 16.5

I leave the actual code as an exercise for you! (Pretty simple!)
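
For comparison only, and not a substitute for the awk exercise, the same streaming logic can be sketched in Python. This is a rough sketch under a few assumptions: columns are whitespace-separated as in the sample, the first row is a header, and rows for one date are contiguous. The function name daily_averages and the inline sample string are made up for illustration.

```python
def daily_averages(lines):
    """Yield (date, average) for each run of consecutive rows sharing a date.

    Assumes whitespace-separated columns: year month date time rainfall,
    with a header row first and dates not interleaved.
    """
    current_date = None  # date of the run being accumulated
    count = 0
    total = 0.0
    for line in lines:
        fields = line.split()
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # skip the header and blank or malformed lines
        date = " ".join(fields[:3])
        amount = float(fields[4])
        if date != current_date:
            if count > 0:
                # Change of date: report the finished run
                yield current_date, total / count
            current_date, count, total = date, 0, 0.0
        count += 1
        total += amount
    if count > 0:
        # End of input: report the last run
        yield current_date, total / count


# Sample data copied from the first post
sample = """\
year month date time rainfall(cm)
2012 12 5 10.00 12
2012 12 5 12.00 14
2013 04 3 10.00 16
2013 04 3 12.00 17
"""

for date, average in daily_averages(sample.splitlines()):
    print(f"Average: {date} {average:g}")
```

Like the pseudo-code, this keeps only one running count and total in memory, so it would report the same date twice if its rows were interleaved with another date's.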

Sefyir 01-10-2018 01:09 AM

If you are interested in interpreting large amounts of data, you can try using pandas with Python:
https://pandas.pydata.org/

Code:

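import pandas as pd

# Assumes a comma-separated file; for whitespace-separated data like the
# sample in the first post, pass sep=r'\s+' to read_csv.
data = pd.read_csv('rainfall.csv')

# Drop the time column so that only rainfall is averaged for each day
data_cleaned = data.drop(['time'], axis=1)
print(data_cleaned.groupby(['year', 'month', 'date']).mean())

                rainfall(cm)
year month date
2012 12    5            13.0
2013 4     3            16.5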
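
If pandas is not installed, a similar grouping can be sketched with the standard library alone. This is only a sketch, assuming a comma-separated file with the header names shown in the first post. The function name mean_rainfall_by_day and the inline sample are hypothetical. Because it keys a dictionary on the date, it also copes with interleaved dates.

```python
import csv
import io
from collections import defaultdict


def mean_rainfall_by_day(csv_file):
    """Return {(year, month, date): mean rainfall} from a comma-separated file object."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, row count]
    for row in csv.DictReader(csv_file):
        key = (int(row["year"]), int(row["month"]), int(row["date"]))
        sums[key][0] += float(row["rainfall(cm)"])
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical comma-separated version of the sample data, rows deliberately interleaved
sample = io.StringIO(
    "year,month,date,time,rainfall(cm)\n"
    "2012,12,5,10.00,12\n"
    "2013,04,3,10.00,16\n"
    "2012,12,5,12.00,14\n"
    "2013,04,3,12.00,17\n"
)

for (year, month, date), mean in sorted(mean_rainfall_by_day(sample).items()):
    print(year, month, date, mean)
```

With a real file, the call would look like mean_rainfall_by_day(open('rainfall.csv', newline='')).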


syg00 01-10-2018 01:36 AM

That's pretty impressive ... ;)

ISTR Sefyir pushing this barrow before.
I keep trying to convince myself there must be a reason to learn python ... maybe there is.


All times are GMT -5.