LinuxQuestions.org
Old 10-08-2013, 11:31 PM   #1
tabbygirl1990
Member
 
Registered: Jul 2013
Location: a warm beach, cool ocean breeze, nice waves, and a Margaritta
Distribution: RHEL 5.5 Tikanga
Posts: 63

Rep: Reputation: 1
two line averaging


hi guys,

i have output data from one code that needs to get read into another, but the second code can't accept as high a data rate, so i need to average every two lines. there are seven columns in the file and there can be 100 or more rows, but the number of rows will always be even, and the second column can be ignored because it's a constant. here's an output to input example:

the output of the first code would look like

line1 42 a b c d e
line2 42 aa bb cc dd ee
line3 42 aaa bbb ccc ddd eee
line4 42 aaaa bbbb cccc dddd eeee
.
.
.


so with the above example the input file to the next code would look like

line1 42 ave(a,aa) ave(b,bb) ave(c,cc) ave(d,dd) ave(e,ee)
line2 42 ave(aaa,aaaa) ave(bbb,bbbb) ave(ccc,cccc) ave(ddd,dddd) ave(eee,eeee)
.
.
.

tabby
 
Old 10-09-2013, 12:08 AM   #2
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian Jessie / sid
Posts: 1,471

Rep: Reputation: 444
I have an idea of how I would do it

what have you tried?

what OS ? ( I see a Mac badge )
 
Old 10-09-2013, 10:40 AM   #3
tabbygirl1990
Member
 
Registered: Jul 2013
Location: a warm beach, cool ocean breeze, nice waves, and a Margaritta
Distribution: RHEL 5.5 Tikanga
Posts: 63

Original Poster
Rep: Reputation: 1
good morning Firerat (of the Princess Bride Swamp Firerats? i love that movie!)

i couldn't sleep laying in bed last night so i typed it from my mac laptop. my desktop is RHEL 5.5 Tikanga

so far i have:

Code:
#!/usr/bin/awk -f

# max, min, and average of column 5
NR == 1 {
        max = $5
        min = $5
        }
        {
        if ($5 > max) max = $5
        if ($5 < min) min = $5
        sum += $5
        ++n
        }
END     {
        print "max = " max
        print "min = " min
        if (n > 0) print "average = " sum/n
        }
but this only runs on the 5th column and i haven't figured out how to extend the averaging to all columns

and

Code:
awk '{sum+=$5; ++n} END {if (n) print "average = " sum/n}' output_file.dat
but this only runs on the 5th column and i haven't figured out how to extend the averaging to all columns

thanks for your help!!!

tabby

Last edited by tabbygirl1990; 10-09-2013 at 10:42 AM.
 
Old 10-09-2013, 11:17 AM   #4
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian Jessie / sid
Posts: 1,471

Rep: Reputation: 444
do you have a better example of the inputs?

can you give 10 lines, and also show the result you expect


for instance, do you want the average of all the rows for each column

input
Code:
Line1 1
Line2 2
Line3 3
Line4 4
Code:
Line1 1
Line2 2
Line3 3
Line4 4
mean  2.5
or ..
Code:
Line1 1
Line2 2
mean  1.5
Line3 3
Line4 4
mean  3.5
 
Old 10-09-2013, 11:32 AM   #5
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 5,026

Rep: Reputation: 845
the original poster made a good faith effort so even if this is homework i think this guidance won't be cheating (although i have no idea where the variable n above comes from -- i get division by 0 errors when i try to run it as is because it is never defined).
here's my stab at it... (i did the first 2 fields only because i got bored -- season to taste):
Code:
[schneidz@hyper ~]$ cat tabbygirl1990.txt
1 2 3 4 5 6 7
7 6 5 4 3 2 1
100 200 300 400 500 600 700
10 20 30 40 50 60 70
5 10 15 20 25 30 35G
0 1 1 2 3 5 8
[schneidz@hyper ~]$ awk 'NR % 2 == 0 {sum1+=$1;sum2+=$2} NR %2 == 1 {sum1=$1;sum2=$2}  NR % 2 == 0 {print "average-1 = " sum1/2 " -- average-2 = " sum2/2 }' tabbygirl1990.txt
average-1 = 4 -- average-2 = 4
average-1 = 55 -- average-2 = 110
average-1 = 2.5 -- average-2 = 5.5

Last edited by schneidz; 10-09-2013 at 11:34 AM.
 
Old 10-09-2013, 11:56 AM   #6
tabbygirl1990
Member
 
Registered: Jul 2013
Location: a warm beach, cool ocean breeze, nice waves, and a Margaritta
Distribution: RHEL 5.5 Tikanga
Posts: 63

Original Poster
Rep: Reputation: 1
here's the innie

Code:
1	42	0.19796486	0.362090835	0.354344909	0.856582877	0.735671789
2	42	0.025016951	0.12691389	0.210235925	0.417773321	0.091685902
3	42	0.610085038	0.445050311	0.756565733	0.180007685	0.216628711
4	42	0.458264832	0.359423811	0.488949963	0.073800802	0.091902447
5	42	0.268522443	0.648344889	0.983886158	0.436349095	0.949035235
6	42	0.176264501	0.059806075	0.860509502	0.488146158	0.240509861
7	42	0.89882842	0.004340198	0.959885061	0.083707755	0.636907775
8	42	0.175407396	0.752946341	0.037497858	0.738027088	0.59901326
9	42	0.929893486	0.110036987	0.109945346	0.788329303	0.303932011
10	42	0.788359742	0.356803805	0.954558374	0.93942156	0.474722704
and the outie

Code:
1	42	ave(col3,row1&row2)	ave(col4,row1&row2)	ave(col5,row1&row2)	ave(col6,row1&row2)	ave(col7,row1&row2)
2	42	ave(col3,row3&row4)	ave(col4,row3&row4)	ave(col5,row3&row4)	ave(col6,row3&row4)	ave(col7,row3&row4)
3	42	ave(col3,row5&row6)	ave(col4,row5&row6)	ave(col5,row5&row6)	ave(col6,row5&row6)	ave(col7,row5&row6)
4	42	ave(col3,row7&row8)	ave(col4,row7&row8)	ave(col5,row7&row8)	ave(col6,row7&row8)	ave(col7,row7&row8)
5	42	ave(col3,row9&row10)	ave(col4,row9&row10)	ave(col5,row9&row10)	ave(col6,row9&row10)    ave(col7,row9&row10)
thanks soooo much firerat!!!

tabby
 
Old 10-09-2013, 12:09 PM   #7
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
Maybe you need something like this:
Code:
#!/usr/bin/awk -f
NR % 2 {
  for (i=3;i<=NF;i++)
    _[i]=$i
  getline
  printf "%d\t%d", ++c, $2
  for (i=3;i<=NF;i++)
    printf "\t%f", ($i+_[i])/2
  print ""
}
 
Old 10-09-2013, 01:05 PM   #8
tabbygirl1990
Member
 
Registered: Jul 2013
Location: a warm beach, cool ocean breeze, nice waves, and a Margaritta
Distribution: RHEL 5.5 Tikanga
Posts: 63

Original Poster
Rep: Reputation: 1
hi schneidz,

i tried your awk command and i got syntax errors, and i don't understand what's
Code:
 --
so i modified it to

Code:
awk 'NR % 2 == 0 {sum1+=$1;sum2+=$2} NR %2 == 1 {sum1=$1;sum2=$2}  NR % 2 == 0 {print "average-1 = " sum1/2} {print "average-2 = " sum2/2}' tabbygirl1990.dat
what i got out was

Code:
average-2 = 21
average-1 = 1.5
average-2 = 42
average-2 = 63
average-1 = 5
average-2 = 84
average-2 = 105
average-1 = 10.5
average-2 = 126
average-2 = 147
average-1 = 18
average-2 = 168
average-2 = 189
average-1 = 27.5
average-2 = 210
which is yesterday's makeup
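
A possible fix, sketched on the assumption that the goal is still one set of averages per pair of rows: the `--` in schneidz's command is only literal text inside the print string, so it is not awk syntax at all. In the modified command the second `{print ...}` block has no pattern in front of it, so it runs on every row instead of only on the even ones. Keeping both prints inside the `NR % 2 == 0` action avoids that. The function name `pair_avg_two_fields` is made up for illustration.

```shell
# Hypothetical helper: average fields 1 and 2 over each pair of rows.
pair_avg_two_fields() {
    awk '
        NR % 2 == 1 { sum1 = $1; sum2 = $2 }    # odd row: start a new pair
        NR % 2 == 0 {                           # even row: finish the pair
            sum1 += $1; sum2 += $2
            print "average-1 = " sum1/2
            print "average-2 = " sum2/2
        }
    ' "$@"
}
```

It could be called as `pair_avg_two_fields tabbygirl1990.dat`, or fed from a pipe.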
 
Old 10-09-2013, 01:44 PM   #9
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 5,026

Rep: Reputation: 845
can you please copy-paste the command and the error you are getting ?
 
Old 10-09-2013, 02:07 PM   #10
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian Jessie / sid
Posts: 1,471

Rep: Reputation: 444
./Script.sh /path/to/tabbygirl1990.dat
Code:
#!/bin/bash
Input="$1"
tick=1
LineNo=1
while read -r Line;do
    case $tick in
        1)
           X=($Line)
           tick=2
        ;;
        2)
           Y=($Line)
           tick=1
           printf "%s" "Line${LineNo} ${X[1]}"
           for i in {2..6};do
               awk '{printf "\t%.11f",($1 + $2)/2}' <<< "${X[i]} ${Y[i]}"
           done
           printf '\n'
           LineNo=$(($LineNo+1))
        ;;
    esac
done < "$Input"
Input
Code:
1	42	0.19796486	0.362090835	0.354344909	0.856582877	0.735671789
2	42	0.025016951	0.12691389	0.210235925	0.417773321	0.091685902
3	42	0.610085038	0.445050311	0.756565733	0.180007685	0.216628711
4	42	0.458264832	0.359423811	0.488949963	0.073800802	0.091902447
5	42	0.268522443	0.648344889	0.983886158	0.436349095	0.949035235
6	42	0.176264501	0.059806075	0.860509502	0.488146158	0.240509861
7	42	0.89882842	0.004340198	0.959885061	0.083707755	0.636907775
8	42	0.175407396	0.752946341	0.037497858	0.738027088	0.59901326
9	42	0.929893486	0.110036987	0.109945346	0.788329303	0.303932011
10	42	0.788359742	0.356803805	0.954558374	0.93942156	0.474722704
Output
Code:
Line1 42	0.11149090550	0.24450236250	0.28229041700	0.63717809900	0.41367884550
Line2 42	0.53417493500	0.40223706100	0.62275784800	0.12690424350	0.15426557900
Line3 42	0.22239347200	0.35407548200	0.92219783000	0.46224762650	0.59477254800
Line4 42	0.53711790800	0.37864326950	0.49869145950	0.41086742150	0.61796051750
Line5 42	0.85912661400	0.23342039600	0.53225186000	0.86387543150	0.38932735750
But the numbers don't really make much sense:
0.19796486 and 0.025016951 are very different, yet they are averaged to a mean of 0.11149090550
 
Old 10-09-2013, 03:22 PM   #11
tabbygirl1990
Member
 
Registered: Jul 2013
Location: a warm beach, cool ocean breeze, nice waves, and a Margaritta
Distribution: RHEL 5.5 Tikanga
Posts: 63

Original Poster
Rep: Reputation: 1
colucix does the trick

can you explain what the _[i] is doing? i know the "i" is an iterator but the rest of it i haven't seen before

also there are two printf and one print statement, i understand what the last print is doing but what are the printf statments doing in this script, i mean i know what a printf is just not what they are doing in the script

thanks guys!!!

tabby
 
Old 10-10-2013, 04:07 AM   #12
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
1. The
Code:
_[i]
notation is simply the i-th element of the array _ (often I use a single underscore as variable name for brevity).

2. The first printf statement prints out the new line number, using C notation to increment the variable c by one before its value is used
Code:
++c
keep in mind that an uninitialized variable in awk has the value 0 when used as a number.

3. The second printf statement prints out the average of the i-th field, as per your requirement. It is the body of the second for loop, which is executed from the 3rd field to the last one.
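
To make the three points concrete, here is the same logic written out with longer names and comments. This is a sketch rather than the exact script from post #7: the array name `first`, the counter name `row`, and the shell wrapper `pair_average` are made up for illustration, and a check on the `getline` return value is added so that a file with an odd number of rows does not print a bogus last line.

```shell
# Hypothetical wrapper around an annotated version of the pair-averaging script.
pair_average() {
    awk '
        NR % 2 {                                # runs on rows 1, 3, 5, ...
            for (i = 3; i <= NF; i++)
                first[i] = $i                   # remember fields 3..NF of the odd row
            if ((getline) <= 0)                 # load the next (even) row into $0
                exit                            # no partner row: stop
            printf "%d\t%d", ++row, $2          # row starts at 0; ++row adds 1, then prints it
            for (i = 3; i <= NF; i++)
                printf "\t%f", ($i + first[i]) / 2   # appended to the same output line
            print ""                            # finish the output line
        }
    ' "$@"
}
```

Because `getline` consumes the even row inside the action, the `NR % 2` pattern only ever sees the odd rows.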
 
Old 10-10-2013, 10:40 AM   #13
tabbygirl1990
Member
 
Registered: Jul 2013
Location: a warm beach, cool ocean breeze, nice waves, and a Margaritta
Distribution: RHEL 5.5 Tikanga
Posts: 63

Original Poster
Rep: Reputation: 1
thanks!

i thought that the _[] was some kind of special character/operator on i

i know that ++ is standard C notation for iterate over (although i'm not sure of the diff between ++i or i++ i'll try to find out), but i hadn't seen the little c before so I didn't know that it is a variable

so are the printf statements kind of like storing/holding the data in the for loops?

thanks soooo much,

tabby
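
On the two open points: `++c` adds one first and then yields the new value, while `c++` yields the old value and adds one afterwards. And `printf` does not store anything. It writes straight to the output, just without a newline unless the format contains `\n`, which is why several `printf` calls in a row can build up a single line. A tiny sketch (the function name `increment_demo` is made up for illustration):

```shell
# Hypothetical demo of pre/post increment and of printf building one line.
increment_demo() {
    awk 'BEGIN {
        a = 0; b = 0
        printf "%d", ++a      # pre-increment: a becomes 1, then 1 is printed
        printf " %d", b++     # post-increment: 0 is printed, then b becomes 1
        printf " %d\n", b     # prints 1 and only now ends the line
    }'
}
```

Running `increment_demo` prints `1 0 1` on one line.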
 
Old 10-15-2013, 11:54 AM   #14
tabbygirl1990
Member
 
Registered: Jul 2013
Location: a warm beach, cool ocean breeze, nice waves, and a Margaritta
Distribution: RHEL 5.5 Tikanga
Posts: 63

Original Poster
Rep: Reputation: 1
back again guys,

now i'd like to create averaged lines of data from their nearest neighbors, so like here's an input file
Code:
1	42	0.0	0.4	0.3	0.8	0.7
2	42	0.3	0.1	0.2	0.4	0.1
in the output file the newline of averaged number is line #2, line #1 is the same as the input, and line #3 is line #2 from above, if this thing was run on a 100line file, i think it would give back 198 lines, right?

Code:
1	42	0.0	0.4	0.3	0.8	0.7
2       42      0.15    0.25    0.25    0.6     0.4
3	42	0.3	0.1	0.2	0.4	0.1
ok so here i don't know how to create the new in-between line? and how do i "store" the memory of the two nearest neighbors lines so i can average them? when i know how to do those two things, i should hope i could write it???

tabby

Last edited by tabbygirl1990; 10-15-2013 at 12:12 PM.
 
Old 10-15-2013, 12:24 PM   #15
tabbygirl1990
Member
 
Registered: Jul 2013
Location: a warm beach, cool ocean breeze, nice waves, and a Margaritta
Distribution: RHEL 5.5 Tikanga
Posts: 63

Original Poster
Rep: Reputation: 1
maybe a better example input file cause it has more lines

Code:
1	42	0.0	0.4	0.3	0.8	0.7
2	42	0.3	0.1	0.2	0.4	0.1
3       42      0.0     0.1     0.0     0.2     0.4
4       42      0.7     0.1     0.0     0.0     0.8
5       42      0.3     0.2     0.3     0.8     0.1
in the output file the newline of averaged number is line #2, line #1 is the same as the input, and line #3 is line #2 from above, if this thing was run on a 100line file, i think it would give back 198 lines, right?


Code:
1	42	0.0	0.4	0.3	0.8	0.7
an odd index line
2	42	0.3	0.1	0.2	0.4	0.1
an odd index line
3       42      0.0     0.1     0.0     0.2     0.4
an odd index line
4       42      0.7     0.1     0.0     0.0     0.8
an odd index line
5       42      0.3     0.2     0.3     0.8     0.1
so my first thought would be to go through the file once creating the in-between lines, these would all be odd indexed lines, then some way reference the script only to work on odd indexed lines, right?

Last edited by tabbygirl1990; 10-15-2013 at 12:25 PM.
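
One possible approach, offered as a sketch since nobody in the thread has posted one: a single pass is enough, with no need to create placeholder lines first. Keep the fields of the previous row in an array, and when the next row arrives, print the in-between row built from the two, then the row itself. That inserts one new row per gap, so N input rows give 2N-1 output rows (199 for a 100-line file), and counting from 1 the new rows land on the even output lines. The names `prev`, `out`, and `insert_midpoints` are made up for illustration, and `%g` is just one choice of number format.

```shell
# Hypothetical helper: insert a row of midpoints between each pair of neighbours.
insert_midpoints() {
    awk '
        NR > 1 {                                # every row but the first has a left neighbour
            printf "%d\t%s", ++out, $2
            for (i = 3; i <= NF; i++)
                printf "\t%g", (prev[i] + $i) / 2
            print ""
        }
        {
            printf "%d\t%s", ++out, $2          # the row itself, renumbered
            for (i = 3; i <= NF; i++) {
                printf "\t%s", $i
                prev[i] = $i                    # remember it for the next gap
            }
            print ""
        }
    ' "$@"
}
```

Run on the two-line input from post #14, this gives the three-line output shown there, with the middle row `2 42 0.15 0.25 0.25 0.6 0.4`.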
 
  


All times are GMT -5.