awk comparing 2 rows and counting
Hi. I have a file that looks like this:
Code:
1 1 2 3
2 1 3 2
3 1 2 3
4 2 1 3
and so on. Column 1 is the time and the rest are temperatures that may swap. I'd like to compare columns 2 to 3 of line 1 with line 2. If the value is the same, count 0; if it differs, count 1. Then compare lines 2 and 3, and so on until the end. In the end I want to know how many times the value in each of columns 2-4 changed from one line to the next. I'd like to use an awk script, but I have no clue how to write it.
Code:
awk '{current = $NF;getline; if($NF == current)print "match";else print "mismatch"}' file
I found this in another thread. It compares the lines and tells me whether each pair is a match or a mismatch. Instead of that output, I think I'd like a count at the end of how many mismatches there have been. Thanks for the help :D
You'll have to "remember" the values from the last line in each iteration. So, for each line:
1) compare the values stored from the last iteration to the values from the current line.
2) if the numbers are different, increment the counter.
3) copy the values from the current line to the variables so you can access them in the next iteration.
You will also have to think about how to treat the first line.
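A minimal sketch of those three steps, assuming the four sample lines from the question and looking only at column 2. The names prev and count are made up for illustration, and NR > 1 is just one possible way to skip the comparison on the first line.

```shell
# Sample data from the question, fed to awk on standard input.
result=$(printf '1 1 2 3\n2 1 3 2\n3 1 2 3\n4 2 1 3\n' | awk '
    NR > 1 && $2 != prev { count++ }   # steps 1 and 2: compare with the stored value, count a change
    { prev = $2 }                      # step 3: remember the current value for the next line
    END { print count + 0 }            # + 0 prints 0 rather than an empty line when nothing changed
')
echo "$result"
```

For the sample data, column 2 reads 1, 1, 1, 2, so this prints 1.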
OK, sorry for the German. What I said was: I think maybe like this?
Code:
awk ' {current=$NF ; getline ; if ($NF != current)} print "++; else 0"
Though the 0 could mean that it isn't adding anything, just writing a zero when there is no change. The first row can't be compared to anything before it, so I guess its value should be 0? I've never programmed anything before, so I'm kind of confused :/ Maybe more like this?
Code:
awk '{current=$NF ; getline ; if($NF!=current{count++})} print "count"
OK ... so putting that into Google Translate helped a little :)
So getline is not needed at all. NR is the number of the current line, so you can use it to tell whether you are on line 1 or elsewhere. To print something once you have finished reading your data, look into the END{} clause. Here is the link to the online manual, which I recommend reading: http://www.gnu.org/software/gawk/man...ode/index.html Read over millgates' information again and use the page above; it should be fairly straightforward.
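A small sketch of those two pieces, NR and END, using the sample lines from the question as input (the printed labels are only for illustration):

```shell
result=$(printf '1 1 2 3\n2 1 3 2\n3 1 2 3\n4 2 1 3\n' | awk '
    NR == 1 { print "first line: " $0 }   # runs only for the first line of input
    END     { print "lines read: " NR }   # runs once, after the last line
')
echo "$result"
```

This prints "first line: 1 1 2 3" followed by "lines read: 4". The same two hooks are enough to skip the comparison on line 1 and to print a counter once at the end.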
As I understand it, current=$NF means that the line being read at the moment is stored in a variable called NF, so when the next line is read it can compare the new current line with the line before. Then I have to tell it to actually compare with $NF!=current and somehow tell it to count +1 if that is true. ++ is the same as +1, right? Then I want to keep track of the total count and get the total number printed. I don't need to know which lines matched and which didn't; I only need a total count. It would also be nice if this could be done in one step for every column, so I get a count for each column. To be honest, I understood what needed to be done even before I knew what awk was, BUT the manual doesn't seem to say anything about counts (not that I saw, anyway). You have to understand I have never programmed anything before. So aside from the getline, how should my idea be modified? At the moment it looks like this:
Code:
awk '{current=$NF , if($NF != current {count++})} END{print count}'
But this still doesn't answer how the first line should be treated, nor whether it works for each column individually.
Quote:
So in the example, current is being set to the value of whatever is stored in the last field. So in your example data, the first line would store the number 3 in current, as it is the last field.
Quote:
Code:
count++
Assuming your syntax was right, which it currently is not, if you issue the following:
Code:
awk 'NR > 1 && $NF != current{count++}{current = $NF}NR == 1{next}END{print count}' file
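To make the point about $NF concrete, here is a rough sketch that assumes the four sample lines from the question. It first prints NF and $NF for every line, then runs the command above on the same data; the shell variable names data, last_fields and changes are only for illustration.

```shell
data='1 1 2 3
2 1 3 2
3 1 2 3
4 2 1 3'

# NF is the number of fields on the line, $NF is the content of the last one.
last_fields=$(printf '%s\n' "$data" | awk '{ print NF, $NF }')
echo "$last_fields"

# The command from the post above, reading standard input instead of a file.
changes=$(printf '%s\n' "$data" |
    awk 'NR > 1 && $NF != current{count++}{current = $NF}NR == 1{next}END{print count}')
echo "$changes"
```

Every line has 4 fields, and the last field reads 3, 2, 3, 3, so the count printed at the end is 2.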
My tutor pointed out I should just try things step by step and gave me the same hint as you about NF (>.<)
So I changed it slowly by trial and error and came to
Code:
awk 'BEGIN {count=0;var=0}{if ($2!=var) count++; var=$2} END {print count-1}' input
and it worked, yay. My head hurts. So thank you all very much for your patience :D
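One caveat with the count-1 trick: it relies on the first value in column 2 not being 0, because var starts at 0. A possible variation, sketched here under the assumption that the input looks like the sample lines from the question, skips the comparison on the first line instead and counts every column from 2 onwards in one pass. The array names prev and changes and the variable cols are made up for this example.

```shell
result=$(printf '1 1 2 3\n2 1 3 2\n3 1 2 3\n4 2 1 3\n' | awk '
    NR > 1 {                                  # nothing to compare against on the first line
        for (i = 2; i <= NF; i++)
            if ($i != prev[i]) changes[i]++   # value differs from the line before
    }
    {
        for (i = 2; i <= NF; i++) prev[i] = $i   # remember this line for the next one
        if (NF > cols) cols = NF                 # widest line seen so far
    }
    END {
        for (i = 2; i <= cols; i++)
            printf "column %d: %d\n", i, changes[i]
    }
')
echo "$result"
```

For the sample data this reports 1 change in column 2, 3 in column 3 and 2 in column 4.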