LinuxQuestions.org


oliviaxinw 10-04-2012 06:49 PM

Reading difference
 
Hi,

I have a file that looks like the sample below.
Basically, in column 1, for every pair of lines, the number before the "/" is the same.
I want to pull out all the lines where column 2 differs between the two lines of a pair.
For example, for the sample below, I would want an output file containing the last two lines, because 258975 and 265897 are two different numbers for 254879/1 and 254879/2. What code should I use?

Column1 Column2
123456/1 123456
123456/2 123456
235648/1 234567
235648/2 234567
254879/1 258975
254879/2 265897
.... .....

Thank you.

kabamaru 10-05-2012 03:05 PM

Code:

awk -F / '{ print $1, $2 }' FILENAME | uniq -f2 -u | awk ' { print $1 "/" $2, $3 }'

kabamaru 10-05-2012 03:40 PM

And to explain this...

awk -F / '{ print $1, $2 }' FILENAME uses '/' as the field separator, so the part before the slash and the rest of the line are printed separated by a space:

Code:

123456 1 123456
123456 2 123456
235648 1 234567
235648 2 234567
254879 1 258975
254879 2 265897

Then uniq -f2 -u skips the first 2 fields when comparing and prints only the lines that have no adjacent duplicate:

Code:

254879 1 258975
254879 2 265897

And awk ' { print $1 "/" $2, $3 }' restores the '/1' or '/2' part:

Code:

254879/1 258975
254879/2 265897


kabamaru 10-05-2012 04:41 PM

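Hmm, now that I think of it, this won't work as expected if the second line of one pair and the first line of the next pair have the same number in the second column, because uniq compares adjacent lines no matter which pair they belong to...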
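
To make that concrete, here is a small sketch on made-up data (the values and the temporary file are only for illustration, not taken from the original file). The second line of the first pair and the first line of the next pair share a column-2 value, so uniq treats them as duplicates:

```shell
# Hypothetical sample: pair 111111 ends with 222 and pair 333333 starts with 222.
sample=$(mktemp)
cat > "$sample" <<'EOF'
111111/1 111
111111/2 222
333333/1 222
333333/2 333
EOF

# The first pipeline from this thread, run on the sample.
first=$(awk -F / '{ print $1, $2 }' "$sample" |
  uniq -f2 -u |
  awk '{ print $1 "/" $2, $3 }')

printf '%s\n' "$first"
# prints:
# 111111/1 111
# 333333/2 333

rm -f "$sample"
```

Both pairs differ in column 2, so all four lines should have been printed; the two middle lines were dropped instead.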

If the '/1' or '/2' part doesn't need to be displayed, you can use this command:

Code:

sed 's/\/[12]//' FILENAME | uniq -u
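
As a quick check, this sketch runs that command on the sample from the question (written to a temporary file here, which is just an assumption for the demo):

```shell
# The sample lines from the question, without the header line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
123456/1 123456
123456/2 123456
235648/1 234567
235648/2 234567
254879/1 258975
254879/2 265897
EOF

# Strip the /1 or /2 suffix, then keep only lines with no adjacent duplicate.
stripped=$(sed 's/\/[12]//' "$sample" | uniq -u)

printf '%s\n' "$stripped"
# prints:
# 254879 258975
# 254879 265897

rm -f "$sample"
```

Once the suffix is gone, the two lines of a matching pair are identical, so uniq -u drops them both.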

kabamaru 10-05-2012 05:03 PM

Ok, found one that will do the job:

Code:

sed 's/\([0-9]*\)\(\/[12]\) \([0-9]*\)/\2 \1 \3/' FILENAME \
  | uniq -f1 -u \
  | awk '{ print $2 $1, $3 }'
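
A sketch of how this behaves on made-up data (values and temporary file are hypothetical): the sed step moves the '/1' or '/2' part to the front so that uniq -f1 can skip it while still comparing both numbers.

```shell
# Hypothetical sample: two differing pairs that share the value 222 across
# the pair boundary, plus one pair whose column 2 matches.
sample=$(mktemp)
cat > "$sample" <<'EOF'
111111/1 111
111111/2 222
333333/1 222
333333/2 333
555555/1 444
555555/2 444
EOF

# After sed, a line looks like "/1 111111 111"; uniq -f1 skips the "/1" field
# and compares the ID together with column 2.
result=$(sed 's/\([0-9]*\)\(\/[12]\) \([0-9]*\)/\2 \1 \3/' "$sample" |
  uniq -f1 -u |
  awk '{ print $2 $1, $3 }')

printf '%s\n' "$result"
# prints:
# 111111/1 111
# 111111/2 222
# 333333/1 222
# 333333/2 333

rm -f "$sample"
```

Because the ID is part of the comparison, lines from different pairs no longer count as duplicates of each other.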


David the H. 10-07-2012 06:09 PM

Code:

awk 'NR%2==1 {a=$2;b=$0} ; NR%2==0 && $2!=a {print b"\n"$0}' infile

On odd-numbered lines, the second field is stored in variable a, and the whole line in variable b.

On even-numbered lines, the second field is compared to the stored value (a), and if it differs, the stored line (b) and the current line are printed.
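
If the real file starts with a header line such as "Column1 Column2" (it is not clear from the question whether it does), the odd/even numbering would be shifted by one. A possible variation, only a sketch with variable names of my own choosing, pairs lines by the number before the slash instead of by line number:

```shell
# Hypothetical sample including a header line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Column1 Column2
123456/1 123456
123456/2 123456
254879/1 258975
254879/2 265897
EOF

# Lines whose first field has no "/" (such as the header) are skipped.
# When the ID matches the previous data line but column 2 differs,
# the previous line and the current line are printed.
pairs=$(awk '
  index($1, "/") == 0 { next }
  {
    split($1, part, "/")
    if ((part[1] "") == (prev_id "") && $2 != prev_val)
      print prev_line "\n" $0
    prev_id = part[1]; prev_val = $2; prev_line = $0
  }
' "$sample")

printf '%s\n' "$pairs"
# prints:
# 254879/1 258975
# 254879/2 265897

rm -f "$sample"
```

This still assumes the two lines of a pair are adjacent in the file.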


All times are GMT -5.