LinuxQuestions.org

wapitismith 05-08-2010 11:08 AM

Using sed to remove lines with duplicate ID's, but different endings...
 
I have a file that contains lines representing the nodes of a polyline but I only need the first point in each segment. With the following text:

0,"013A",0.57,260739.891,4379258.87
0,"013A",0.57,260737.674,4379258.94
0,"013A",0.57,260684.628,4379258.35
1,"013A",0.545,260769.915,4379257.84
1,"013A",0.545,260739.891,4379258.87
2,"013A",1.059,259567.126,4379293.16
2,"013A",1.059,259562.637,4379302.59
2,"013A",1.059,259534.423,4379337.52
2,"013A",1.059,259460.853,4379414.3
3,"013A",1.036,259574.096,4379278.51
3,"013A",1.036,259567.126,4379293.16
4,"013A",1,259580.147,4379253.83
4,"013A",1,259574.415,4379277.84
4,"013A",1,259574.096,4379278.51
5,"013A",0.98,259581.802,4379185.53
5,"013A",0.98,259580.147,4379253.83

I would like to have this as output:

0,"013A",0.57,260739.891,4379258.87
1,"013A",0.545,260769.915,4379257.84
2,"013A",1.059,259567.126,4379293.16
3,"013A",1.036,259574.096,4379278.51
4,"013A",1,259580.147,4379253.83
5,"013A",0.98,259581.802,4379185.53

I've tried combinations of uniq, awk, and sed, but I am stumped. I'm sure I'm too close to the problem and can't see the simple solution.

The problem with uniq is that the last two columns differ between lines with the same ID, so the lines are never identical. I don't care about the x/y for any points after the first one.

Any assistance would be greatly appreciated.

~wapitismith~

colucix 05-08-2010 11:33 AM

What about this?
Code:

awk -F, '{ if ( ! ( $1$2$3 in _ )) _[$1$2$3] = $0 } END { for ( i in _ ) print _[i] }' file | sort
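
A note on the trailing sort: `for ( i in _ )` visits array elements in an unspecified order, so the sort is what restores the ordering. Plain `sort` compares lines as text, though, so an ID of 10 would land before 2. The following is only a sketch of one possible variant, assuming the ID in field 1 is always a non-negative integer. The array name `first`, the variable names, and the sample line with ID 10 are made up for illustration and are not from the original post.

```shell
# Hypothetical sample data: two lines from the post plus a made-up segment with ID 10.
input=$(mktemp)
cat > "$input" <<'EOF'
2,"013A",1.059,259567.126,4379293.16
2,"013A",1.059,259562.637,4379302.59
10,"013A",0.9,259581.802,4379185.53
10,"013A",0.9,259580.147,4379253.83
EOF

sorted=$(awk -F, '
{
    key = $1 FS $2 FS $3                   # join with the separator so adjacent fields cannot run together
    if (!(key in first)) first[key] = $0   # store only the first line seen for each key
}
END { for (k in first) print first[k] }' "$input" | sort -t, -k1,1n)   # numeric sort on field 1

printf '%s\n' "$sorted"
rm -f "$input"
```

With `-t, -k1,1n` the ID is compared as a number, so segment 2 is listed before segment 10.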

grail 05-08-2010 11:36 AM

Or maybe:
Code:

awk -F, '!_[$1]++' file
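
For anyone unfamiliar with the idiom: `!_[$1]++` is a pattern with no action, so awk prints the line whenever the pattern is true. `_[$1]` counts how often the first field has been seen so far. It is 0 the first time an ID appears, so the negation is true and the line is printed; the post-increment then marks the ID as seen. A more verbose sketch of the same logic follows. The array name `seen`, the variable names, and the temporary input file are illustrative choices, and the data is a subset of the lines from the original post.

```shell
# Sample input: a few lines taken from the original post.
input=$(mktemp)
cat > "$input" <<'EOF'
0,"013A",0.57,260739.891,4379258.87
0,"013A",0.57,260737.674,4379258.94
1,"013A",0.545,260769.915,4379257.84
1,"013A",0.545,260739.891,4379258.87
2,"013A",1.059,259567.126,4379293.16
EOF

# Print a line only the first time its ID (field 1) appears.
first_nodes=$(awk -F, '
{
    if (!($1 in seen)) {   # ID not encountered yet
        seen[$1] = 1       # remember it
        print              # keep this line; later lines with the same ID are skipped
    }
}' "$input")

# The one-liner from the post, for comparison.
one_liner=$(awk -F, '!_[$1]++' "$input")

printf '%s\n' "$first_nodes"
rm -f "$input"
```

Both forms keep the input order, so no sort is needed afterwards.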

colucix 05-08-2010 11:40 AM

Great, grail!

wapitismith 05-08-2010 12:30 PM

Incredibly easy! I knew I was missing something simple. Very nice responses!

Thanks!


All times are GMT -5.