Linux - Newbie (LinuxQuestions.org)
05-08-2010, 11:08 AM
#1
LQ Newbie
Registered: May 2009
Posts: 3
Using sed to remove lines with duplicate IDs but different endings
I have a file whose lines represent the nodes of a polyline, but I only need the first point in each segment. Given the following text:
0,"013A",0.57,260739.891,4379258.87
0,"013A",0.57,260737.674,4379258.94
0,"013A",0.57,260684.628,4379258.35
1,"013A",0.545,260769.915,4379257.84
1,"013A",0.545,260739.891,4379258.87
2,"013A",1.059,259567.126,4379293.16
2,"013A",1.059,259562.637,4379302.59
2,"013A",1.059,259534.423,4379337.52
2,"013A",1.059,259460.853,4379414.3
3,"013A",1.036,259574.096,4379278.51
3,"013A",1.036,259567.126,4379293.16
4,"013A",1,259580.147,4379253.83
4,"013A",1,259574.415,4379277.84
4,"013A",1,259574.096,4379278.51
5,"013A",0.98,259581.802,4379185.53
5,"013A",0.98,259580.147,4379253.83
I would like to have this as output:
0,"013A",0.57,260739.891,4379258.87
1,"013A",0.545,260769.915,4379257.84
2,"013A",1.059,259567.126,4379293.16
3,"013A",1.036,259574.096,4379278.51
4,"013A",1,259580.147,4379253.83
5,"013A",0.98,259581.802,4379185.53
I've tried combinations of uniq, awk, and sed, but I am stumped. I'm sure I'm too close to the problem to see the simple solution.
The problem with uniq is that the last two columns differ between lines with the same ID. I don't care about the x/y of any point after the first one in each segment.
Any assistance would be greatly appreciated.
~wapitismith~
05-08-2010, 11:33 AM
#2
Moderator
Registered: Sep 2003
Location: Bologna
Distribution: OpenSUSE 12.1 CentOS 6.2
Posts: 9,003
What about this?
Code:
awk -F, '{ if ( ! ( $1$2$3 in _ )) _[$1$2$3] = $0 } END { for ( i in _ ) print _[i] }' file | sort
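A note on how the one-liner above works: it stores the first full line seen for each combination of the first three fields, then prints everything at the end. Because `for ( i in _ )` visits array elements in an unspecified order, the output is piped through sort to restore the ordering. The sketch below is a variation under a few assumptions of mine, not the poster's exact code: the wrapper function `first_per_key` and the array name `first` are hypothetical, the key uses a comma subscript so that adjacent fields cannot run together (plain concatenation would treat "1" followed by "23" the same as "12" followed by "3"), and the sort is numeric on the first field so that an ID of 10 would come after 2 rather than before it.

```shell
# Hypothetical helper: keep the first line for each (field1, field2, field3) key.
first_per_key() {
    awk -F, '
        # Store a line only if its key has not been seen yet.
        !(($1, $2, $3) in first) { first[$1, $2, $3] = $0 }
        # Array traversal order is unspecified, so ordering is fixed afterwards.
        END { for (k in first) print first[k] }
    ' "$@" | sort -t, -k1,1n    # numeric sort on the ID column
}

# Usage with a few of the sample lines from the first post.
printf '%s\n' \
    '0,"013A",0.57,260739.891,4379258.87' \
    '0,"013A",0.57,260737.674,4379258.94' \
    '1,"013A",0.545,260769.915,4379257.84' \
    '1,"013A",0.545,260739.891,4379258.87' | first_per_key
```

Called with a file name instead of a pipe (`first_per_key file`), it reads that file, as the original command does.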
05-08-2010, 11:36 AM
#3
Guru
Registered: Sep 2009
Location: Perth
Distribution: Mint
Posts: 5,402
Or maybe:
Code:
awk -F, '!_[$1]++' file
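For anyone puzzled by the idiom: `!_[$1]++` is a pattern with no action, so awk prints a line whenever the pattern is true. The array `_` counts how many times each first field has been seen. The post-increment yields the old count, which is 0 the first time an ID appears, so the negation is true only for the first line of each ID. Lines are printed as they are read, so the input order is kept and no sort is needed. A longer equivalent, as a sketch: the function name `first_per_id` and the array name `seen` are mine, chosen for readability, and do not come from the post.

```shell
# Hypothetical helper: expanded form of  awk -F, '!_[$1]++'
first_per_id() {
    awk -F, '{
        if (seen[$1] == 0)   # an unset element compares equal to 0: first time this ID appears
            print            # keep the line
        seen[$1]++           # count the ID so later lines with it are skipped
    }' "$@"
}

# Usage with a few of the sample lines from the first post.
printf '%s\n' \
    '0,"013A",0.57,260739.891,4379258.87' \
    '0,"013A",0.57,260737.674,4379258.94' \
    '1,"013A",0.545,260769.915,4379257.84' \
    '1,"013A",0.545,260739.891,4379258.87' | first_per_id
```

On this input it prints the first and third lines, one per ID.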
05-08-2010, 11:40 AM
#4
Moderator
Registered: Sep 2003
Location: Bologna
Distribution: OpenSUSE 12.1 CentOS 6.2
Posts: 9,003
Great, grail!
05-08-2010, 12:30 PM
#5
LQ Newbie
Registered: May 2009
Posts: 3
Original Poster
Incredibly easy! I knew I was missing something simple. Very nice responses!
Thanks!