Old 10-13-2012, 12:06 AM   #1
shivaa
Senior Member
 
Registered: Jul 2012
Location: Grenoble, Fr.
Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0
Posts: 1,797
Blog Entries: 4

Get only uniq content from a file


I have a very large log file containing more than 500,000 (5 lakh) entries of different host IPs. I want to get only the unique values from the list of IPs, but the uniq command is not helping me.
I used more <logfile> | uniq -u, but the output still contains repeated lines, i.e. the same IP addresses show up more than once. Can anybody help with this?
Thanks a lot.
 
Old 10-13-2012, 02:28 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Without knowing what the entries actually look like, it will be hard to point you to a working solution.

If using uniq doesn't solve the problem, then I have to assume that the lines are not identical. The IP addresses might be the same, but other information on the line differs. The first thing that comes to mind: is there a timestamp in those lines?

As I already mentioned, without more information (what the lines look like, which entries should or should not be shown, etc.) we cannot assist you.
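
If the lines do carry extra fields such as a timestamp, one possible approach is to cut each line down to the IP address before removing duplicates. This is only a sketch: the log format below is made up for illustration, and it assumes the IP address is the first whitespace-separated field.

```shell
# Hypothetical log lines: IP address first, then a timestamp (format assumed).
sample=$(printf '%s\n' \
  '10.199.1.2 2012-10-13T00:01:02' \
  '10.199.1.3 2012-10-13T00:01:05' \
  '10.199.1.2 2012-10-13T00:02:10')

# Keep only the first field of each line, then sort and drop duplicates.
ips=$(printf '%s\n' "$sample" | awk '{ print $1 }' | sort -u)

printf '%s\n' "$ips"
```

With a real file, the same pipeline would read from the log instead of the sample variable, and the field number would need adjusting to wherever the address actually sits.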
 
Old 10-13-2012, 02:54 AM   #3
shivaa

Original Poster
Suppose some hosts connect to my server, and the server maintains a log file that records the IP address of each host that connects to it. A host can connect many times, and each time the server writes its IP address to the log file. So although the log file contains many entries for the same IP, I want a list in which each IP appears only once. For example:
10.199.1.2
10.199.1.3
10.199.1.4
10.199.1.5
10.199.1.2
10.199.1.3
10.199.1.1
10.199.1.2
10.199.1.2
10.199.1.1

And I want:
10.199.1.2
10.199.1.3
10.199.1.4
10.199.1.5
10.199.1.1

But unfortunately, uniq is not helping me. Meanwhile, I have found a solution using the sort -u filter. If you know any other way, please suggest it.
 
Old 10-13-2012, 03:11 AM   #4
druuna

Here is why uniq doesn't work the way you expected. The following is from the uniq man page:
Quote:
Note: 'uniq' does not detect repeated lines unless they are adjacent.
You may want to sort the input first, or use `sort -u' without `uniq'.
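The following would have worked (note that it uses plain uniq; with -u, uniq prints only the lines that are never repeated, so every IP that occurs more than once would be dropped entirely):
Code:
sort logfile | uniq
But, as you already figured out, sort can do both, so the uniq part isn't needed: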
Code:
sort -u logfile
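
To make the difference between these variants concrete, here is a small sketch that runs them on the sample addresses posted earlier in this thread. The temporary file only stands in for the real log; the variable names are illustrative.

```shell
# Sample data from the thread, written to a temporary file.
logfile=$(mktemp)
printf '%s\n' 10.199.1.2 10.199.1.3 10.199.1.4 10.199.1.5 10.199.1.2 \
              10.199.1.3 10.199.1.1 10.199.1.2 10.199.1.2 10.199.1.1 > "$logfile"

# One copy of every distinct line.
distinct=$(sort -u "$logfile")

# Same result with sort followed by plain uniq.
distinct_uniq=$(sort "$logfile" | uniq)

# Only the lines that occur exactly once in the whole file.
singletons=$(sort "$logfile" | uniq -u)

printf 'distinct:\n%s\n\nsingletons:\n%s\n' "$distinct" "$singletons"
rm -f "$logfile"
```

On this data the distinct list holds all five addresses, while the singletons list holds only 10.199.1.4 and 10.199.1.5, because every other address appears more than once.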
 
Old 10-13-2012, 04:46 AM   #5
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,438

Or an awk alternative:
Code:
awk '!_[$0]++' file
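
For anyone puzzled by the one-liner, here is the same idea with a more descriptive array name (seen instead of _), run on the sample addresses from this thread. It is a sketch for illustration only; the variable names are made up.

```shell
# Sample data from the thread.
input=$(printf '%s\n' 10.199.1.2 10.199.1.3 10.199.1.4 10.199.1.5 10.199.1.2 \
                      10.199.1.3 10.199.1.1 10.199.1.2 10.199.1.2 10.199.1.1)

# seen[$0] counts how often the current line has appeared so far.
# The post-increment yields the old count, so !seen[$0]++ is true only
# the first time a line is met, and awk's default action prints that line.
first_seen=$(printf '%s\n' "$input" | awk '!seen[$0]++')

printf '%s\n' "$first_seen"
```

Unlike sort -u, this keeps the lines in the order of their first appearance and needs no sorting, but awk has to hold every distinct line in memory.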
 
Old 10-13-2012, 08:46 AM   #6
shivaa

Original Poster
Awk is magical, it always works!!
Thanks everyone. It's solved.
 
  

