LinuxQuestions.org
Old 11-01-2011, 09:47 AM   #1
Thaidog
Member
 
Registered: Sep 2002
Location: Hilton Head, SC
Distribution: Gentoo
Posts: 637

Rep: Reputation: 32
Need a shell script to filter out unique hostnames from two text files.


I have two huge lists of server names, and I need to find the servers that appear in only one of them. The lists are jumbled and the hostnames are not in order, so a diff will not work by itself... any ideas on how best to tackle this?
 
Old 11-01-2011, 10:07 AM   #2
thesnow
Member
 
Registered: Nov 2010
Location: Minneapolis, MN
Distribution: Ubuntu, Red Hat, Mint
Posts: 172

Rep: Reputation: 56
Can you post samples/examples of the file(s)?

If it is just one column of server names, you could use "sort" piped to "uniq" to get shorter, ordered lists.
 
Old 11-01-2011, 10:19 AM   #3
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
I would also ask: what have you tried? This seems to be a fairly trivial task on the surface, unless you have left out some further details.
 
Old 11-01-2011, 12:19 PM   #4
Thaidog
Member
 
Registered: Sep 2002
Location: Hilton Head, SC
Distribution: Gentoo
Posts: 637

Original Poster
Rep: Reputation: 32
The two lists would look something like:

list 1
hostname 2
hostname abc
hostname 1

List 2
hostname 1
hostname 2

In this case I would only need output for hostname abc - if I used diff I would get:


$ diff testhost1 testhost2
1,2c1,3
< hostname 1
< hostname 2
\ No newline at end of file
---
> hostname 2
> hostname abc
> hostname 1
\ No newline at end of file

And if I sort the files, I would still get issues with diff if there are more servers in one list, which there are...
 
Old 11-01-2011, 12:41 PM   #5
thesnow
Member
 
Registered: Nov 2010
Location: Minneapolis, MN
Distribution: Ubuntu, Red Hat, Mint
Posts: 172

Rep: Reputation: 56
Code:
[root@lm:~/testing]$ cat list1
hostname 2
hostname abc
hostname 1
[root@lm:~/testing]$ cat list2
hostname 1
hostname 2
[root@lm:~/testing]$ cat list1 list2 | sort | uniq -u
hostname abc
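
One caveat with the pipeline above: if a hostname is repeated inside a single list, `uniq -u` treats it as a duplicate and drops it, even though the other list never mentions it. A minimal sketch of one way around that, assuming one hostname per line; the function names `only_in_one` and `only_in_first` are made up for illustration, and `mktemp` is assumed to be available (it is not part of POSIX):

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, file names are arguments.

# Hosts that appear in exactly one of the two lists.
only_in_one() {
    # sort -u removes duplicates inside each list before they are combined,
    # so a name repeated within one file is not mistaken for a shared host.
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -u
}

# Hosts that appear in the first list but not in the second.
only_in_first() {
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    sort -u "$1" > "$tmp1"
    sort -u "$2" > "$tmp2"
    # comm needs sorted input; -23 suppresses column 2 (only in the second
    # file) and column 3 (in both), leaving lines found only in the first.
    comm -23 "$tmp1" "$tmp2"
    rm -f "$tmp1" "$tmp2"
}
```

Called as `only_in_one list1 list2`, this should give the same answer as the pipeline above for the sample files. `only_in_first list1 list2` additionally tells you which list the extra servers came from, which `uniq -u` alone cannot.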
 
Old 11-01-2011, 12:48 PM   #6
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
uniq -u = print only unique lines

Code:
cat file1 file2 | sort | uniq -u
Edit: dangnabit! Beaten to the answer...

Last edited by David the H.; 11-01-2011 at 12:49 PM.
 
Old 11-01-2011, 01:58 PM   #7
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
@OP: If your sample is representative then the suggested solutions will do fine. However, if your actual data also has varying elements, e.g. timestamps, and you want to ignore those fields when checking for *equal* lines, then you might have to consider a slightly different solution. For example, consider the following data:
Code:
$ cat file1
hostname1 [22:22]
hostname2 [23:23]
hostname3 [00:00]
$ cat file2
hostname1 [22:00]
hostname3 [00:30]
Now, if you do not care about the timestamps and only hostname2 should be printed, you could do the following for the above data:
Code:
$ cat file1 file2 | sort | rev | uniq -f 1 -u | rev
hostname2 [23:23]
If you have a variable number of columns then you'd probably have to switch to an 'awk' solution. Let us know your exact requirement.
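
For the variable-column case, here is a rough awk sketch. It assumes the hostname is always the first whitespace-separated field and that everything after it can be ignored; the function name `unique_by_host` is invented for this example. It counts each hostname at most once per file, then prints the line for every hostname that was seen in only one file:

```shell
#!/bin/sh
# Sketch only: assumes the hostname is the first field on every line.

unique_by_host() {
    awk '
        # Skip blank lines so an empty first field is never counted.
        NF == 0 { next }
        # First time this hostname is seen in this file: count the file
        # and remember the line it came from.
        !seen[FILENAME, $1]++ { count[$1]++; line[$1] = $0 }
        END {
            # A hostname counted once was present in only one file.
            for (h in count)
                if (count[h] == 1)
                    print line[h]
        }
    ' "$1" "$2" | sort
}
```

Usage would be `unique_by_host file1 file2`. The final `sort` is only there because awk does not guarantee the order of `for (h in count)`.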
 
All times are GMT -5.