LinuxQuestions.org
11-12-2011, 11:58 PM   #1
needhelp12
LQ Newbie
 
Registered: Nov 2011
Posts: 1

Rep: Reputation: Disabled
Comparing lines in two files?


Dear Linux users,

I'm new to Linux and coding and have a question. My problem seems very straightforward, but I searched these forums and Google and could not find anything specific to what I'm looking for.

Problem: I have two very large, tab-delimited files.

File 1 looks like the following (example):
Quote:
Jane 10001
Joseph 10011
Dan 01100
David 10111
File 2 looks like:
Quote:
Joseph 145 Thistle Way Boston Ma
Dan 23 Mountain Dew Scranton Pa
Terrance 123 Green Tree Portland Or
Jane 244 Yellow Desert Berkeley Ca
Jerry 13 Purple Street Houston Tx
David 544 Leaf Valley Bangor Me
The lines in both files begin with a name, but neither file is in any particular order. I want to compare the names in Files 1 and 2 and write to File 3 the lines of File 2 whose names also appear in File 1, such that:
1) Lines in File 2 with names that do not occur in File 1 are not written to File 3;
2) Lines in File 2 with names that DO occur in File 1 are written to File 3;
3) Lines in File 3 are in the same order as the names in File 1.

For example, File 3 would look like:

Quote:
Jane 244 Yellow Desert Berkeley Ca
Joseph 145 Thistle Way Boston Ma
Dan 23 Mountain Dew Scranton Pa
David 544 Leaf Valley Bangor Me
Is there a straightforward way of doing this in Linux without coding? Or would I have to write a script in something like Perl? :S

Thanks for your help!

Needhelp12
 
11-13-2011, 12:16 AM   #2
ukiuki
Senior Member
 
Registered: May 2010
Location: Planet Earth
Distribution: Debian
Posts: 1,030

Rep: Reputation: 380
Hi there. You will need some sort of script to get the output you want. You might find some useful info here: http://linuxcommand.org/wss0010.php
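
As a rough illustration of the kind of script meant here, the sketch below loops over the names in File 1 and pulls the matching line out of File 2 for each one, so the output follows File 1's order. The file names (file1.txt, file2.txt, file3.txt) and the column layout (name, tab, rest of the line) are assumptions made for the example, not details confirmed in the thread. It also assumes the names contain no characters that are special in a regular expression.

```shell
#!/bin/sh
# Recreate the sample data from the question (assumed layout: name<TAB>rest).
printf 'Jane\t10001\nJoseph\t10011\nDan\t01100\nDavid\t10111\n' > file1.txt
printf '%s\t%s\n' \
    Joseph   '145 Thistle Way Boston Ma' \
    Dan      '23 Mountain Dew Scranton Pa' \
    Terrance '123 Green Tree Portland Or' \
    Jane     '244 Yellow Desert Berkeley Ca' \
    Jerry    '13 Purple Street Houston Tx' \
    David    '544 Leaf Valley Bangor Me' > file2.txt

# A literal tab character, used as the field separator and in the pattern.
tab=$(printf '\t')

# For each name in file1.txt (in order), print the line of file2.txt
# that starts with exactly that name followed by a tab. Names without
# a match produce no output; grep's non-zero exit status is ignored.
while IFS="$tab" read -r name rest; do
    grep "^${name}${tab}" file2.txt
done < file1.txt > file3.txt

cat file3.txt
```

This reads File 2 once per name in File 1, so it is likely to be slow on very large files; it is only meant to show the idea.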

Regards
 
11-13-2011, 12:24 AM   #3
Mr. Bill
Member
 
Registered: Mar 2011
Location: Maryland, USA
Distribution: Xubuntu 14.04 - 64
Posts: 185

Rep: Reputation: 14
+1

Great link, btw.
 
11-13-2011, 12:48 AM   #4
Brains
Member
 
Registered: Apr 2009
Distribution: Debian testing
Posts: 258

Rep: Reputation: 42
I played with your examples. Here's how I did it with two commands that generate two more files; 4.txt is the result.
Code:
# Strip the digits from 1.txt, leaving only the names, and save them as 3.txt
sed 's/[0-9]*//g' 1.txt > 3.txt
# Print every line of 2.txt that matches a name in 3.txt and save the result as 4.txt
grep -f 3.txt 2.txt > 4.txt
The sed command removes the digits from 1.txt and writes just the names to 3.txt; it does not change 1.txt. Before running the grep command, check 3.txt: place the cursor directly after the last letter of the last name and make sure there is no space after it and no empty line below it (grep treats an empty line in the pattern file as a pattern that matches every line). The grep command then matches the names in 3.txt against 2.txt and writes every matching line to 4.txt.
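
Note that grep -f prints matching lines in the order they appear in 2.txt, and its patterns can match anywhere in a line, so the commands above do not by themselves give the File 1 ordering asked for in the question. One possible alternative is sketched below with awk: it reads File 2 once, remembers each line by its name, then walks File 1 and prints the remembered lines in File 1's order. The file names, the tab field separator, the assumption that each name occurs only once in File 2, and the assumption that File 2 fits in memory are all choices made for this example rather than facts from the thread.

```shell
#!/bin/sh
# Recreate the sample data from the question (assumed layout: name<TAB>rest).
printf 'Jane\t10001\nJoseph\t10011\nDan\t01100\nDavid\t10111\n' > file1.txt
printf '%s\t%s\n' \
    Joseph   '145 Thistle Way Boston Ma' \
    Dan      '23 Mountain Dew Scranton Pa' \
    Terrance '123 Green Tree Portland Or' \
    Jane     '244 Yellow Desert Berkeley Ca' \
    Jerry    '13 Purple Street Houston Tx' \
    David    '544 Leaf Valley Bangor Me' > file2.txt

# While reading the first file named on the command line (file2.txt),
# NR == FNR holds: store each full line, keyed by the name in field 1.
# While reading the second file (file1.txt): for each name, in order,
# print the stored line if that name was seen in file2.txt.
awk -F'\t' '
    NR == FNR  { line[$1] = $0; next }
    $1 in line { print line[$1] }
' file2.txt file1.txt > file3.txt

cat file3.txt
```

With the sample data this should write the same four lines, in the same order, as the File 3 listing in the question. Names are compared as exact strings on the first field, so "Dan" does not match "David".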

Last edited by Brains; 11-13-2011 at 12:53 AM. Reason: clarify
 
  


All times are GMT -5.
