LinuxQuestions.org
View the Most Wanted LQ Wiki articles.
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Newbie
User Name
Password
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Notices

Reply
 
Search this Thread
Old 09-13-2008, 03:39 PM   #1
visitnag
Member
 
Registered: Mar 2008
Posts: 147

Rep: Reputation: 15
compare two files using Awk!!


I have to files. In the first file I have a unique emp.id as first field and in the second file I have the same emp.id as second field. I want join (or match) these two files on the basis of emp.id (i.e., first file with second file.) and print the matched lines with all the fields of both the files.

Thank you in advance.
 
Old 09-13-2008, 05:47 PM   #2
Mr. C.
Senior Member
 
Registered: Jun 2008
Posts: 2,529

Rep: Reputation: 59
Show some examples of what you want.

Please be more clear about the questions you ask. Many of your questions seem rather lazy, with minimal though or effort. You'll get more if you spend more time formulating "intelligent" questions.

See: http://www.catb.org/~esr/faqs/smart-questions.html
 
Old 09-13-2008, 05:56 PM   #3
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 654Reputation: 654Reputation: 654Reputation: 654Reputation: 654Reputation: 654
This sounds like the source of these files are tables from a relational database such as mysql. The emp.id field in the first file sounds like emp is the employee table, with "id" being the primary key; and in the second file, emp.id is a foreign key.
example:

If that is the case, why not use a database client (such as the mysql command) and enter the sql SELECT command to return the information from the database containing the information you want. To paraphrase the evil Russian General from the movie Golden Eye, "Use the database! That's what it's for."

example:
Code:
SELECT tags.title, album.id, album.name
FROM tags INNER JOIN album 
ON album.id = tags.album 
LIMIT 10;
+-----------------------+-----+----------------------------------------------------+
| title                 | id  | name                                               |
+-----------------------+-----+----------------------------------------------------+
| LQ Podcast - 01.24.07 | 339 | LQ Radio - Open Talk...About Open Source and Linux |
| LQ Podcast - 05.09.08 | 339 | LQ Radio - Open Talk...About Open Source and Linux |
| LQ Radio Episode #3   | 339 | LQ Radio - Open Talk...About Open Source and Linux |
| LQ Podcast - 10.29.07 | 339 | LQ Radio - Open Talk...About Open Source and Linux |
| LQ Podcast - 04.20.06 | 339 | LQ Radio - Open Talk...About Open Source and Linux |
| LQ Podcast - 06.22.06 | 339 | LQ Radio - Open Talk...About Open Source and Linux |
| LQ Podcast - 05.18.07 | 339 | LQ Radio - Open Talk...About Open Source and Linux |
| LQ Podcast - 10.22.06 | 339 | LQ Radio - Open Talk...About Open Source and Linux |
| LQ Podcast - 01.17.08 | 339 | LQ Radio - Open Talk...About Open Source and Linux |
| LQ Podcast - 11.09.05 | 339 | LQ Radio - Open Talk...About Open Source and Linux |
+-----------------------+-----+----------------------------------------------------+
---

Generally a lot depends on the particular of your two files. Is there a one-to-one relationship between the emp.id's in the two file? Is every id in one file present in the other and vice versa? If so, then you could sort each file and input the results to the paste command.

Code:
head table1 table2
==> table1 <==
1       A
2       AA
3       AAA
4       Aaberg
5       Aaberg's
6       Aachen
7       Aachen's
8       aah
9       aahed
10      aahing

==> table2 <==
zinked  1
zinkenite       2
zinkenites      3
zinkenite's     4
zinkes  5
zinkiferous     6
zinkification   7
zinkified       8
zinkifies       9
zinkify 10
Look at the output of the paste command before using cut to select the column(s) you want. Make sure that the ids match up.
Code:
 paste <(sort -n -k1 table1) <(sort -n -k2 table2) | head
1       A       zinked  1
2       AA      zinkenite       2
3       AAA     zinkenites      3
4       Aaberg  zinkenite's     4
5       Aaberg's        zinkes  5
6       Aachen  zinkiferous     6
7       Aachen's        zinkification   7
8       aah     zinkified       8
9       aahed   zinkifies       9
10      aahing  zinkify 10
Since the ids match up, it's ok to use cut.
Code:
paste table1 <(sort -n -k2 table2 | cut -d'    ' -f1) >table3
The sort, paste and cut commands are all provided by coreutils. Look though the coreutils manual. I downloaded the source for this package and for gawk and ran "./configure && makepdf" to produce a print worthy pdf document. The GAWK: Effective AWK Programming book is excellent. The coreutils package supplies 101 commands and 98 manpages. Having a printed version of the coreutils info manual is a great resource.

At least browse through the commands & the brief topics in the index page by entering "info coreutils".

---

To do what you want to do in a single awk script, you may need to use "readline" to read from table2 while processing a line from table1.

If the same one-to-one conditions exist, then you can use paste to assemble a long line from each table. Then in awk, you can check that the two fields containing the id match up and if they do, print out the fields you want.

Last edited by jschiwal; 09-13-2008 at 06:49 PM.
 
Old 09-15-2008, 12:42 PM   #4
visitnag
Member
 
Registered: Mar 2008
Posts: 147

Original Poster
Rep: Reputation: 15
file1.

emp.id name quantity amount
11111 xyx 25 5500
22222 sss 55 7500
33333 wes 120 8898


file2.

add1 add2 add3 empid pin
rd1 blah blah 11111 2324
rd2 blah blah 22222 5565
rd4 blah blah 33333 6548

I want to match above files on emp.id fields using awk.

Thank you all.
 
  


Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
compare two files haydar68 Programming 8 08-16-2008 03:36 PM
AWK: compare two files haydar68 Programming 6 08-02-2008 11:20 PM
Compare files in specific directory to files on CD szim90 Linux - Newbie 3 02-10-2007 08:39 PM


All times are GMT -5. The time now is 10:01 AM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration