LinuxQuestions.org
07-23-2011, 02:57 PM   #1
sasanthi
LQ Newbie
 
Registered: Jul 2011
Posts: 11

Rep: Reputation: Disabled
compare two files and print the common lines


Hi,

I have multiple files, and I want to compare them pairwise and write the lines they have in common to an output file. For example, two of the files might be:

file1
http://www.broadinstitute.org/gsea/m...ETABOLISM.html
HAAO
ECHS1
AOX1
GCDH
AANAT
KYNU
AFMID

file2
AANAT
ALDH9A1
IDO1
TPH1
KYNU
MAOA
OGDH
ACAT1
AFMID

and I want to compare them and print the common lines. I use the following code:

awk '
BEGIN {
while ( getline < "file1" ) { arr[$2]=$0}
}

END {
for( key in arr )
print arr[key]
} 'file2

but it does not work. Could someone help me?

thanks a lot!
 
07-23-2011, 03:40 PM   #2
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
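Your code has only a BEGIN and an END block, so the lines of file2 are never compared with anything. In addition, the array is indexed by $2, which is empty for these single-column lines, so each line of file1 overwrites the same element and only the last one is printed. Try something like this instead: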
Code:
awk 'BEGIN { while ( ( getline < "file1" ) > 0 ) arr[$0]++ }( $1 in arr )' file2
Alternatively, you can try the comm command, which expects sorted input. See man comm for details.
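For comparison, here is a minimal sketch of a common two-file awk idiom that gives both file names on the command line instead of calling getline. The sample data is a reduced, hypothetical version of the files in the first post, and the variable names (tmpdir, common, seen) are made up for illustration. Note that the NR == FNR test misbehaves if the first file is empty.

```shell
# Hypothetical sample data, reduced from the files in the first post.
tmpdir=$(mktemp -d)
printf '%s\n' HAAO AANAT KYNU AFMID > "$tmpdir/file1"
printf '%s\n' AANAT IDO1 KYNU AFMID > "$tmpdir/file2"

# While reading the first file (NR == FNR), remember every line.
# While reading the second file, print the lines already seen.
common=$(awk 'NR == FNR { seen[$0] = 1; next } $0 in seen' \
    "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$common"
rm -rf "$tmpdir"
```

The common lines come out in the order in which they appear in the second file.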
 
07-23-2011, 04:13 PM   #3
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,496

Rep: Reputation: 2867
I agree. Why not sort the files and use comm or diff?
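A minimal sketch of the sort-then-comm approach, assuming the same reduced sample data as a stand-in for the real files (the temporary directory and the .sorted file names are hypothetical). LC_ALL=C is set so that sort and comm agree on the ordering.

```shell
# Hypothetical sample data, reduced from the files in the first post.
tmpdir=$(mktemp -d)
printf '%s\n' HAAO AANAT KYNU AFMID > "$tmpdir/file1"
printf '%s\n' AANAT IDO1 KYNU AFMID > "$tmpdir/file2"

# comm requires both inputs to be sorted in the same collation order.
LC_ALL=C sort "$tmpdir/file1" > "$tmpdir/file1.sorted"
LC_ALL=C sort "$tmpdir/file2" > "$tmpdir/file2.sorted"

# -1 hides lines found only in the first file, -2 hides lines found only
# in the second, so only the lines common to both remain.
common_sorted=$(LC_ALL=C comm -12 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")

printf '%s\n' "$common_sorted"
rm -rf "$tmpdir"
```

Unlike the awk version, the output here is in sorted order rather than in the order of either input file.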
 
07-23-2011, 04:26 PM   #4
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 721
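Code:
# uniq -u would print only the lines that are NOT repeated; -d prints the repeated ones.
# This assumes that neither file contains duplicate lines of its own.
sort file1 file2 | uniq -d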
 
07-25-2011, 06:12 PM   #5
sasanthi
LQ Newbie
 
Registered: Jul 2011
Posts: 11

Original Poster
Rep: Reputation: Disabled
That works! Thank you for your help!

However, I have multiple files that I want to compare at once. I wrote the following script:

#!/bin/bash

for name1 in *.genes.txt
do
    for name2 in *.genes2.txt
    do
        awk 'BEGIN { while ( getline < "$name1" ) arr[$0]++ }( $1 in arr )' $name2 > common/$name1.$name2.com
    done
done


to do it automatically, but it does not work. Why? Any ideas?

thanks
 
07-25-2011, 06:18 PM   #6
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
The awk command is placed inside single quotes: this prevents the shell expansion of $name1. If you want to pass a shell variable inside awk, use the -v option:
Code:
awk -v file="$name1" 'BEGIN { while ( ( getline < file ) > 0 ) arr[$0]++ }( $1 in arr )' "$name2"
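Putting the -v fix back into the nested loop from the previous post, a sketch of the complete script might look like the following. The two input files and the scratch directory are hypothetical stand-ins created only so the example runs on its own; the file name patterns and the common/ output directory follow the original script.

```shell
# Hypothetical scratch directory with one file per pattern.
workdir=$(mktemp -d)
olddir=$PWD
cd "$workdir"
printf '%s\n' HAAO AANAT KYNU > a.genes.txt
printf '%s\n' AANAT IDO1 KYNU > b.genes2.txt

# The output directory must exist before the redirection is attempted.
mkdir -p common

for name1 in *.genes.txt
do
    for name2 in *.genes2.txt
    do
        # -v hands the shell variable to awk as the awk variable "file".
        # "> 0" stops the loop at end of file and also on a read error.
        awk -v file="$name1" '
            BEGIN { while ((getline line < file) > 0) arr[line]++ }
            $1 in arr
        ' "$name2" > "common/$name1.$name2.com"
    done
done

result=$(cat common/a.genes.txt.b.genes2.txt.com)
printf '%s\n' "$result"

cd "$olddir"
rm -rf "$workdir"
```

Quoting "$name1" and "$name2" keeps the script working if a file name ever contains spaces.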
 
07-25-2011, 06:28 PM   #7
sasanthi
LQ Newbie
 
Registered: Jul 2011
Posts: 11

Original Poster
Rep: Reputation: Disabled
many thanks!! that works fine now!
 
07-26-2011, 01:18 PM   #8
DBabo
Member
 
Registered: Feb 2003
Distribution: Scientific Linux 6, Fedora
Posts: 456

Rep: Reputation: 37
Quote:
Originally Posted by colucix
The awk command is placed inside single quotes: this prevents the shell expansion of $name1. If you want to pass a shell variable inside awk, use the -v option:
Code:
awk -v file="$name1" 'BEGIN { while ( ( getline < file ) > 0 ) arr[$0]++ }( $1 in arr )' "$name2"
Brilliant, that's what I was looking for too. Thank you.

P.S. I just hope it will work on Solaris... ehhh.
 
  


All times are GMT -5.
