LinuxQuestions.org
12-09-2011, 01:33 AM   #1
elonden
LQ Newbie
 
Registered: Sep 2011
Location: Melbourne
Posts: 5

AWK Comparing two files


Hello,

I've been banging my head for a day now but it seems I'm doing something really wrong.

I have two files :

1. file called "pl" containing entries like:
64a240
64a340
64a440
64a540
64a640
64a740
64a840
64a8c0
64a940


2. file called "ns" containing entries like:
645b00,20:11:00:21:5a:2f:22:64
645b02,50:06:0b:00:00:c2:a2:2a
645b04,50:06:0b:00:00:c2:a2:3a
645b06,50:06:0b:00:00:c2:a2:32
645b07,50:06:0b:00:00:c2:a2:16
645c00,21:00:00:e0:8b:1e:9b:25
645d00,50:06:0e:80:05:b0:8c:56


What I want is to check whether each entry in file #1 exists in field #1 of file #2 and, if it does NOT, either print it or store it in an array.

Any feedback would be welcome.

What I have is the following (which obviously doesn't work :-)):

Code:
        awk 'BEGIN { FS=",";
                while ((getline < "ns") > 0)
                fcid=$1
                wwn=$2
                nsvar[fcid]=wwn
                close("ns")
       }        
              
        {

        while ((getline plvar < "pl") > 0) {

                if (plvar in nsvar) {
                                print("OK");
                                close(plvar)}
                else        {
                                print("Not OK");
                                close(plvar)}                   

                }

        }'
 
12-09-2011, 02:09 AM   #2
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

If the given pattern only exists once on the line, how about this instead?

Code:
grep -v -f ./pl ./ns
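
To make the behaviour of that command concrete, here is a small sketch on made-up sample data modelled on the files in the first post. Note that grep -v -f ./pl ./ns lists the lines of ns that match none of the patterns in pl. If what is wanted is the other direction (entries of pl that never occur as field 1 of ns), one possible approach, assuming cut and mktemp are available, is to extract field 1 first and then compare whole lines as fixed strings. The file name ns.keys and the scratch directory are only illustrative.

```shell
#!/bin/sh
# Made-up sample data in a scratch directory (not the poster's real files).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 64a240 64a340 645b00 > pl
printf '%s\n' '645b00,20:11:00:21:5a:2f:22:64' \
              '645c00,21:00:00:e0:8b:1e:9b:25' > ns

# The command from the post: lines of ns matching none of the patterns in pl.
unmatched_ns=$(grep -v -f ./pl ./ns)
echo "$unmatched_ns"

# The other direction: entries of pl that are not field 1 of any ns line.
# -F treats the keys as fixed strings, -x requires a whole-line match.
cut -d, -f1 ns > ns.keys
missing_pl=$(grep -v -x -F -f ns.keys pl)
echo "$missing_pl"
```

Using -x and -F avoids a key accidentally matching somewhere inside the second field.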
 
12-09-2011, 10:23 AM   #3
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

elonden's awk script is pretty close, actually. Just switch the two files, and it becomes pretty simple.

Code:
awk -v keyfile=pl '
    BEGIN {
        FS = "," ;

        while ((getline < keyfile) > 0)
            if ($1 != "")
                keys[$1] = 1 ;

        close(keyfile)
    }

    ($1 in keys) { print $0 }

    ' ns
I added the semicolons so you can put it all on one line if you want.

The BEGIN rule reads the keys from keyfile (pl) and builds an associative array from them. What matters is that each key becomes an index of the keys array. (The stored value is irrelevant here, but often useful. You might use e.g. keys[$1]=++nkeys instead if you later extend the script and need to keep track of which key caused the record to be output.)
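
As a rough illustration of the keys[$1]=++nkeys idea, the sketch below stores each key's position in pl and prefixes every matching ns record with that number. The sample data and the scratch directory are made up, and the output format ("number: record") is just an assumption for the example.

```shell
#!/bin/sh
# Made-up sample data in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 645c00 645b00 > pl
printf '%s\n' '645b00,20:11:00:21:5a:2f:22:64' \
              '645d00,50:06:0e:80:05:b0:8c:56' \
              '645c00,21:00:00:e0:8b:1e:9b:25' > ns

numbered=$(awk -v keyfile=pl '
    BEGIN {
        FS = ","
        # Store the position of each key in pl (1, 2, ...) instead of a constant 1.
        while ((getline < keyfile) > 0)
            if ($1 != "")
                keys[$1] = ++nkeys
        close(keyfile)
    }
    # Prefix each matching ns record with the number of the key that matched.
    ($1 in keys) { print keys[$1] ": " $0 }
' ns)
echo "$numbered"
```

If a key appears more than once in pl, the last occurrence determines the stored number.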

The rule ($1 in keys) is considered for each input record (line) of the ns file. It applies to all records where the first field matches one of the keys in the keys array. The body of the rule just prints the entire record.
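
The original question asked for the opposite test: entries in pl that do NOT appear as field 1 of ns. A minimal sketch of that, assuming both files fit comfortably in memory, is to read ns first, remember its first fields, and then negate the membership test while reading pl. The array name seen and the sample data are made up for the example.

```shell
#!/bin/sh
# Made-up sample data in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 64a240 645b00 64a340 > pl
printf '%s\n' '645b00,20:11:00:21:5a:2f:22:64' \
              '645c00,21:00:00:e0:8b:1e:9b:25' > ns

missing=$(awk -F, '
    # While reading the first file (ns), NR equals FNR: remember field 1.
    NR == FNR { seen[$1] = 1; next }
    # While reading the second file (pl): print entries never seen in ns.
    !($1 in seen) { print $1 }
' ns pl)
echo "$missing"
```

One caveat with the NR == FNR idiom: if ns is empty, the first rule also fires for pl, so nothing is printed.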

If your input files may use any newline convention (i.e. they may have been produced on various operating systems, including Windows and old Macs), and you wish to retain that convention, extend the script a bit, and use GNU awk (gawk):
Code:
gawk -v keyfile=pl '
    BEGIN {
        RS = "[\n\r]+" ;
        FS = "," ;
        RT = "\n" ;

        while ((getline < keyfile) > 0)
            if ($1 != "")
                keys[$1] = 1 ;

        close(keyfile)
    }

    ($1 in keys) { printf("%s%s", $0, RT) }

    ' ns
The same script runs under other awk variants too, provided they accept a regular expression in RS, but only gawk retains the newline convention; the others will always output UNIX newlines ("\n"). (GNU awk provides an automatic variable RT, which contains the input text that matched the record separator RS for the current record. Other awk variants treat RT as a normal variable.)
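
If retaining the newline convention is not needed and only Windows-style (CRLF) input has to be tolerated, a commonly used workaround that does not rely on RT or on a regular expression in RS is to strip a trailing carriage return from each record. The sketch below assumes that simpler goal; it normalises output to UNIX newlines and does not handle old-Mac files that use "\r" alone. The sample data is made up.

```shell
#!/bin/sh
# Made-up sample data with CRLF line endings, in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '645b00\r\n645c00\r\n' > pl
printf '645b00,20:11:00:21:5a:2f:22:64\r\n645d00,50:06:0e:80:05:b0:8c:56\r\n' > ns

matched=$(awk -v keyfile=pl '
    BEGIN {
        FS = ","
        while ((getline < keyfile) > 0) {
            sub(/\r$/, "")            # drop a trailing carriage return, if any
            if ($1 != "")
                keys[$1] = 1
        }
        close(keyfile)
    }
    { sub(/\r$/, "") }                # same for each ns record; fields are re-split
    ($1 in keys) { print }
' ns)
echo "$matched"
```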
 
  


All times are GMT -5.
