LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   How to match different records based on Same Field values using awk (https://www.linuxquestions.org/questions/programming-9/how-to-match-different-records-based-on-same-field-values-using-awk-4175557932/)

Abdelrahman.fathy 11-04-2015 02:37 AM

How to match different records based on Same Field values using awk
 
Dear Linux Users,
I have data files that looks like this:
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 1cc 1cc 2af0 "" "" "" 07/15 13:27:12
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:27
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 14f 1cc 2b88 "" "" "" 07/15 13:27:18
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:27
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 2cf 1cc bfc "" "" "" 07/15 13:27:24
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 3ed 1cc ee4 "" "" "" 07/15 13:29:08
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 45d 1cc 298 "" "" "" 07/15 13:29:12
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 492 1cc b24 "" "" "" 07/15 13:29:16
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 50c 1cc 2528 "" "" "" 07/15 13:29:20
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 378 1cc 1c4c "" "" "" 07/15 13:29:23
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 2a9 1cc 1f1c "" "" "" 07/15 13:29:26
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 4c9 1cc 2970 "" "" "" 07/15 13:29:29
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 227 1cc 8a4 "" "" "" 07/15 13:29:33
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 541 1cc 22dc "" "" "" 07/15 13:29:36
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 581 1cc 2d0 "" "" "" 07/15 13:29:39
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 5c1 1cc 18e8 "" "" "" 07/15 13:29:42
DENY IamFeatureName1 1.0 IamUserName IamHostName "" 1 -18 1 07/15 13:29
OUT IamFeatureName 2013.11 9 IamUserName IamHostName "" 20 20 0 601 1cc 2844 "" "" "" 07/15 13:29:45
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 1cc 07/15 13:31:39
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 14f 07/15 13:31:41
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 2cf 07/15 13:31:42
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 3ed 07/15 13:31:43
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 45d 07/15 13:31:45
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 492 07/15 13:31:46
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 50c 07/15 13:31:47
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 378 07/15 13:31:48
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 2a9 07/15 13:31:49
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 4c9 07/15 13:31:52
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 601 07/15 13:32:09
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 5c1 07/15 13:32:13
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 541 07/15 13:32:13
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 20 0 581 07/15 13:32:13
IN 2 IamFeatureName 2013.11 IamUserName IamHostName "" 20 0 0 227 07/15 13:32:13
IN 1 results 2013.11 IamUserName IamHostName "" 20 40 0 24f 07/15 13:32:17
IN 1 results 2013.11 IamUserName IamHostName "" 20 20 0 f6 07/15 13:32:17


it's basically report log for rlms license server

I was wondering how to match two records starting was Keywords OUT and IN based on field 2(FeatureName),3(FeatureVersion),5(Username),6(Machine Name),11(Handle) starting with Out Keyword with Fields 3(FeatureName),4(FeatureVersion),5(Username),6(MachineName),11(Handle) and print matching Pairs of OUT/IN




i've written this code but it does have some issue

Code:

awk '
BEGIN{OFS=",";}
{if ($1 ~ /OUT/)
{
        OUTFeatureName[$3]=$3;
        OUTFeatureVersion[$3]=$4;
        OUTUsername[$3]=$5;
        OUTHostname[$3]=$6;
        OUTHandle[$3]=$11;
}

else if ($1 ~ /IN/)
{
        INFeatureName[$3]=$3;
        INFeatureVersion[$3]=$4;
        INUsername[$3]=$5;
        INHostname[$3]=$6;
        INHandle[$3]=$11;
}

 {
        for (var in INFeatureName )
        {
                print NR;
                if(match(INFeatureName[var],OUTFeatureName[var])>0 && match(INFeatureVersion[var],OUTFeatureVersion[var])>0 && match(INHostname[var],OUTHostname[var])>0 && match(INHandle[var],OUTHandle[var])>0 && match(INUsername[var],OUTUsername[var])>0)
                        print $0;
                  else
                    print "No records"
       
        }
 }

}' fileName


grail 11-04-2015 02:58 AM

Firstly, please use [code][/code] tags around code and data to keep them readable and scrollable.
Quote:

i've written this code but it does have some issue
And would you like to tell us the issue?

As far as the data goes, will ALL the OUT's always appear before ALL the IN's? If so, simply store all the OUT's and on hitting the first IN, start testing that the data you are looking for matches.

I am not able to follow your logic in the code 100, but what is obvious is that you store IN/OUT but then use the for loop straight away, but you may not have read all the data required (see above first)
If you cannot guarantee IN's after OUT's you will need to store all the data and then perform your for loop in the END stanza (opposite to your BEGIN you have used)

Abdelrahman.fathy 11-04-2015 04:05 AM

Dear grail,
First thanks very much,

I was wondering if the following code is better,It's still missing array indices comparsion??

but i took your point into consideration
Code:

awk '
BEGIN{OFS=",";}
{
        if ($1 ~ /OUT/)
        {
                OutRecord[$2FS$3FS$5FS$6]=$0
        }

        else if ($1 ~ /IN/)
        {
                InRecord[$3FS$4FS$5FS$6]=$0
        }

       

}END {
                for (var in InRecord )
                {
                        #print "IN record index is: "  var;
                         
                        if(match(InRecord[var],OutRecord[var])>0) # this is wrong because we are supposed to match indices not the record itself
                                {
                                        print "Printing Out: " OutRecord[var];
                                        print "Printing In: " InRecord[var];
                                }
                          else
                                print "The index did not match were: "
                }
        }'


Abdelrahman.fathy 11-04-2015 04:33 AM

also i tried this code but it's misbehaving:

Code:

awk '
BEGIN{OFS=",";}
{if ($1 ~ /OUT/)
OutRecord[$2FS$3FS$5FS$6]=$0;
else if ($1 ~ /IN/)
InRecord[$3FS$4FS$5FS$6]=$0;
}END {
for (var in InRecord )
{
for (varOut in OutRecord)
{
if (var in OutRecord )
{
#print "current var is: " var,varout
print "OUT record: " OutRecord[varOut]
print "IN record: "  InRecord[var]       
}
#else
#print "No records found between: " varOut,var
}
}
}' file


Abdelrahman.fathy 11-04-2015 04:39 AM

Maybe this is fine:

Code:

awk '
BEGIN{OFS=",";}
{if ($1 ~ /OUT/)
OutRecord[$2FS$3FS$5FS$6]=$0;
else if ($1 ~ /IN/)
InRecord[$3FS$4FS$5FS$6]=$0;
}END {
for (var in InRecord )
{
        if (var in OutRecord  )
{
#print "current var is: " var,varout
print "OUT record: " OutRecord[var]
print "IN record: "  InRecord[var]       
}

else
print "No records found for: " var
}
}' file


grail 11-04-2015 05:33 AM

So it is good to see you are iteratively getting to where you need to be :) So you have the results you need now? If so, please mark as SOLVED

Abdelrahman.fathy 11-04-2015 05:37 AM

First thanks very much, you empowered me towards it

Abdelrahman.fathy 11-08-2015 01:33 AM

Dear LinuxQuestion Community,

Can you please help me , I want to concatenate a calculated variable value(using mktime() function) to the whole record $0.

Code:

gsub(/[/]/," ",$17);
gsub(/[:]/," ",$18);
OutDate=("2015 ") $17 " " $18;
OUTTime=mktime(OutDate);
print $1FS$2FS$3FS$4FS$5FS$6FS$7FS$8FS$9FS$10FS$11FS$12FS$13FS$14FS$15FS$16FS$17$18FSOUTTime;
OutRecord[$2FS$3FS$5FS$6FS$11]=$1FS$2FS$3FS$4FS$5FS$6FS$7FS$8FS$9FS$10FS$11FS$12FS$13FS$14FS$15FS$16FS$17$18FSOUTTime;

but it does not give me the result i wanted

Abdelrahman.fathy 11-08-2015 02:11 AM

I found this as an answer,
Out=$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9" "$10" "$11" "$12" "$13" "$14" "$15" "$16" "OUTTime;

but maybe , there is a better solution

grail 11-08-2015 03:28 AM

I am a little confused at what you are trying to do here? mktime function takes one argument with either 6 or 7 fields in it ("YYYY MM DD HH MM SS [DST]"), what values are in the 18 fields you are using??

Abdelrahman.fathy 11-08-2015 03:34 AM

Thanks grail For Answering.
the required arguments for Mktime is given in column 17 , 18 i am just concatenting the year and pass all values to mktime.

if you need more information about the file format
http://www.reprisesoftware.com/RLM_Enduser.html
in the section entitled "Reportlog File Format" , you will find all details about the file format i'm trying to process.
My apology for not uploading an actual file from the server,as it invades my customer privacy.

grail 11-08-2015 04:52 AM

Ok, thanks for explanation, but my next question would be, why are you re-printing all 18 fields when you could just use $0?

I see in post #9 you have removed 17 and 18, so in that case you could simply zero them after they have been used for mktime, so something like:
Code:

awk '{dspec = "2015"FS$17FS$18; NF=16;print $0,mktime(gensub(/[:/]/," ","g",dspec))}'

Abdelrahman.fathy 11-08-2015 09:58 PM

grail thanks very much,
I did remove more fields based on requirments,
so i have
Out=$1" "$2" "$3" "$5" "$6" "$8" "$11" "OUTTime;
In=$1" "$2" "$3" "$4" "$5" "$6" "$9" "$11" "INTime

Abdelrahman.fathy 11-10-2015 04:37 AM

grail or anybody can you please help me,
I found that the date and time in my file are in local time which is AST (UTC+3)
i need to convert those in my awk script to GMT

Is there are any method to do this.

grail 11-10-2015 10:20 AM

Have you read the relevant page from the manual?

http://www.gnu.org/software/gawk/man...Time-Functions


All times are GMT -5. The time now is 06:05 PM.