LinuxQuestions.org

phpshell 04-03-2013 07:28 AM

help on awk contain between two columns
 
file.txt
AD615 13J04084706 53H04094706 0-000-0000 HUWSAFRA-615-J:1-0-13-53
ANOZM 15A01011902 01H01011902 0-000-0000 HUWM219-00-209_ANOZM:1-0-15-1
MDM1G 09A01010812 27H01010812 0-000-0000 HUWM301-00-411_MDM1G:1-0-9-27
ABAH9 15A07010408 06S02010308 0-000-0000 MRGDNBKAA_ABAH9-A:1-1-15-6
AD417 08H11114504 20Z06044504 0-000-0000 IP417SULT-H:1-1-6-20
AD106 05P02123801 09Z02133801 0-000-0000 IP106ULYA-P:1-1-3-9


The output should be:
ANOZM 15A01011902 01H01011902 0-000-0000 HUWM219-00-209_ANOZM:1-0-15-1
MDM1G 09A01010812 27H01010812 0-000-0000 HUWM301-00-411_MDM1G:1-0-9-27
ABAH9 15A07010408 06S02010308 0-000-0000 MRGDNBKAA_ABAH9-A:1-1-15-6


Take $1 as the key, then search for it within $5.

These $1 values:
ANOZM
MDM1G
ABAH9


are contained in these $5 values:
HUWM219-00-209_ANOZM:1-0-15-1
HUWM301-00-411_MDM1G:1-0-9-27
MRGDNBKAA_ABAH9-A:1-1-15-6


awk '$1~/$5/ {print $0}' file.txt

but it does not give me the output I need.
Can anybody help me?

danielbmartin 04-03-2013 08:06 AM

Try this ...
Code:

awk '$5 ~ $1' "$InFile" > "$OutFile"
Daniel B. Martin
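
A note on why the first attempt printed nothing, as far as I can tell: inside /.../ awk does not expand $5, so /$5/ is a literal regular expression (an end-of-line anchor followed by the character 5), which cannot match here, and the operands were also the wrong way round. Writing $5 ~ $1 uses the contents of $1 as a dynamic regex instead. The sketch below recreates file.txt from the first post in a temporary directory (the directory and the name out.txt are assumptions made for illustration) and uses index(), a plain substring test, which behaves the same on this data and would not misfire if a key ever contained regex metacharacters.

```shell
#!/bin/sh
# Work in a scratch directory (assumption: mktemp is available).
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Sample data copied from the first post.
cat > file.txt <<'EOF'
AD615 13J04084706 53H04094706 0-000-0000 HUWSAFRA-615-J:1-0-13-53
ANOZM 15A01011902 01H01011902 0-000-0000 HUWM219-00-209_ANOZM:1-0-15-1
MDM1G 09A01010812 27H01010812 0-000-0000 HUWM301-00-411_MDM1G:1-0-9-27
ABAH9 15A07010408 06S02010308 0-000-0000 MRGDNBKAA_ABAH9-A:1-1-15-6
AD417 08H11114504 20Z06044504 0-000-0000 IP417SULT-H:1-1-6-20
AD106 05P02123801 09Z02133801 0-000-0000 IP106ULYA-P:1-1-3-9
EOF

# Print every line whose 5th field contains the 1st field as a literal substring.
# index() returns 0 (false) when there is no match, so it works as a pattern.
awk 'index($5, $1)' file.txt > out.txt
cat out.txt
```

On the sample data this should keep only the ANOZM, MDM1G and ABAH9 lines.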

phpshell 04-03-2013 12:08 PM

Thanks Daniel

phpshell 04-16-2013 02:18 AM

Another question along the same lines:
if I have two files and need to compare them based on contained text, how can I do that? For example:

file 1

13J04084706 53H04094706 0-000-0000 HUWSAFRA-615-J:1-0-13-53
15A01011902 01H01011902 0-000-0000 HUWM219-00-209_ANOZM:1-0-15-1
09A01010812 27H01010812 0-000-0000 HUWM301-00-411_MDM1G:1-0-9-27
15A07010408 06S02010308 0-000-0000 MRGDNBKAA_ABAH9-A:1-1-15-6
08H11114504 20Z06044504 0-000-0000 IP417SULT-H:1-1-6-20
05P02123801 09Z02133801 0-000-0000 IP106ULYA-P:1-1-3-9


file 2
ANOZM
MDM1G
ABAH9


The output should look like this:
ANOZM 15A01011902 01H01011902 0-000-0000 HUWM219-00-209_ANOZM:1-0-15-1
MDM1G 09A01010812 27H01010812 0-000-0000 HUWM301-00-411_MDM1G:1-0-9-27
ABAH9 15A07010408 06S02010308 0-000-0000 MRGDNBKAA_ABAH9-A:1-1-15-6

druuna 04-16-2013 02:44 AM

Have a look at this:
Code:

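#!/bin/bash

awk '
BEGIN {
  # Before file1 is read, load every entry of file2 into the array _.
  while ( ( getline < "file2" ) > 0 )
    { _[$1] = $1 }
}
{
  # For each line of file1, print the entry and the line when the line matches it.
  for ( item in _ )
    if ( $0 ~ _[item] ) { print _[item], $0 }
}' file1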

Example run with input shown in post #4:
Code:

./awk.phpshell.sh
ANOZM 15A01011902 01H01011902 0-000-0000 HUWM219-00-209_ANOZM:1-0-15-1
MDM1G 09A01010812 27H01010812 0-000-0000 HUWM301-00-411_MDM1G:1-0-9-27
ABAH9 15A07010408 06S02010308 0-000-0000 MRGDNBKAA_ABAH9-A:1-1-15-6


danielbmartin 04-16-2013 07:12 AM

Quote:

Originally Posted by phpshell (Post 4932235)
Another question along the same lines:
if I have two files and need to compare them based on contained text, how can I do that?


Try this ...
Code:

grep -Ff "$InFile2" "$InFile1" > "$OutFile"
Daniel B. Martin

phpshell 04-16-2013 07:39 AM

Quote:

Originally Posted by druuna (Post 4932244)
Have a look at this:

Thanks for your help,
but this code is very difficult for me to follow. Is there any other way?


Dear Daniel,
I tried grep -f but it shows me these messages:

grep: Memory exhausted
grep: memory exhausted: Cannot allocate memory

druuna 04-16-2013 07:53 AM

Quote:

Originally Posted by phpshell
Thank you, but this code is very difficult for me to follow.

Here's an explanation of the code:
Code:

awk '
BEGIN {
  while ( ( getline < "file2" ) > 0 )
    { _[$1] = $1 }
}

{
for ( item in _ )
    if ( $0 ~ _[item] ) { print _[item], $0}
}
' file1

The BEGIN block runs once, when awk starts. It reads the entries found in file2 into an array (the array name is _).
Once file2 has been processed, awk starts reading file1, one line at a time.

Each line from file1 is checked against the array entries; if one matches, awk prints both the array entry and the line from file1.
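
To trace the script step by step, it may help to run it against the sample data. The following is only a sketch: it recreates file1 and file2 from the posts above in a temporary directory (an assumption for illustration), renames the array from _ to keys for readability, and reads each entry into a variable called line (both names are my choice, not part of the original script).

```shell
#!/bin/sh
# Work in a scratch directory (assumption: mktemp is available).
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# file1 and file2 as shown in the second question.
cat > file1 <<'EOF'
13J04084706 53H04094706 0-000-0000 HUWSAFRA-615-J:1-0-13-53
15A01011902 01H01011902 0-000-0000 HUWM219-00-209_ANOZM:1-0-15-1
09A01010812 27H01010812 0-000-0000 HUWM301-00-411_MDM1G:1-0-9-27
15A07010408 06S02010308 0-000-0000 MRGDNBKAA_ABAH9-A:1-1-15-6
08H11114504 20Z06044504 0-000-0000 IP417SULT-H:1-1-6-20
05P02123801 09Z02133801 0-000-0000 IP106ULYA-P:1-1-3-9
EOF
printf '%s\n' ANOZM MDM1G ABAH9 > file2

awk '
BEGIN {
  # Runs once, before any line of file1 is read: load every entry of file2.
  while ((getline line < "file2") > 0)
    keys[line] = line
  close("file2")
}
{
  # Runs for every line of file1: test the line against each stored entry.
  for (k in keys)
    if ($0 ~ keys[k]) print keys[k], $0
}' file1 > out.txt
cat out.txt
```

With the sample files this should print the three lines requested in the second question, each prefixed with its matching entry.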

Quote:

Originally Posted by phpshell
Is there any other way?

danielbmartin's edited solution works on my side.

But I'm only testing with the examples you gave. Looking at the error you posted, I can only assume that your real files are much bigger and may be causing a memory problem. That's an assumption on my side, though; we won't know for sure until you give some extra info.

danielbmartin 04-16-2013 09:40 AM

Quote:

Originally Posted by phpshell (Post 4932377)
Dear Daniel,
I tried grep -f but it shows me these messages:

grep: Memory exhausted
grep: memory exhausted: Cannot allocate memory

In your small example files every match is preceded by an underscore and followed by a colon. Will this always be true?

Daniel B. Martin
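
If the entries always appear as a whole token inside the last field, the loop over every entry could be replaced by one array lookup per token, which might scale better on large files. This is only a sketch under that assumption, which the thread does not confirm. In the sample, ABAH9 is followed by -A: rather than directly by a colon, so the sketch splits the last field on _, - and : instead of relying on the underscore/colon pattern alone. The temporary directory and the names keys, tok and out.txt are hypothetical.

```shell
#!/bin/sh
# Work in a scratch directory (assumption: mktemp is available).
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# file1 and file2 as shown in the second question.
cat > file1 <<'EOF'
13J04084706 53H04094706 0-000-0000 HUWSAFRA-615-J:1-0-13-53
15A01011902 01H01011902 0-000-0000 HUWM219-00-209_ANOZM:1-0-15-1
09A01010812 27H01010812 0-000-0000 HUWM301-00-411_MDM1G:1-0-9-27
15A07010408 06S02010308 0-000-0000 MRGDNBKAA_ABAH9-A:1-1-15-6
08H11114504 20Z06044504 0-000-0000 IP417SULT-H:1-1-6-20
05P02123801 09Z02133801 0-000-0000 IP106ULYA-P:1-1-3-9
EOF
printf '%s\n' ANOZM MDM1G ABAH9 > file2

awk '
NR == FNR { keys[$1]; next }      # first file (file2): remember each entry
{
  # Second file (file1): split the last field into tokens on _, - and :
  n = split($NF, tok, /[_:-]/)
  for (i = 1; i <= n; i++)
    if (tok[i] in keys) {         # one array lookup per token, no regex match
      print tok[i], $0
      break
    }
}' file2 file1 > out.txt
cat out.txt
```

The work per line then depends on the number of tokens in the last field rather than on the number of entries in file2.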

grail 04-16-2013 10:42 AM

You can even make the 'if' a little simpler if you check against the index:
Code:

awk 'FNR==NR{_[$0];next}{for(i in _)if($NF ~ i)print i,$0}' file2 file1
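
Spelled out over several lines, the one-liner reads roughly as follows. The comments are my reading of it, the array is renamed from _ to keys for readability, and the sample files are recreated in a temporary directory; all of these are assumptions for illustration, not part of the original post.

```shell
#!/bin/sh
# Work in a scratch directory (assumption: mktemp is available).
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# file1 and file2 as shown in the second question.
cat > file1 <<'EOF'
13J04084706 53H04094706 0-000-0000 HUWSAFRA-615-J:1-0-13-53
15A01011902 01H01011902 0-000-0000 HUWM219-00-209_ANOZM:1-0-15-1
09A01010812 27H01010812 0-000-0000 HUWM301-00-411_MDM1G:1-0-9-27
15A07010408 06S02010308 0-000-0000 MRGDNBKAA_ABAH9-A:1-1-15-6
08H11114504 20Z06044504 0-000-0000 IP417SULT-H:1-1-6-20
05P02123801 09Z02133801 0-000-0000 IP106ULYA-P:1-1-3-9
EOF
printf '%s\n' ANOZM MDM1G ABAH9 > file2

awk '
FNR == NR {      # true only while the first file on the command line is read
  keys[$0]       # store the line as an array index; no value is needed
  next           # skip the block below for lines of file2
}
{
  for (k in keys)              # for each line of file1, try every stored index
    if ($NF ~ k) print k, $0   # index used as a regex against the last field
}' file2 file1 > out.txt
cat out.txt
```

Because file2 is passed as the first file argument, no getline loop is needed; FNR (the line number within the current file) equals NR (the overall line number) only while that first file is being read.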

phpshell 04-17-2013 12:33 AM

I really appreciate the support from all of you.


Marked this thread as solved :)


All times are GMT -5.