LinuxQuestions.org
Old 09-19-2011, 05:31 AM   #1
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Rep: Reputation: Disabled
exact match in awk


Dear Experts,

I have a file containing the numbers "78" and "178" at unknown positions in the text.

I want to match the lines containing "78" and do something to the strings in those lines.

When I use
Code:
awk 'match($0, "78"){action}' FILENAME
the action is always applied to lines containing either "78" or "178".

I cannot use a pattern like
Code:
$Field == "78"
for the search, because the position of "78" is not known in advance.

So, how can I match exactly "78" but not "178" with "match"?

Any help would be greatly appreciated. Thanks a lot for your time!

Last edited by cristalp; 09-19-2011 at 05:46 AM.
 
Old 09-19-2011, 05:46 AM   #2
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
You can try word boundaries. In GNU awk (gawk) they are specified by the \y operator, and you need to use a regular expression (enclosed in slashes) instead of a string constant:
Code:
awk 'match($0, /\y78\y/)' file
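Note that \y is a GNU awk (gawk) extension, so other awk implementations may not understand it. A portable alternative is to spell out the boundary with character classes: 78 must be preceded by the start of the line or a non-digit, and followed by a non-digit or the end of the line. This is only a minimal sketch; the sample text is made up for illustration and "exact" is assumed to mean "not part of a longer number".

```shell
# Made-up sample: a standalone 78, then 178, then 785.
sample='alpha 78 beta
gamma 178 delta
785 epsilon'

# (^|[^0-9]) : start of line, or a non-digit, before 78
# ([^0-9]|$) : a non-digit, or end of line, after 78
result=$(printf '%s\n' "$sample" | awk '/(^|[^0-9])78([^0-9]|$)/')
printf '%s\n' "$result"
```

Only the first sample line should be printed, since 178 and 785 both have a digit directly next to the 78.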
 
Old 09-19-2011, 06:04 AM   #3
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by colucix
You can try word boundaries. In awk they are specified by the \y operator and you need to use a regular expression (enclosed in slashes) instead of a string constant:
Code:
awk 'match($0, /\y78\y/)' file
Thanks for the help. I tried your code and added a print action, but I get no output. I do not know what is wrong. Thanks anyway.
 
Old 09-19-2011, 06:28 AM   #4
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Original Poster
Rep: Reputation: Disabled
OK, I found a solution myself:

Code:
awk 'match($0, /^78/) {action}' FILENAME
or
Code:
awk '/^78/ {action}' FILENAME
I ran several tests and it seems to work.

But now I want to incorporate this code into a bash script, where 78 will be replaced by a variable, say x.

Then I tried
Code:
x=78
awk -v y=$x 'match($0, /^y/) {action}' FILE
and
Code:
x=78
awk -v y=$x '/^y/ {action}' FILE
and even just:
Code:
awk -v y=$x '/y/ {action}' FILE
None of them works. The reason seems to be that inside the "/" delimiters awk no longer recognizes the variable. So how can I avoid the "/" problem and use a variable in an exact match?

Sorry for the additional questions. I really hope to solve this completely and make things clear. Any help would be appreciated!
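On the "/" problem: inside /.../ awk treats y as the literal letter y, not as the variable. A dynamic regex, built as a string and applied with the ~ operator, does use the variable's value. The following is a minimal sketch, assuming x holds only digits (regex metacharacters in x would need escaping); the sample text and the name re are made up for illustration.

```shell
x=78

# Made-up sample lines.
sample='tP 77 P4 2 78 P4 3
hP 177 P622 178 P6 1 22
cP 785 X'

result=$(printf '%s\n' "$sample" | awk -v y="$x" '
    # Build the pattern once: y must not touch a digit on either side.
    BEGIN { re = "(^|[^0-9])" y "([^0-9]|$)" }
    $0 ~ re { print }
')
printf '%s\n' "$result"
```

Passing the value with -v y="$x" (quoted) keeps the awk program itself in single quotes, so the shell never has to be spliced into the awk code.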

Last edited by cristalp; 09-19-2011 at 06:32 AM.
 
Old 09-19-2011, 08:13 AM   #5
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
Try to keep $x outside single quotes:
Code:
awk 'match($0, /^'$x'/)' FILE
 
Old 09-19-2011, 09:48 AM   #6
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,692

Rep: Reputation: 1987
I find it interesting that in your first post you say the position of 78 is unknown, yet your solution assumes it is at the start of the line. I would consider that knowing the position.

Also, have you considered (or maybe you do not need to) what happens if you have 785 at the start of a line?

Lastly, match() is not required, as you do not use any data from the match, so a computed regex will do fine:
Code:
awk -v y=$x '$0 ~ "^"y{action}' FILE
Or, if the first field is exactly 78 and is followed by your delimiter (the example below assumes the default FS):
Code:
awk -v y=$x '$1 == y{action}' FILE
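If the number can sit in any field rather than only the first, the same comparison can be looped over all fields. This is a sketch only, assuming the numbers are separated from their neighbours by whitespace (default FS); the sample lines are made up. Because both sides look numeric, awk compares them as numbers, so a field such as 078 would also match.

```shell
x=78

# Made-up sample: 78 at the start, in the middle, and at the end of a line.
sample='78 at start
middle 78 here
ends with 78
178 elsewhere
x 785 y'

result=$(printf '%s\n' "$sample" | awk -v y="$x" '
    {
        # Check each whitespace-separated field for an exact match.
        for (i = 1; i <= NF; i++)
            if ($i == y) { print; next }
    }
')
printf '%s\n' "$result"
```

The first three sample lines should be printed; the lines with 178 and 785 are skipped because no whole field equals 78.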
 
Old 09-19-2011, 10:03 AM   #7
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Original Poster
Rep: Reputation: Disabled
perfect answer

Quote:
Originally Posted by grail
So I find it interesting in your first post you say the position of 78 is unknown and yet the solution is to say it is at the start of the line??? I would consider this
knowing the position.

Also have you considered (or maybe you do not need to) what happens if you have 785 at the start of a line?

Lastly, match is not required as you save no data of the match so a computed regex will do fine:
Code:
awk -v y=$x '$0 ~ "^"y{action}' FILE
Or maybe if it is only 78 and your delimiter is next (below assumes default FS):
Code:
awk -v y=$x '$1 == y{action}' FILE
Hi grail, thanks for your detailed answer and kind help! No, 78 is not at the start of the line. In some files it is in the middle; in others it might be at the start or end. I need a general way to locate the lines that include "78".

785 might also appear, so I need to avoid it as well. I only need lines that contain "78"; they may contain any other characters, which I do not care about, as long as "78" is there. That is why I need an EXACT match.

Thanks a lot for your code. The first one is exactly what I am looking for! It works like a champ and fits the whole script perfectly, so I do not have to change anything else. Thanks a lot!
 
Old 09-19-2011, 10:16 AM   #8
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,692

Rep: Reputation: 1987
I am happy you have a solution, but you have contradicted yourself in your answer. If my solutions work, then 78 must be at the start of a line. They will not find 78 anywhere else, as in:
Quote:
No, 78 is not at the start of the line. In some files it is in the middle; in others it might be at the start or end.
Only for the files where 78 is at the start of the line can the solution work. It also does not avoid the problem of 785.

Good luck.
 
Old 09-19-2011, 10:30 AM   #9
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by grail
I am happy you have a solution but you have contradicted yourself in the answer. If my solutions work then 78 must be at the start of a line. It will not find 78
anywhere else such as:

Only for the files where 78 is at the start of the line can the solution work. It also does not avoid the problem of 785.

Good luck.
Oops, yes. I made a mistake! I ran it on my test file, not the real one I am going to work on. 78 is at the beginning of the line in that test file. I was just too hasty in drawing the conclusion.

But thanks anyway, I learnt a lot from you.

Last edited by cristalp; 09-19-2011 at 10:31 AM.
 
Old 09-19-2011, 10:40 AM   #10
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
I wonder why
Code:
awk '/\y78\y/' FILE
or
Code:
x=78
awk '/\y'$x'\y/' FILE
don't work. Please, can you post some lines of the real file to let us test the suggested solutions?
 
Old 09-19-2011, 10:41 AM   #11
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Original Poster
Rep: Reputation: Disabled
So, still unsolved.

If 78 is not at the beginning of the line, nothing works.

Code:
x=78
awk -v y=$x '/^y/ {action}' FILE
does not work either.

So I still need an answer for the EXACT match. Sorry for the useless and misleading replies I have posted. My fault.
If anyone still has an idea on this topic, please post a comment. Any answer would be appreciated. Sorry again for taking your time, and thanks all the same.
 
Old 09-19-2011, 10:44 AM   #12
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
I would stick with the word-boundary regexp solution and try to understand why it didn't work for you. Which version of awk are you running and on which *nix OS?
 
Old 09-19-2011, 10:47 AM   #13
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Original Poster
Rep: Reputation: Disabled
The real file looks like:
Code:
P       1 P1 
  mP       3 P2   4 P2 1  
 mC mI     5 C2 
  oP       16 P222   17 P222 1    18 P2 1 2 1 2   19 P2 1 2 1 2 1  
  oC       21 C222   20 C222 1  
  oF       22 F222 
  oI       23 I222   24 I2 1 2 1 2 1  
  tP       75 P4   76 P4 1    77 P4 2    78 P4 3    89 P422   90 P42 1 2 
           91 P4 1 22   92 P4 1 2 1 2   93 P4 2 22   94 P4 2 2 1 2 
           95 P4 3 22   96 P4 3 2 1 2 
  tI       79 I4   80 I4 1    97 I422   98 I4 1 22 
  hP       143 P3   144 P3 1    145 P3 2    149 P312   150 P321   151 P3 1 12 
           152 P3 1 21   153 P3 2 12   154 P3 2 21   168 P6   169 P6 1  
           170 P6 5    171 P6 2    172 P6 4    173 P6 3    177 P622 
           178 P6 1 22   179 P6 5 22   180 P6 2 22   181 P6 4 22   182 P6 3 22 
  hR       146 R3   155 R32 
  cP       195 P23   198 P2 1 3   207 P432   208 P4 2 32   212 P4 3 32 
           213 P4 1 32
I hope this helps to illustrate things more clearly.
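For a file in this layout, a small shell function can wrap the digit-boundary match so that 78 and 178 are looked up separately. This is a sketch, not a tested solution for the full file: the helper name lines_with_number is hypothetical, the temporary file holds only four lines copied from the listing above, and the number is assumed to consist of digits only.

```shell
# Hypothetical helper: print the lines of file $2 that contain $1 as a whole number.
lines_with_number() {
    awk -v y="$1" '$0 ~ ("(^|[^0-9])" y "([^0-9]|$)")' "$2"
}

# Four lines taken from the posted file, written to a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
  tP       75 P4   76 P4 1    77 P4 2    78 P4 3    89 P422   90 P42 1 2
  tI       79 I4   80 I4 1    97 I422   98 I4 1 22
           170 P6 5    171 P6 2    172 P6 4    173 P6 3    177 P622
           178 P6 1 22   179 P6 5 22   180 P6 2 22   181 P6 4 22   182 P6 3 22
EOF

hits_78=$(lines_with_number 78 "$data")
hits_178=$(lines_with_number 178 "$data")
printf '78:\n%s\n178:\n%s\n' "$hits_78" "$hits_178"
rm -f "$data"
```

With this excerpt, the lookup for 78 should return only the tP line and the lookup for 178 only the line starting with 178.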

Last edited by cristalp; 09-19-2011 at 10:53 AM.
 
Old 09-19-2011, 10:52 AM   #14
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by colucix
I would stick with the regexp with word boundary solution, trying to understand why they didn't work for you. Which version of awk are you running and on which *nix OS?
Thanks for your answer and kind suggestion. I am now trying to understand why.
I use Ubuntu 10.04. The awk is, I believe, mawk. Do you have any further comments?

Thanks again!
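Since \y is a gawk extension and mawk, as far as I know, does not implement it, a quick probe can show whether the installed awk understands it. This is only a sketch: the variable name supports_y is made up, and the probe assumes that an awk without \y support simply fails to match rather than aborting.

```shell
# Feed one line with a standalone 78 to awk; the exit status is 0 only if
# the \y78\y pattern matched, i.e. if \y was treated as a word boundary.
if printf 'a 78 b\n' | awk '/\y78\y/ { found = 1 } END { exit !found }' 2>/dev/null; then
    supports_y=yes
else
    supports_y=no
fi
printf '%s %s\n' 'gawk-style \y word boundary supported:' "$supports_y"
```

If the answer is no, either install gawk and call it explicitly, or use a pattern that does not rely on \y.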
 
Old 09-19-2011, 10:55 AM   #15
cristalp
Member
 
Registered: Aug 2011
Distribution: Linux Mint
Posts: 103

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by colucix
I wonder why
Code:
awk '/\y78\y/' FILE
or
Code:
x=78
awk '/\y'$x'\y/' FILE
don't work. Please, can you post some lines of the real file to let us test the suggested solutions?
I tried both of them on my real file again. Neither of them works. Even on my test file, in which 78 and 178 are at the start of the line, they do not work. That is what I have confirmed so far.
 
  

