LinuxQuestions.org
Old 03-03-2017, 12:44 PM   #1
fanoflq
Member
 
Registered: Nov 2015
Posts: 397

Rep: Reputation: Disabled
AWK(ward) pattern matching problem?


Code:
#works as expected
 ~/dir $ awk '{ print  $0}' inventory-shipped                                                                    
Jan 13 25 15 115
Feb 15 32 24 226
Mar 15 24 34 228
Apr 31 52 63 420
May 16 34 29 208
Jun 31 42 75 492 lastCol
Jul 24 34 67 436
Aug 15 34 47 316
Sep 13 55 37 277
Oct 29 54 68 525
Nov 20 87 82 577
Dec 17 35 61 401

# Match any word that has a "J". Works as expected
 ~/dir $ awk '/J/ { print  $0}' inventory-shipped 
Jan 13 25 15 115
Jun 31 42 75 492 lastCol
Jul 24 34 67 436

 ~/dir $ awk '/J.*/ { print  $0}' inventory-shipped 
Jan 13 25 15 115
Jun 31 42 75 492 lastCol
Jul 24 34 67 436

# /J*/ matches any word/line containing J, JJ, JJJ, JJJJ, ...  
#Does NOT work as expected.
#Incorrect output: 
#There should be ONLY outputs for lines with J, JJ, JJJ, JJJJ, ... 
 ~/dir $ awk '/J*/ { print  $0}' inventory-shipped 
Jan 13 25 15 115
Feb 15 32 24 226
Mar 15 24 34 228
Apr 31 52 63 420
May 16 34 29 208
Jun 31 42 75 492 lastCol
Jul 24 34 67 436
Aug 15 34 47 316
Sep 13 55 37 277
Oct 29 54 68 525
Nov 20 87 82 577
Dec 17 35 61 401
What did I miss?
 
Old 03-03-2017, 01:01 PM   #2
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 6,897
Blog Entries: 3

Rep: Reputation: 3584
J* matches zero or more J's, so it matches every line, even one with no J in it. See

Code:
man 7 regex
and then scroll down to "An atom followed by '*' matches a sequence of 0 or more matches of the atom."

Maybe you mean a plus + instead.

Code:
awk '/J+/' inventory-shipped
By the way, you can leave off print $0, since printing the whole line is awk's default action. There are a lot of shortcuts like that in awk.
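
To see the difference between * and + side by side, here is a minimal sketch. It assumes a cut-down, three-line copy of inventory-shipped in a temporary file; the variable names sample, star_count and plus_count are made up for the illustration.

```shell
# Build a small sample file; the three lines are copied from the data in post #1.
sample=$(mktemp)
printf '%s\n' 'Jan 13 25 15 115' 'Feb 15 32 24 226' 'Jun 31 42 75 492 lastCol' > "$sample"

# /J*/ can match the empty string, so every line is selected.
star_count=$(awk '/J*/ { n++ } END { print n+0 }' "$sample")

# /J+/ needs at least one J, so the Feb line is skipped.
plus_count=$(awk '/J+/ { n++ } END { print n+0 }' "$sample")

echo "J* matched $star_count lines, J+ matched $plus_count lines"
rm -f "$sample"
```

With this sample, J* should select all three lines and J+ only the Jan and Jun lines.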
 
Old 03-03-2017, 01:02 PM   #3
fanoflq
Member
 
Registered: Nov 2015
Posts: 397

Original Poster
Rep: Reputation: Disabled
Thank you.
 
Old 03-03-2017, 01:06 PM   #4
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 6,897
Blog Entries: 3

Rep: Reputation: 3584
No worries. Also, you can anchor the search to the beginning of the line with a caret ^ at the start of the pattern:

Code:
awk '/^J+/' inventory-shipped
That will match

Code:
Jan 13 25 15 115
Jun 31 42 75 492 lastCol
Jul 24 34 67 436
but not

Code:
aJn 13 25 15 115
uJn 31 42 75 492 lastCo
Aug 24 34 67 436 J
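
A quick way to check the anchoring yourself is sketched below. It feeds the six lines from the two listings above through awk with and without the caret; the variable names lines, unanchored and anchored are only for the illustration.

```shell
# The six lines from the two listings above, one per line.
lines='Jan 13 25 15 115
Jun 31 42 75 492 lastCol
Jul 24 34 67 436
aJn 13 25 15 115
uJn 31 42 75 492 lastCo
Aug 24 34 67 436 J'

# Unanchored: a J anywhere on the line is enough, so all six lines match.
unanchored=$(printf '%s\n' "$lines" | awk '/J+/ { n++ } END { print n+0 }')

# Anchored: the J has to be the first character, so only Jan, Jun and Jul match.
anchored=$(printf '%s\n' "$lines" | awk '/^J+/ { n++ } END { print n+0 }')

echo "unanchored: $unanchored, anchored: $anchored"
```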
 
Old 03-03-2017, 01:11 PM   #5
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1693
Quote:
An atom followed by '*' matches a sequence of 0 or more matches of the atom.
I must be reading this wrong.

If I have a string equal to: 1111111111111111111111111

And I search for '/A*/' it will not 'match' and will not exit as successful. Even though there are 0 occurrences. So which is it? Does '/A*/' successfully match when there are no matches (0) or not?
 
Old 03-03-2017, 01:17 PM   #6
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 6,897
Blog Entries: 3

Rep: Reputation: 3584
Quote:
Originally Posted by szboardstretcher View Post
I must be reading this wrong.

If I have a string equal to: 1111111111111111111111111

And I search for '/A*/' it will not 'match' and will not exit as successful. Even though there are 0 occurrences. So which is it? Does '/A*/' successfully match when there are no matches (0) or not?
Which versions of awk are you using? It should match all the lines. It works for me on GNU Awk, OpenBSD's Awk, and Mawk, just to check three.
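
For anyone who wants to try it, a small sketch of that test follows. It uses the string of 1's from the quoted post; the variable name star is just for the illustration.

```shell
# The line contains no A at all, yet /A*/ still matches it,
# because "zero A's" matches the empty string at the start of the line.
star=$(echo 1111111111111111111111111 | awk '/A*/ { print "match" }')

echo "A*: ${star:-no match}"
```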
 
Old 03-03-2017, 01:26 PM   #7
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1693
My bad: my eyes didn't detect that I had '/AA*/' and not '/A*/'.

I blame the writers of awk of course.
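
For the record, the reason '/AA*/' behaves differently is that it means one A followed by zero or more A's, which is the same thing as '/A+/'. A minimal sketch, with the variable names digits, aa_star, a_plus and with_a made up for the illustration:

```shell
digits=1111111111111111111111111

# /AA*/ requires one literal A first, so a line of 1's cannot match.
aa_star=$(echo "$digits" | awk '/AA*/ { print "match" }')

# /A+/ has the same requirement and gives the same result.
a_plus=$(echo "$digits" | awk '/A+/ { print "match" }')

# As soon as the line contains an A, /AA*/ matches.
with_a=$(echo "${digits}A" | awk '/AA*/ { print "match" }')

echo "AA*: ${aa_star:-no match}, A+: ${a_plus:-no match}, AA* with an A present: ${with_a:-no match}"
```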
 
  

