LinuxQuestions.org
Old 12-18-2016, 02:26 PM   #1
vincix
Senior Member
 
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240

matching only the first string within quotes with grep


I'm reading sed & awk, Second Edition (1997), where I came across the following.
The file "sampleLine" contains:
Code:
.Se "Appendix" "Full Program Listings"

Here’s a different regular expression that matches the shortest possible extent between two quotation marks:
"[^"]*"

It matches “a quote followed by any number of characters that do not match a quote followed by a quote”:
Code:
$ gres '"[^"]*"' '00' sampleLine
.Se 00 "Full Program Listings"
(For those who don't know, gres is a primitive version of sed; it simply substitutes one string for another. The author uses it to highlight the match, which can also be done with grep --color=auto.)

So I tried it myself:
Code:
grep '"[^"]*' sampleLine
.Se "Appendix" "Full Program Listings"
So the question is, why does it highlight both strings within the quotes? And how can I highlight only the first string in the quotes for any line?

Thanks

Last edited by vincix; 12-18-2016 at 02:28 PM.
 
Old 12-18-2016, 02:45 PM   #2
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,950

Rub your eyes and look again. You missed a character.
 
Old 12-18-2016, 02:50 PM   #3
vincix
Senior Member
 
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240

Original Poster
You're right that I didn't post the correct command, but it still doesn't work using grep '"[^"]*"' sampleLine. It still highlights both quoted strings.
 
Old 12-18-2016, 03:07 PM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,022

Again you have missed the point that it is matching what you have asked for. Try adding some additional characters and it becomes clearer:
Code:
$ echo '.Se "Appendix" aaaa "Full Program Listings" bbbb' | grep '"[^"]*"'
.Se "Appendix" aaaa "Full Program Listings" bbbb
This is because the tool you are using does not stop after finding the first match. For grep you will need to look at the -m switch.
 
1 members found this post helpful.
Old 12-18-2016, 03:57 PM   #5
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,950

grep prints the line if there is a partial match.
For efficiency I expect it to print the line as soon as possible and not try further matches. Showing all matches looks like overhead to me...maybe it does it only if --color is given?
The man pages say -m leaves the file after the first matched line. It does not explicitly say after the first match in that line.
I guess you have to test how --color works...it is not clearly documented.
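
One way to test this without relying on colors is the -o option, which prints each matched span on its own line. The following is only a minimal sketch: it assumes a grep that supports -o (GNU and BSD grep do, but the option is not in POSIX), and it reuses the sample line from the post above.

```shell
#!/bin/sh
# Sample line taken from the earlier post in this thread.
line='.Se "Appendix" aaaa "Full Program Listings" bbbb'

# Without -o, grep prints the whole line once if any part of it matches.
printf '%s\n' "$line" | grep '"[^"]*"'

# With -o, every non-overlapping match is printed on its own line,
# so both quoted strings show up as separate matches.
printf '%s\n' "$line" | grep -o '"[^"]*"'
```

The second command prints two lines, one per quoted string, which suggests the regular expression itself matches one string at a time and grep simply keeps scanning the rest of the line.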
 
Old 12-18-2016, 04:04 PM   #6
vincix
Senior Member
 
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240

Original Poster
--color highlights all matches, not only the first on the line. So it repeats it on the same line until there are no matches.
So, indeed, the difference consists in the tool that you're using. Obviously, sed (in that case gres works like sed - in the book there's a small script which is a primitive sed, basically) will only match the first occurrence if you don't use the global flag.

So I thought there was a way of matching only the first occurrence through the regular expressions themselves, but I see now that it depends on the tool you're using.
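
For illustration, a minimal sketch of that difference, assuming a POSIX sed and the sampleLine contents from the first post (the variable name line is only for this example):

```shell
#!/bin/sh
# Contents of sampleLine as given in the first post.
line='.Se "Appendix" "Full Program Listings"'

# No flag: only the first match on the line is replaced,
# which reproduces the gres output quoted from the book.
printf '%s\n' "$line" | sed 's/"[^"]*"/00/'
# prints: .Se 00 "Full Program Listings"

# g flag: every non-overlapping match on the line is replaced.
printf '%s\n' "$line" | sed 's/"[^"]*"/00/g'
# prints: .Se 00 00
```

The regular expression is identical in both commands; only the flag on the s command decides whether sed stops after the first match.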
 
Old 12-19-2016, 01:39 AM   #7
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,022

My bad there, I often forget that -m is for the line, not the match.

In this particular instance you could get a single match, but you would need to know something about the matching string, i.e. if you said it starts with a quote and a capital 'A', then you would only get the first string:
Code:
$ echo '.Se "Appendix" aaaa "Full Program Listings" bbbb' | grep '"A[^"]*"'
.Se "Appendix" aaaa "Full Program Listings" bbbb
 
1 members found this post helpful.
Old 12-19-2016, 02:06 AM   #8
vincix
Senior Member
 
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240

Original Poster
Thank you for your idea, yes. On the other hand, the initial issue was getting only the first match of the whole expression, and I guess that can be done, as I've already said, with sed. But I've no idea how you could do that if you wanted to extract specifically the second or the third match. I'm sure there's a way, right? Can sed do that? I bet awk can.
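
A sketch of both ideas, under a few assumptions: POSIX sed accepts a number as a flag to the s command to act on only the Nth match, and awk can split the line on the quote character so that the Nth quoted string ends up in field 2N. The awk variant assumes the quotes are balanced and never escaped, and it prints the string without its surrounding quotes.

```shell
#!/bin/sh
line='.Se "Appendix" aaaa "Full Program Listings" bbbb'

# sed: the numeric flag 2 makes s act on the second match only.
printf '%s\n' "$line" | sed 's/"[^"]*"/00/2'
# prints: .Se "Appendix" aaaa 00 bbbb

# awk: with the double quote as field separator, the quoted strings are
# the even-numbered fields, so $4 is the second quoted string.
printf '%s\n' "$line" | awk -F'"' '{ print $4 }'
# prints: Full Program Listings
```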
 
Old 12-19-2016, 02:16 AM   #9
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,461

Perhaps use the -o option to grep, then use sed to select the match (below selects the second match).
Code:
echo '.Se "Appendix" aaaa "Full Program Listings" bbbb' | grep -o '"[^"]*"' | sed -n 2p
 
Old 12-19-2016, 02:20 AM   #10
vincix
Senior Member
 
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240

Original Poster
Well, yes, but I don't think it works for a file whose lines might contain ten matches or none at all, does it?
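
The concern is that grep -o flattens all matches from all lines into one stream, so sed -n 2p picks the second match of the whole input rather than the second match of each line. A possible per-line alternative is sketched below with POSIX awk; the function name nth_quoted is made up for this example and is not a standard tool.

```shell
#!/bin/sh
# nth_quoted N [FILE...]: print the Nth double-quoted string of each line,
# silently skipping lines that have fewer than N of them.
# (Hypothetical helper, written only for this illustration.)
nth_quoted() {
    n=$1
    shift
    awk -v n="$n" '
    {
        rest = $0
        count = 0
        # match() sets RSTART and RLENGTH for the leftmost match in rest.
        while (match(rest, /"[^"]*"/)) {
            count++
            if (count == n) {
                print substr(rest, RSTART, RLENGTH)
                break
            }
            # Drop everything up to and including this match, then retry.
            rest = substr(rest, RSTART + RLENGTH)
        }
    }' "$@"
}

# Three test lines: two matches, no matches, four matches.
printf '%s\n' \
    '.Se "Appendix" "Full Program Listings"' \
    'no quotes on this line' \
    '"a" "b" "c" "d"' | nth_quoted 2
```

With the three test lines above, this prints "Full Program Listings" and "b" (quotes included) and nothing for the line without quotes. Called with a file name, for example nth_quoted 2 sampleLine, it reads that file instead of standard input.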
 
Old 12-19-2016, 02:37 AM   #11
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,267

Use PCRE - much more flexible.
 
Old 12-19-2016, 04:07 AM   #12
vincix
Senior Member
 
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240

Original Poster
Right, I'm currently struggling with sed/awk. There's a long way to learning perl, if ever.
 
Old 12-19-2016, 04:22 AM   #13
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,968

Quote:
Originally Posted by vincix
Well, yes, but I don't think it works for a file whose lines might contain ten matches or none at all, does it?
I think this is something like a "limitation": grep without colors will print the line, and that is fine; with colors, grep will highlight all the occurrences, because it has no idea which one you really need. There is no easy way to select the second or fifth match.
You can use grep -P (that is PCRE, as mentioned by syg00) and you can (try to) construct a regexp which will only be valid for the given match. But that is, I would say, advanced usage of PCRE.
 
Old 12-19-2016, 07:35 AM   #14
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,461

Quote:
Well, yes, but I don't think it works for a file whose lines might contain ten matches or none at all, does it?
True. Are you now convinced you need a new tool?
Quote:
Right, I'm currently struggling with sed/awk. There's a long way to learning perl, if ever.
These are your new tools.
 
Old 12-19-2016, 08:12 AM   #15
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,571
Blog Entries: 4

Quote:
Originally Posted by allend
True. Are you now convinced you need a new tool?
Quote:
Right, I'm currently struggling with sed/awk. There's a long way to learning perl, if ever.
These are your new tools.
Indeed.

grep < sed < awk < perl
 
  

