12-18-2016, 02:26 PM
|
#1
|
Senior Member
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240
Rep:
|
matching only the first string within quotes with grep
I'm reading Sed and Awk Second Edition (1997) where I came across this:
the file "sampleLine" contains:
Code:
.Se "Appendix" "Full Program Listings"
Here’s a different regular expression that matches the shortest possible extent between two quotation marks:
"[^"]*"
It matches “a quote followed by any number of characters that do not match a quote followed by a quote”:
Code:
$ gres '"[^"]*"' '00' sampleLine
.Se 00 "Full Program Listings"
(For those who don't know, gres is a primitive version of sed; it simply substitutes one string for another. The author uses it to highlight the match, which can also be done with grep --color=auto.)
So I tried it myself:
Code:
grep '"[^"]*' sampleLine
.Se "Appendix" "Full Program Listings"
So the question is, why does it highlight both strings within the quotes? And how can I highlight only the first string in the quotes for any line?
Thanks
Last edited by vincix; 12-18-2016 at 02:28 PM.
|
|
|
12-18-2016, 02:45 PM
|
#2
|
Senior Member
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,950
|
Rub your eyes and look again. You missed a character.
|
|
|
12-18-2016, 02:50 PM
|
#3
|
Senior Member
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240
Original Poster
Rep:
|
You're right that I didn't post the correct expression, but it still doesn't work with grep '"[^"]*"' sampleLine. It still matches both quoted strings.
|
|
|
12-18-2016, 03:07 PM
|
#4
|
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,022
|
Again you have missed the point that it is matching what you have asked for. Try adding some additional characters and it becomes clearer:
Code:
$ echo '.Se "Appendix" aaaa "Full Program Listings" bbbb' | grep '"[^"]*"'
.Se "Appendix" aaaa "Full Program Listings" bbbb
This is because the tool you are using does not stop after finding the first match. For grep you will need to look at the -m switch.
|
|
|
12-18-2016, 03:57 PM
|
#5
|
Senior Member
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,950
|
grep prints the line if there is a partial match.
For efficiency I expect it to print the line as soon as possible and not try further matches. Showing all matches looks like overhead to me...maybe it does it only if --color is given?
The man page says -m stops reading the file after the given number of matching lines. It does not say that it stops after the first match within a line.
I guess you have to try how the --color works...it is not clearly documented.
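One way to check the -m behaviour without relying on colour is to combine it with -o, which prints each match on its own line. This is only a sketch, assuming GNU grep and a made-up two-line input:

```shell
# Hypothetical input: the first line has two quoted strings, the second has one.
input='.Se "Appendix" "Full Program Listings"
.Se "Index"'

# -o prints every match on its own line; -m 1 stops after the first matching LINE.
first_line_matches=$(printf '%s\n' "$input" | grep -m 1 -o '"[^"]*"')
printf '%s\n' "$first_line_matches"
```

Both quoted strings from the first line are printed and nothing from the second, which suggests that -m counts matching lines rather than individual matches.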
|
|
|
12-18-2016, 04:04 PM
|
#6
|
Senior Member
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240
Original Poster
Rep:
|
--color highlights all matches, not only the first on the line. So it repeats it on the same line until there are no matches.
So, indeed, the difference lies in the tool you're using. sed (gres works the same way here; in the book it is a small script that is basically a primitive sed) only replaces the first occurrence unless you use the global flag.
So I thought there was a way of matching only the first occurrence through the regular expressions themselves, but I see now that it depends on the tool you're using.
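For comparison, here is a small sketch of the sed behaviour described above, using the sample line from the book. The variable names are only for illustration:

```shell
line='.Se "Appendix" "Full Program Listings"'

# Without the g flag, sed replaces only the first match on the line.
first_only=$(printf '%s\n' "$line" | sed 's/"[^"]*"/00/')

# With the g flag, every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/"[^"]*"/00/g')

printf '%s\n' "$first_only" "$all_matches"
```

The first command gives `.Se 00 "Full Program Listings"`, the same result the book shows for gres, and the second gives `.Se 00 00`.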
|
|
|
12-19-2016, 01:39 AM
|
#7
|
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,022
|
My bad there, I often forget that -m is for the line, not the match.
In this particular instance you could get a single match, but you would need to know something about the matching string, i.e. if you said it starts with a quote and a capital 'A', then you would only get the first string:
Code:
$ echo '.Se "Appendix" aaaa "Full Program Listings" bbbb' | grep '"A[^"]*"'
.Se "Appendix" aaaa "Full Program Listings" bbbb
|
|
|
12-19-2016, 02:06 AM
|
#8
|
Senior Member
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240
Original Poster
Rep:
|
Thank you for your idea, yes. On the other hand, the initial issue was getting the first match of the whole expression, and I guess that can be done, as I've already said, with sed. But I've no idea how you could do that if you wanted to extract specifically the second or the third match. I'm sure there's a way, right? Can sed do that? I bet awk can.
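As far as substitution goes, sed's s command accepts a number as a flag, which selects that occurrence on the line. A minimal sketch, reusing the test line from earlier in the thread:

```shell
line='.Se "Appendix" aaaa "Full Program Listings" bbbb'

# The trailing 2 tells sed to replace only the second match on the line.
second_replaced=$(printf '%s\n' "$line" | sed 's/"[^"]*"/00/2')
printf '%s\n' "$second_replaced"
```

This prints `.Se "Appendix" aaaa 00 bbbb`. Lines with fewer than two matches are left unchanged.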
|
|
|
12-19-2016, 02:16 AM
|
#9
|
LQ 5k Club
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,461
|
Perhaps use the -o option to grep, then use sed to select the match (below selects the second match).
Code:
echo '.Se "Appendix" aaaa "Full Program Listings" bbbb' | grep -o '"[^"]*"' | sed -n 2p
|
|
|
12-19-2016, 02:20 AM
|
#10
|
Senior Member
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240
Original Poster
Rep:
|
Well, yes, but I don't think it works for a file whose lines might contain ten matches or none at all, does it?
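A per-line approach avoids that problem. The sketch below uses awk with the double quote as the field separator; it assumes the quotes on each line are balanced, and the sample lines are made up for illustration:

```shell
# Hypothetical sample: lines with two, zero, and three quoted strings.
sample='.Se "Appendix" "Full Program Listings"
no quotes on this line
"one" "two" "three"'

# With '"' as the field separator, the n-th quoted string is field 2*n.
# NF >= 2*n + 1 means that string also has its closing quote.
nth_quoted=$(printf '%s\n' "$sample" |
  awk -F'"' -v n=2 'NF >= 2 * n + 1 { print "\"" $(2 * n) "\"" }')
printf '%s\n' "$nth_quoted"
```

With n=2 this prints `"Full Program Listings"` and `"two"`, and skips the line that has no quoted strings.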
|
|
|
12-19-2016, 02:37 AM
|
#11
|
LQ Veteran
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,267
|
Use PCRE - much more flexible.
|
|
|
12-19-2016, 04:07 AM
|
#12
|
Senior Member
Registered: Feb 2011
Distribution: Ubuntu, Centos
Posts: 1,240
Original Poster
Rep:
|
Right, I'm currently struggling with sed/awk. There's a long way to learning perl, if ever.
|
|
|
12-19-2016, 04:22 AM
|
#13
|
LQ Addict
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,968
|
Quote:
Originally Posted by vincix
Well, yes, but I don't think it works for a file whose lines might contain ten matches or none at all, does it?
|
I think this is something like a "limitation": grep without colors prints the whole line, and that is fine; with colors, grep highlights all the occurrences, because it has no idea which one you really need. There is no easy way to select the second or fifth match.
You can use grep -P (that is PCRE, as mentioned by syg00) and (try to) construct a regexp which is only valid for the given match. But that is, I would say, advanced usage of PCRE.
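As a rough illustration of that kind of regexp, the sketch below picks out the second quoted string. It assumes GNU grep built with PCRE support (the -P option is not available everywhere), and the pattern is just one possible construction:

```shell
line='.Se "Appendix" aaaa "Full Program Listings" bbbb'

# Skip one complete quoted string from the start of the line, discard what
# has been matched so far with \K, then match the next quoted string.
second_match=$(printf '%s\n' "$line" |
  grep -oP '^(?:[^"]*"[^"]*"){1}[^"]*\K"[^"]*"')
printf '%s\n' "$second_match"
```

Changing `{1}` to `{2}` would select the third quoted string instead, and so on.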
|
|
|
12-19-2016, 07:35 AM
|
#14
|
LQ 5k Club
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,461
|
Quote:
Well, yes, but I don't think it works for a file whose lines might contain ten matches or none at all, does it?
|
True. Are you now convinced you need a new tool?
Quote:
Right, I'm currently struggling with sed/awk. There's a long way to learning perl, if ever.
|
These are your new tools.
|
|
|
12-19-2016, 08:12 AM
|
#15
|
LQ Guru
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,571
|
Quote:
Originally Posted by allend
True. Are you now convinced you need a new tool?
Quote:
Right, I'm currently struggling with sed/awk. There's a long way to learning perl, if ever.
|
These are your new tools.
|
Indeed.
grep < sed < awk < perl
|
|
|
All times are GMT -5. The time now is 07:22 AM.
|