printing pattern match and not whole line that matches pattern
Hi all.
I've been jumping between the manuals of grep, awk and sed to find a way to print the match of a pattern.
Grep seems able to print the entire line that matches the regular expression, but I want to print only the string that matches the regular expression. I could not find anything in awk or sed manuals.
For example, I have an HTML file that has many links in it. I want to write the locations of the links to a plain text file, so I need a regular expression similar to the following:
Code:
href="[^"\r\n]*"
that matches the whole href attribute, including everything between its quotes.
I could output this to a file and then remove the href part.
$ echo '<A HREF="xdpyinfo.1.html">xdpyinfo(1)</A>' | sed 's/.*HREF="\(.*\)".*/\1/'
xdpyinfo.1.html
Exactly what I was looking for :-) The only problem: sed prints every line, including the ones that don't match. Using the -n option suppresses everything. How can I solve this?
grep -o is nice, but it doesn't offer the flexibility of \( \), which lets you match something bigger but print only part of it.
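For comparison, the two-step idea from the original question (print the match, then strip the href part) could be sketched with grep -o followed by sed. This is only a minimal sketch: extract_hrefs is a made-up helper name, -o is a GNU/BSD grep option rather than a POSIX one, and the sample input is borrowed from the echo example above.

```shell
# Hypothetical helper: print only the href/HREF values found on stdin.
extract_hrefs() {
    # Step 1: grep -o prints each match on its own line (-i ignores case).
    # Step 2: sed removes everything up to the opening quote, then the closing quote.
    grep -io 'href="[^"]*"' | sed 's/^[^"]*"//; s/"$//'
}

printf '%s\n' 'a line' '<A HREF="xdpyinfo.1.html">xdpyinfo(1)</A>' | extract_hrefs
```

On that input it should print xdpyinfo.1.html and skip the non-matching line. The cost is a second process to trim the match, which is exactly what \( \) in sed avoids.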
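The sed command used is just a search and replace, and it is indeed applied to every line in the file; lines that don't match pass through unchanged.
It's not entirely clear to me what you want to match and what you do not want to match, but the following example should get you going again. It combines -n with the p flag on the s command, so only lines where the substitution succeeds are printed: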
Code:
$ cat sed.infile
a line
another line
<A HREF="xdpyinfo.0.html">xdpyinfo(0)</A>
<A HREF="xdpyinfo.1.html">xdpyinfo(1)</A>
line in the middle
<A HREF="xdpyinfo.2.html">xdpyinfo(2)</A>
<A HREF="xdpyinfo.3.html">xdpyinfo(3)</A>
last line
$ sed -n '/xdpyinfo/s/.*HREF="\(.*\)".*/\1/p' sed.infile
xdpyinfo.0.html
xdpyinfo.1.html
xdpyinfo.2.html
xdpyinfo.3.html
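One caveat with \(.*\) is that it is greedy. It works on the lines above because each has only one pair of quotes, but it would probably over-match on a tag with more attributes. A small sketch of the difference, assuming a made-up line with an extra TITLE attribute (not from the original file):

```shell
# Hypothetical input: the TITLE attribute is added here for illustration only.
line='<A HREF="xdpyinfo.1.html" TITLE="display info">xdpyinfo(1)</A>'

# Greedy \(.*\) runs up to the LAST double quote on the line.
greedy=$(printf '%s\n' "$line" | sed -n 's/.*HREF="\(.*\)".*/\1/p')

# [^"]* cannot cross a quote, so the capture stops at the first closing quote.
bounded=$(printf '%s\n' "$line" | sed -n 's/.*HREF="\([^"]*\)".*/\1/p')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

The greedy version captures xdpyinfo.1.html" TITLE="display info, while the bounded one captures just xdpyinfo.1.html.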
# Patterns such as [^<]*< limit "greedy matching"
sed -n 's/<a href=#Say,\([^>]*\)>[^<]*<</\1/gp' aa
123
234 345
# Adding 's/< </<\n</g' converts the space into a newline
sed -n 's/< </<\n</g; s/<a href=#Say,\([^>]*\)>[^<]*<</\1/gp' aa
123
234
345
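If \n in the replacement is not supported by your sed (it is not POSIX), the same split-then-extract idea can be sketched with tr doing the splitting instead. Note this is a swapped-in variant, not the command above, and the input line is invented for the example:

```shell
# Hypothetical input: two links on one line.
line='<a href="one.html">One</a> <a href="two.html">Two</a>'

# tr replaces every '>' with a newline, so at most one href lands on each line;
# sed -n ... p then prints only the lines where the substitution succeeded.
links=$(printf '%s\n' "$line" | tr '>' '\n' | sed -n 's/.*href="\([^"]*\)".*/\1/p')

printf '%s\n' "$links"
```

This should print one.html and two.html on separate lines. It would break if a > character appears inside an attribute value, so treat it as a quick hack rather than an HTML parser.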