Regular Expression

0.o · 06-08-2009, 08:32 AM

I am trying to build a regular expression that will match the following:

http://www.linkedin.com/[a-zA-Z0-9]+
(the above URL followed by anything on the site)

Could someone point me in the right direction?

Thanks!

ghostdog74 · 06-08-2009, 08:38 AM

show an actual example of the input file you are parsing. basically, there is no need for a regular expression. the general logic can be:

Code:

if "http://" in string
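A minimal sketch of that logic in Python, assuming the input is text read line by line. The function name `lines_with_linkedin` and the sample lines are made up for illustration; only the prefix comes from the first post.

```python
PREFIX = "http://www.linkedin.com/"

def lines_with_linkedin(lines):
    """Return the lines that contain the prefix (plain substring test, no regex)."""
    return [line for line in lines if PREFIX in line]

# Hypothetical sample input, one string per line of the file.
sample = [
    '<a href="http://www.linkedin.com/in/someone">profile</a>',
    '<a href="http://example.com/">other</a>',
]
print(lines_with_linkedin(sample))
```

This only tells you which lines mention the site; it does not pull out what comes after the prefix.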

pixellany · 06-08-2009, 08:39 AM

What utility are you using? Depending on the context, the pattern you posted may already work as it is.

You can also use:
http://www.linkedin.com/.*
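To make the difference between the two patterns concrete, here is a small sketch using Python's `re` module (Python is only an assumption, since the utility hasn't been named yet). The dots are escaped so they match literal dots, and the names `STRICT`, `LOOSE` and the sample text are invented for the example.

```python
import re

# Pattern from the first post: only letters and digits after the slash.
STRICT = re.compile(r"http://www\.linkedin\.com/[a-zA-Z0-9]+")
# Looser variant: anything (or nothing) up to the end of the line.
LOOSE = re.compile(r"http://www\.linkedin\.com/.*")

text = "see http://www.linkedin.com/in/some-one?x=1 for details"

strict_match = STRICT.search(text).group(0)
loose_match = LOOSE.search(text).group(0)

print(strict_match)  # stops at the first character that is not a letter or digit
print(loose_match)   # runs to the end of the line
```

With this sample the first pattern stops at the second slash, while `.*` also swallows the text that follows the URL, so which one is right depends on what surrounds the URL in the input.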

vonbiber · 06-09-2009, 02:28 AM

Quote:

Originally Posted by 0.o

I am trying to build a regular expression that will match the following:

http://www.linkedin.com/[a-zA-Z0-9]+
(the above URL followed by anything on the site)

Could someone point me in the right direction?

Thanks!

you want to get what's after 'http://www.linkedin.com/'?
if you're parsing an html file and you want to retrieve this info
from the hyperlinks found in file.html, you could try something like this (with GNU sed):

sed 's?href=[^ >]*?\n&\n?g' file.html | grep -i 'href=' | \
grep 'http://www\.linkedin\.com' | \
sed 's?^.*http://www\.linkedin\.com/\([^'\''" \t>]*\).*$?\1?' | \
sort -u

does that answer your question?
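For comparison, a rough Python sketch of the same idea: find the hrefs that point at the site, keep what follows the prefix, and drop duplicates. It is not a line-for-line port of the pipeline above; the names `HREF_RE` and `linkedin_paths` and the sample HTML are made up for illustration.

```python
import re

# Capture whatever follows the prefix inside an href, stopping at a quote,
# whitespace or '>' (close to the stop set used in the sed expression).
HREF_RE = re.compile(
    r"""href=["']?http://www\.linkedin\.com/([^'"\s>]*)""",
    re.IGNORECASE,
)

def linkedin_paths(html):
    """Return the sorted, de-duplicated paths found after the prefix."""
    return sorted(set(HREF_RE.findall(html)))

# Hypothetical sample input.
sample_html = (
    '<a href="http://www.linkedin.com/in/alice">A</a> '
    "<a href='http://www.linkedin.com/in/bob'>B</a> "
    '<a href="http://www.linkedin.com/in/alice">A again</a> '
    '<a href="http://example.com/in/carol">C</a>'
)
print(linkedin_paths(sample_html))
```

A regex is fine for a quick extraction like this; for messy real-world HTML a proper parser would be more robust.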