LinuxQuestions.org

ExoZagNoid 09-10-2008 11:37 PM

Grep to pull specific hyperlinks out of files
 
Hey people,

Basically I'm looking for a grep command that pulls some Facebook hyperlinks out of multiple text files and dumps all the links into another text file.

Something like

grep -io * http://www.facebook.com/n/?inbox/rea...p&t=?????????? > LinkFile.txt

(not sure if I should use -x or not seeing as how the numbers are different)

They are exactly the same except the 10 numbers at the end. I want to find all these links regardless of what the numbers are and push them into a file. If anyone is super ambitious it'd be neat to make it so the same link doesn't get repeated AND/OR make each link clickable in an HTM file. :)

I've used some simple grep in the past, but I think this might need regular expressions (which I don't have much experience with). I've tried this a couple of times with horrible results.

Anyways, if not an answer, I'm sure a clue or example would get me going.
Thx,
Exo

Mr. C. 09-11-2008 01:00 AM

Put single quotes around your expressions when they contain shell metacharacters.
Code:

grep -io 'http://www.facebook.com/n/?inbox/rea...p&t=??????????' * > LinkFile.txt
Note that the pattern goes before the file names; otherwise grep takes the first file name as the pattern. Use egrep or grep -E to use extended regular expressions. Use the [[:digit:]] character class to match digits. Use the quantifier {10} to indicate that you want 10 of the previous atom. Use backslash to escape special regular expression characters, such as ? and . (dot). The & is not special in a regular expression; the single quotes are enough to protect it from the shell.

Code:

grep -Eio 'http://www\.facebook\.com/n/\?inbox/rea...p&t=[[:digit:]]{10}' * > LinkFile.txt
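
For the two extras asked about in the first post (no repeated links, and a clickable HTM file), here is a rough sketch rather than a tested recipe. The full Facebook URL is truncated in this thread, so the example.com URL, the mail1.txt and mail2.txt sample files, and the LinkFile.htm name are all made up for illustration. It assumes a grep that supports -o and -h (GNU and BSD grep both do).

```shell
#!/bin/sh
# Hypothetical sample data: the real URL is truncated in the thread,
# so an example.com link of a similar shape stands in for it.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' \
  'see http://www.example.com/n/?inbox&t=1234567890 for details' \
  'too short: http://www.example.com/n/?inbox&t=12345' > mail1.txt
printf '%s\n' \
  'again http://www.example.com/n/?inbox&t=1234567890' \
  'new HTTP://www.example.com/n/?inbox&t=0987654321 here' > mail2.txt

# -h drops the "file:" prefix grep adds when it searches several files;
# sort -u removes repeated links.
grep -Eioh 'http://www\.example\.com/n/\?inbox&t=[[:digit:]]{10}' mail*.txt |
  sort -u > LinkFile.txt

# Wrap each link in an anchor tag to get a clickable page.
# Each & becomes &amp; first, since & is special in HTML
# (and \& is a literal & in a sed replacement).
{
  echo '<html><body>'
  sed -e 's/&/\&amp;/g' -e 's|.*|<a href="&">&</a><br>|' LinkFile.txt
  echo '</body></html>'
} > LinkFile.htm

cat LinkFile.txt
```

With the real files, swap the example.com pattern for the Facebook one above and mail*.txt for your own file names. Keep LinkFile.txt out of the set of files being searched, or grep may end up reading its own output on a second run.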


All times are GMT -5.