LinuxQuestions.org
Old 02-17-2009, 05:47 PM   #1
aegraham
LQ Newbie
 
Registered: Feb 2005
Location: Kalamazoo
Posts: 3

Rep: Reputation: 0
Extracting hyperlinks from HTML pages listed in a text file


I'm trying to build a sitemap to help clean up a site that I just took over (as an unpaid volunteer). I have a text file listing all the HTML files on the site, including their relative directories. Sample:
Code:
./temp_widecol_del.htm
./dogspf.htm
./indextest.htm
./kmtemp/date4.php
./kmtemp/phpinfo.php
./fm/index.html
./fCMSTestGallery/upgrade.php
./fCMSTestGallery/index.html
./fCMSTestGallery/libadmin.php
./fCMSTestGallery/iontest/ioncube-loader-helper.php
./fCMSTestGallery/iontest/ioncube-encoded-file.php
Is there any combination of ls, more, grep (or other tools) that would produce something like:
Code:
./indextest.htm
<a href="http://...">Link</a>
<a href="http://...">Link2</a>
./index.htm
<a href="http://...">Link3</a>
<a href="http://...">Link4</a>
Or are there any (preferably OSS) programs that might do something similar? Thanks in advance!
 
Old 02-18-2009, 01:33 PM   #2
nx5000
Senior Member
 
Registered: Sep 2005
Location: Out
Posts: 3,307

Rep: Reputation: 57
Try this:
Code:
lynx -listonly -dump index.html
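
The lynx command above handles one file at a time. To cover every file in the list and get output shaped like the example in the first post (file name, then its links), a loop along these lines might work. This is only a sketch under some assumptions: `list_links` is a made-up function name, `filelist.txt` is a stand-in for whatever the text file is actually called, and it uses `grep -o` instead of lynx so that the raw `<a ...>` tags are kept. It only catches anchors that sit on a single line and contain plain text, so links with nested tags or line breaks inside them will be missed.

```shell
#!/bin/sh
# Sketch: for each path in a list file, print the path followed by the
# <a href=...>...</a> tags found in that file.
# Assumption: the list file holds one path per line (e.g. ./dogspf.htm).

list_links() {
    while IFS= read -r f; do
        [ -f "$f" ] || continue      # skip entries that are not regular files
        printf '%s\n' "$f"
        # -o prints only the matched text, -i also matches <A HREF=...>.
        # "|| :" because a file without links is not an error.
        grep -o -i '<a [^>]*href=[^>]*>[^<]*</a>' "$f" || :
    done < "$1"
}

# Usage (run from the site's root directory so the ./ paths resolve):
#   list_links filelist.txt > sitemap.txt
```

Replacing the grep line with `lynx -listonly -dump "$f"` should give lynx's own list of URLs for each file instead of the raw tags.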
 
  


All times are GMT -5.