LinuxQuestions.org
Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.

07-23-2005, 09:18 AM   #1
kpachopoulos
Member
 
Registered: Feb 2004
Location: Athens, Greece
Distribution: Gentoo,FreeBSD, Debian
Posts: 704

Rep: Reputation: 30
wget question


While mirroring an HTTP repository, I also get the HTML files downloaded, which I don't want.
I tried "wget -lalala --reject *.html url", "wget -lalala --reject html url" and "wget -lalala --delete-after *.html url", but none of them works.
Any ideas?
 
07-23-2005, 10:00 AM   #2
rjlee
Senior Member
 
Registered: Jul 2004
Distribution: Ubuntu 7.04
Posts: 1,990

Rep: Reputation: 65
Re: wget question

One minor point: the shell expands an unquoted *.html to the names of all the .html files in the current directory, so wget never receives the string '*.html' (put it in quotes if that's what you meant to pass).
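A small sketch of the difference, using a throwaway directory rather than a real mirror. The helper show_args and the file names are made up for illustration; it only prints the arguments a program such as wget would receive.

```shell
# Print each argument in brackets so the argument boundaries are visible.
# show_args is a hypothetical helper, not part of wget.
show_args() {
    for arg in "$@"; do
        printf '[%s]' "$arg"
    done
}

# Work in a scratch directory that contains two .html files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch index.html about.html

# Unquoted: the shell replaces the pattern with the matching file names.
unquoted=$(show_args --reject *.html)

# Quoted: the pattern is passed through unchanged.
quoted=$(show_args --reject '*.html')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

In a directory with no .html files, most shells leave the unquoted pattern alone, so the same command can behave differently depending on where it is run. Quoting removes that dependence.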

wget finds the links to follow inside the HTML files, so a reject option can't keep them out: --reject stops files from being downloaded in the first place, and without the HTML files wget would have nothing to recurse into. From the info page on recursive accept/reject:
Quote:
Note that these two options do not affect the downloading of HTML files; Wget must load all the HTMLs to know where to go at all--recursive retrieval would make no sense otherwise.
I suggest deleting the HTML files after wget has run. This should do it (although it's not tested):
Code:
find . -iname "*.html" -exec rm '{}' ';'
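Since the command above is untested, it may be worth listing the matches before deleting anything. The sketch below tries that on a scratch directory; mirror_dir and the file names are stand-ins for wherever wget actually wrote the mirror, not anything from the original post.

```shell
# Build a scratch tree that stands in for the mirrored directory.
mirror_dir=$(mktemp -d)
mkdir -p "$mirror_dir/sub"
touch "$mirror_dir/index.html" "$mirror_dir/sub/page.HTML" "$mirror_dir/sub/data.tar.gz"

# Dry run: print the files that would be removed, delete nothing.
find "$mirror_dir" -type f -iname '*.html' -print

# Real run: '+' hands many file names to each rm call, and '--' keeps
# names that start with a dash from being read as options.
find "$mirror_dir" -type f -iname '*.html' -exec rm -- '{}' +

# Whatever is left should be the non-HTML payload.
remaining=$(find "$mirror_dir" -type f)
echo "left after cleanup: $remaining"
```

Restricting the search to -type f keeps a directory whose name happens to end in .html from being passed to rm.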
 
  


All times are GMT -5.