LinuxQuestions.org


Longinus 06-15-2004 07:40 PM

wget help
 
I need some help with wget.

I need to get all .jpg files and all .html files from a directory.

How would I do this?

Thanks

penguin4 06-15-2004 09:31 PM

Newbies all: as a newbie myself, I have taken the time and effort to get informed. How, you ask? Good question. Start with the manuals; mdk has good ones (user's and reference), and the CDs hold a plethora of information, enough to read till your eyeballs bulge! That's where I get the info I pass on to anybody. That's the learning curve that needs to be shared with everybody! Oh, you downloaded your Linux? Well, help the org: buying the manuals will help both you and the organization. Last word: we are better than Windows, yes! Of course! LINUX RULES!

Longinus 06-15-2004 09:44 PM

lol

Dark_Helmet 06-15-2004 10:23 PM

I looked at the man pages for wget, and unfortunately, file-globbing is only supported for ftp transfers, and not http. I'm sure you knew that already though.

You may be forced to mirror the site, and then delete the non-html and non-jpeg files after the fact by using a script. I dunno... maybe it's possible to use wget to collect a list of the files on the server, then use grep to weed out those you don't want, and then wget the remaining through repeated wget calls (all through a script of course). It would be easy to do with the exception of getting the file list. Maybe wget has a way to do that... I might look at it again in a few minutes.
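A rough sketch of that list-then-filter idea, untested against any real server. It assumes the directory URL returns an HTML index page whose links are relative, double-quoted href attributes, and that grep supports -o (GNU and BSD grep do). The function names filter_links and fetch_filtered are made up for illustration.

```shell
#!/bin/sh
# Read HTML on stdin, print only the link targets ending in .html/.htm/.jpg/.jpeg.
# Assumption: links look like href="name.ext" (double-quoted, relative).
filter_links() {
    grep -oE 'href="[^"]+"' |
        sed -e 's/^href="//' -e 's/"$//' |
        grep -Ei '\.(html?|jpe?g)$'
}

# Fetch the index page, filter the links, then wget each remaining file.
# $1 is the directory URL and must end with a slash.
fetch_filtered() {
    base=$1
    wget -q -O - "$base" | filter_links | while read -r name; do
        wget "$base$name"
    done
}
```

Usage would be something like fetch_filtered http://some.website.com/wherever/ (note the trailing slash). Absolute links or links into other directories would need extra handling.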

Dark_Helmet 06-15-2004 11:09 PM

I think your only option would be to use the -A (--accept) option. I haven't tried it myself, but something like this might work:

wget http://some.website.com/wherever -r --level=1 --accept html,jpg

You might have to modify the --level option, but like I said, I haven't tried it. From what the man pages say, it looks like it should.
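For what it's worth, here is a slightly more defensive variant of the same idea, also only a sketch. It adds --no-parent (-np) so the recursion does not climb above the starting directory, and it lists htm and jpeg alongside html and jpg in case the server uses those suffixes. The wrapper name fetch_html_jpg is made up; the URL is the same placeholder as above.

```shell
#!/bin/sh
# Recursively fetch one level deep, staying below the starting directory,
# and keep only files whose suffix is in the accept list.
# $1 is the starting URL (placeholder, not a real site).
fetch_html_jpg() {
    wget -r -l 1 -np -A html,htm,jpg,jpeg "$1"
}
```

Call it as fetch_html_jpg http://some.website.com/wherever/ and raise the -l value if the files sit more than one link away from the starting page.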

