LinuxQuestions.org
05-10-2004, 07:23 PM   #1
GT_Onizuka
Member
 
Registered: Aug 2003
Location: Atlanta
Distribution: Debian, OS X
Posts: 711

Rep: Reputation: 31
wget download all files of certain type


What would the specific wget command be to download all files, say ending in .zip, from a certain directory on a website? It would be an HTTP download, not FTP. Is there any way I can set a gap between the downloads so I don't completely hammer the website? It would be inconvenient to sit and click every "Download" button when this could be much easier, and I could spread it over a much longer period of time. Also, I'm assuming the site would recognize wget as a bot of some sort and potentially say, "Haha, no you can't download". Is there a way to trick the website into thinking I'm your average run-of-the-mill browser?
 
05-10-2004, 08:33 PM   #2
syd
LQ Newbie
 
Registered: Aug 2003
Distribution: Ubuntu 8.04
Posts: 7

Rep: Reputation: 0
Download recursively, specify a file pattern, and specify a wait time, like this:

wget -r -A "*.zip" -w 30 http://www.foobar.com/foo/bar.html

assuming the zip files are pointed to by links on the bar.html page.

If the links point to other hosts, you can enable host spanning with the -H option. For example, if bar.html contains links to files on the host src.foobar.com, wget won't fetch them unless you specify -H. In that case it's also a good idea to limit spanning to one domain using -D foobar.com.
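As a rough sketch of the host-spanning case, the options above could be wrapped like this. The function name fetch_zips_spanning is made up for illustration, and the domain and URL in the usage line are the same placeholders used earlier in this post, so adjust them to the real site.

```shell
# Sketch only: fetch_zips_spanning is a hypothetical helper name.
# $1 = domain to stay within (e.g. foobar.com)
# $2 = page that links to the zip files
fetch_zips_spanning() {
    # -r recurse, -H allow other hosts, -D restrict them to one domain,
    # -A keep only matching files, -w seconds to wait between retrievals
    wget -r -H -D "$1" -A "*.zip" -w 30 "$2"
}

# Usage (not run here):
# fetch_zips_spanning foobar.com http://www.foobar.com/foo/bar.html
```

With -D foobar.com, hosts such as www.foobar.com and src.foobar.com are followed, while links to unrelated domains are skipped.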

The -w 30 option makes wget wait 30 seconds between retrievals.

You can also use --limit-rate=20k to limit the download speed to 20 kilobytes per second.

To "trick" a web server into thinking you are using an ordinary browser, you can use the --user-agent="User Agent String" option. The wget man page says this is discouraged, though.
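Putting the pieces together, a polite download might look roughly like the sketch below. The function name fetch_zips and the user-agent string are placeholders I made up, not values any particular site expects. The -l 1 (recursion depth of one) and -np (don't ascend to the parent directory) options are my additions to keep the crawl from wandering, assuming the zip files are linked directly from the start page.

```shell
# Sketch only: fetch_zips and the user-agent string are hypothetical.
# $1 = page that links to the zip files
fetch_zips() {
    # -r -l 1 -np : recurse one level deep, never above the start directory
    # -A "*.zip"  : keep only files matching the pattern
    # -w 30       : wait 30 seconds between retrievals
    # --limit-rate=20k : cap bandwidth at 20 kilobytes per second
    wget -r -l 1 -np \
         -A "*.zip" \
         -w 30 --limit-rate=20k \
         --user-agent="Mozilla/5.0 (X11; Linux x86_64)" \
         "$1"
}

# Usage (not run here):
# fetch_zips http://www.foobar.com/foo/bar.html
```

If the zip files sit more than one link away from the start page, raise the -l value or drop it to fall back on wget's default depth.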

Try man wget for more info.
 
  

