wget - downloading files from a directory
Hello, I'd appreciate it if somebody could help me with this.
What I'm trying to do is this: download all files from a single directory on a web server (no subfolders and their contents; no upper-level folder contents), e.g. Code:
http://url/dir/subdir/one/two
I've been struggling with this for quite a long time and have tried probably every combination of the -r, --no-parent, -l#, -A and -R switches (reasonable and stupid ones alike), but I can't figure it out. I've read the man pages and various online how-tos. I'm about to give up on wget. :) Here's the practical question: download all files from this (vamps) directory (probably 1-1.5 MB at most). Code:
http://http.us.debian.org/debian/pool/main/v/vamps/
I hope it's possible! Thanks in advance. |
Have you tried setting the recursion depth with --level=0? That should prevent any recursion.
Also, -nd will tell wget not to create directories on the local machine. |
Thanks for replying. I have used -l0 and -nd.
Right now my command looks like this: Code:
wget -r -l0 -nd --no-parent -A "vamps*" -R ".*" http://http.us.debian.org/debian/pool/main/v/vamps/
It prints "Removing index.html since it should be rejected", hence my -A and -R filters. Even though I do end up with only the vamps files downloaded (SOME progress, at least), why does it go on downloading the upper folders' index.html files and then rejecting them? The only way to stop it is Ctrl-C when you notice too many "Removing..." lines flicking by. |
I'm not sure, but since you have -A "vamps*" and that is all you want, I don't think
you need the -R. Try removing it and moving the --no-parent to the very last option. |
From the directory where you want the files downloaded:
Quote:
--cut-dirs=5 will remove 'debian/pool/main/v/vamps' from the downloaded file names. |
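The --cut-dirs=5 naming rule above can be checked locally: -nH drops the hostname, and --cut-dirs=5 strips the first five directory components of the remote path. A sketch of just that path transformation (the filename is a made-up stand-in for a real package file, and this only approximates wget's behavior):

```shell
# Remote path as wget sees it after -nH has removed the hostname.
remote=debian/pool/main/v/vamps/vamps_1.0.tar.gz

# --cut-dirs=5 strips the first five directory components
# (debian/pool/main/v/vamps), leaving only the filename at this depth.
echo "$remote" | cut -d/ -f6-
```

For this path the result is just vamps_1.0.tar.gz, so the files land directly in the current directory.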
Thanks for the replies, guys.
Quote:
I don't know; is there some other program people use for downloads like this? I know that with some FTP clients you can browse into folders and download files with simple wildcard masks (e.g. vamps*), but what about HTTP? |
This seems to work, but it downloads 5 extra files in addition to the 16 required. The extra files come from links in the vamps directory and are automatically deleted by wget as it applies the wildcard filter 'vamps*'. It gives just the files, without any directories:
Code:
wget -r -nH -l1 --cut-dirs=5 --no-parent -A "vamps*" http://http.us.debian.org/debian/pool/main/v/vamps/
Code:
dir=http://http.us.debian.org/debian/pool/main/v/vamps/ |
Among the thousand alternatives:
Code:
elinks -source "URL" | grep -o 'http:[^"]*' | grep vamp | xargs wget -k |
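The grep stages of that pipeline can be tried on canned HTML without touching the network (the listing below is a made-up stand-in for the real directory index that elinks -source would supply):

```shell
# Extract absolute http: links from HTML, then keep only the vamps ones.
# The two <a> lines are hypothetical, modeled on a Debian pool index.
cat <<'EOF' | grep -o 'http:[^"]*' | grep vamp
<a href="http://http.us.debian.org/debian/pool/main/v/vamps/vamps_1.0.tar.gz">vamps_1.0.tar.gz</a>
<a href="http://http.us.debian.org/debian/pool/main/v/">parent</a>
EOF
```

Only the vamps package URL survives both filters; the parent-directory link is dropped, which is exactly what keeps wget from climbing upward here.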
Thank you all for the replies. I will try those later. I thought there was an easier way, something I was missing. I guess not, but thanks. :)
|