LinuxQuestions.org
11-30-2007, 07:57 AM   #1
Anant Khaitan
LQ Newbie
 
Registered: Feb 2007
Location: NIT, Bhopal
Distribution: Fedora
Posts: 16

Rep: Reputation: 0
Use wget to download multiple files with wildcards


I am trying to download all the .jpg files from a particular HTTP site... tell me the exact syntax ...
I have tried this:
Code:
$  wget -r -l1 --no-parent -A jpg  http://www.mikeswanson.com/wallpaper/images/
but it is not working.
 
11-30-2007, 08:12 AM   #2
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
Quote:
tell me the exact syntax ...
Please??

From the wget man page:
Quote:
You want to download all the GIFs from a directory on an HTTP
server. You tried wget http://www.server.com/dir/*.gif, but that
didn't work because HTTP retrieval does not support globbing. In
that case, use:

wget -r -l1 --no-parent -A.gif http://www.server.com/dir/

More verbose, but the effect is the same. -r -l1 means to retrieve
recursively, with maximum depth of 1. --no-parent means that
references to the parent directory are ignored, and -A.gif means to
download only the GIF files. -A "*.gif" would have worked too.
Man pages are not exactly light reading, but they usually have the answer.
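
For anyone wondering why -A gif, -A.gif and -A "*.gif" all select the same files: as the wget manual describes it, an -A entry with no wildcard characters is treated as a file name suffix, while an entry containing *, ?, [ or ] is treated as a pattern. Here is a rough sketch of that rule in plain shell. The accepts function is a hypothetical helper written for this post, not something wget provides, and it only approximates wget's matching.

```shell
# accepts ENTRY FILENAME
# Hypothetical helper approximating how one -A entry is applied:
# an entry containing a wildcard is matched as a shell pattern,
# anything else is matched as a plain suffix.
accepts() {
    acc_entry=$1
    acc_name=$2
    case $acc_entry in
        *'*'* | *'?'* | *'['* | *']'*)
            # Wildcard present: treat the entry as a glob pattern.
            case $acc_name in
                $acc_entry) return 0 ;;
            esac
            ;;
        *)
            # No wildcard: accept names that end with the entry.
            case $acc_name in
                *"$acc_entry") return 0 ;;
            esac
            ;;
    esac
    return 1
}

for item in gif .gif '*.gif'; do
    if accepts "$item" photo.gif; then
        echo "-A $item accepts photo.gif"
    fi
done
```

All three forms accept photo.gif in this sketch, which matches what the man page says about them being interchangeable for this job.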
 
11-30-2007, 11:09 AM   #3
Anant Khaitan
LQ Newbie
 
Registered: Feb 2007
Location: NIT, Bhopal
Distribution: Fedora
Posts: 16

Original Poster
Rep: Reputation: 0

^^
Brother, I have already tried what you mentioned. It works for certain sites; indeed, I use it for downloading entire sites.
Check my first post.
Here there is some permission problem: "403 Forbidden".
And
Code:
-A gif
or
Code:
-A.gif
It doesn't make any difference here...

So can I expect a constructive reply ..
OK, here is the output of the command:
Code:
--22:24:40--  http://www.mikeswanson.com/wallpaper/images/
           => `www.mikeswanson.com/wallpaper/images/index.html'
Resolving www.mikeswanson.com... 209.132.227.101
Connecting to www.mikeswanson.com|209.132.227.101|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
22:24:41 ERROR 403: Forbidden.

Removing www.mikeswanson.com/wallpaper/images/index.html since it should be rejected.
unlink: No such file or directory

FINISHED --22:24:41--
Downloaded: 0 bytes in 0 files

Last edited by Anant Khaitan; 11-30-2007 at 11:21 AM.
 
11-30-2007, 11:31 AM   #4
Fluffy
LQ Newbie
 
Registered: Nov 2007
Distribution: Slackware64-current
Posts: 16

Rep: Reputation: 0
Notice these lines:
HTTP request sent, awaiting response... 403 Forbidden
22:24:41 ERROR 403: Forbidden.


Even if you got the syntax right, you wouldn't be able to download all the images anyways.

Code:
wget -r -l1 --no-parent -A jpg  http://www.mikeswanson.com/wallpaper/images/
Should be
Code:
wget -r -|1 --no-parent -A.jpg http://www.mikeswanson.com/wallpaper/images/
Notice the difference between -l1 (dash L one) and -|1 (dash pipe one) also -A jpg and -A.jpg.
 
11-30-2007, 12:17 PM   #5
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
First this:
Quote:
tell me the exact syntax ...
Then this:
Quote:
So can I expect a constructive reply ..
I would really encourage you to start using "please" and "thank you" in place of these commanding statements.
 
11-30-2007, 11:38 PM   #6
Anant Khaitan
LQ Newbie
 
Registered: Feb 2007
Location: NIT, Bhopal
Distribution: Fedora
Posts: 16

Original Poster
Rep: Reputation: 0
^^^
Sorry, pixellany, if I was rude. Anyway, thanks for mentioning it.

@ Fluffy
Code:
wget -r -|1 --no-parent -A.jpg http://www.mikeswanson.com/wallpaper/images/
returns
Code:
bash: 1: command not found
It should be 'l' (meaning 'level'), not '|',
and AFAIK '-A.gif' or '-A gif' doesn't matter at all.
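
The "command not found" message comes from the shell, not from wget: an unquoted | splits the line into a pipeline, so everything after it (1 --no-parent ...) is run as a separate command named 1. A minimal sketch that reproduces this without touching the network, using true as a stand-in for wget (the stand-in is my assumption; the exact wording of the error differs between shells, so it is printed rather than hard-coded):

```shell
# 'true' stands in for wget so nothing is downloaded.
# The child shell parses the line as:  true -r -   |   1 --no-parent
status=0
message=$(sh -c "true -r -|1 --no-parent" 2>&1) || status=$?

echo "exit status: $status"
echo "shell said: $message"
```

An exit status of 127 is the shell's way of reporting that it could not find a command, here the one called 1.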
 
12-01-2007, 08:29 AM   #7
Fluffy
LQ Newbie
 
Registered: Nov 2007
Distribution: Slackware64-current
Posts: 16

Rep: Reputation: 0
Oops. I guess we both screwed up a little then.
Damn font makers need to make a good default font in which you can tell the difference between | and l and 1. -_-

But, again, note the 403 Forbidden error message you got. With the way mikeswanson.com has its folders set up, you can't view/download from a directory unless you know the exact filename you want to download. (Sometimes not even then.)
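
The reason the 403 matters so much: with -r, wget first has to fetch the directory's index page to discover the links it should follow, so if the server refuses that page there is nothing for -A to filter. Before tweaking options, it can help to check what the server actually answered. Below is a small sketch that scans saved wget output for a 403. The saw_forbidden function is made up for this example, and the sample lines are copied from the output posted earlier in this thread.

```shell
# saw_forbidden LOGFILE
# Hypothetical helper: succeeds if the saved wget output reports HTTP 403.
saw_forbidden() {
    grep -q 'ERROR 403' "$1"
}

# Sample log lines taken from the output earlier in this thread.
log=$(mktemp)
cat > "$log" <<'EOF'
HTTP request sent, awaiting response... 403 Forbidden
22:24:41 ERROR 403: Forbidden.
EOF

if saw_forbidden "$log"; then
    result="server refused the directory listing; -A cannot help here"
else
    result="no 403 in the log"
fi
echo "$result"
rm -f "$log"
```

On a real run, the log could be captured with wget's -o option (for example -o wget.log, a file name chosen here only for illustration) and passed to the same function.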
 
12-04-2009, 12:03 AM   #8
mattington
LQ Newbie
 
Registered: Sep 2008
Location: Beijing
Distribution: Slackware, Arch
Posts: 10

Rep: Reputation: 0

I realize this thread is 2 years old, but there are other wildcard options open to you. For instance, to get pictures that are numbered in order, you can do:

wget -nd http://www.cracked.com/blog/wp-content/uploads/2009/12/zorklon{1,2,3,4,5}.jpg

The shell expands the {} into one URL for each number listed, so wget is handed all five URLs in a single command.
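
Worth knowing: the {1,2,3,4,5} part is brace expansion, a feature of bash and similar shells rather than of wget, so it will not work in every shell. A sketch that builds the same list of URLs with a plain loop, for shells without brace expansion. Nothing is downloaded here, and the variable names are only for illustration.

```shell
base=http://www.cracked.com/blog/wp-content/uploads/2009/12/zorklon

# Build the list of URLs that brace expansion would produce.
urls=""
count=0
for n in 1 2 3 4 5; do
    urls="$urls ${base}${n}.jpg"
    count=$((count + 1))
done
urls=${urls# }   # drop the leading space

echo "wget would receive $count URLs:"
printf '%s\n' $urls
# To actually fetch them:  wget -nd $urls
```

The unquoted $urls relies on word splitting, which is fine here only because these URLs contain no spaces or glob characters.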

Last edited by mattington; 12-04-2009 at 12:16 AM.
 
08-23-2013, 09:45 PM   #9
lag_rvp
LQ Newbie
 
Registered: Aug 2013
Posts: 2

Rep: Reputation: Disabled
Quote:
Originally Posted by Fluffy
Notice these lines:
HTTP request sent, awaiting response... 403 Forbidden
22:24:41 ERROR 403: Forbidden.


Even if you got the syntax right, you wouldn't be able to download all the images anyways.

Code:
wget -r -l1 --no-parent -A jpg  http://www.mikeswanson.com/wallpaper/images/
Should be
Code:
wget -r -|1 --no-parent -A.jpg http://www.mikeswanson.com/wallpaper/images/
Notice the difference between -l1 (dash L one) and -|1 (dash pipe one) also -A jpg and -A.jpg.
And further bumping an old thread on my first post no less.

This worked for me to download all of the .mp3s in a directory; I just used your same code minus the -|1 for the website I ripped.

Code:
wget -r --no-parent -A.jpg http://www.mikeswanson.com/wallpaper/images/

Last edited by lag_rvp; 08-23-2013 at 09:47 PM.
 
  



All times are GMT -5.
