LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

05-05-2011, 05:58 PM   #1
markm0705
LQ Newbie
 
Registered: May 2011
Location: Perth WA
Posts: 2

Rep: Reputation: 0
wget ERROR 403: Forbidden


Dear LQ.org

I'm not a Linux user, but I do find that many Linux tools work just fine under DOS in Windows. GSAR is my current favourite.

I'm currently looking for a tool to remove the tedium of downloading files from some stock exchange sites (part of some research) and found that WGET does the downloading job.

WGET works just fine when I know the file name. For example, if I send the command:

wget http://www.asx.com.au/asxpdf/2011050...xb7549y7zx.pdf

I get the file I want. However, having to find the file name first defeats my goal: by the time I've found the name by navigating the site, I've already gone through the download process.

What I would like to do is get all "pdf" files from that directory (overnight) and browse them quickly locally each morning. However when I put in...

wget http://www.asx.com.au/asxpdf/20110504/pdf/

I get...

ERROR 403: Forbidden

I've had a play with some of the other WGET switches suggested on other threads (-U and cookies), but I'm not too sure I know what I'm doing with these.

My guess is that the site is constructed in a manner to prevent bulk downloads. I'm wondering if there is a Unix tool that will let me find out the names of the PDFs in the web directory, so I can use a batch file with WGET to download the list.

Any help much appreciated
 
05-06-2011, 02:26 AM   #2
plpl303a
Member
 
Registered: May 2011
Posts: 52

Rep: Reputation: 3
Perhaps you want

wget --recursive

or

wget --mirror


Of course, if the sites in question have a policy against bulk downloads, they may not be happy about the extra server load. ;-)
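For what it's worth, here is a minimal sketch of how those switches might be combined so the run stays gentle on the server. It only assembles the command line; it does not contact any site. The function name, the example.com URL, and the default wait of 2 seconds are made up for illustration. Note that recursion works by following links in pages wget is allowed to fetch, so a start URL that answers 403 will still stop it.

```python
import shlex


def build_wget_command(start_url, wait_seconds=2, depth=1):
    """Return a wget argument list for a polite, PDF-only recursive fetch.

    start_url is a placeholder: it must be a page wget can actually fetch,
    otherwise there are no links to follow.
    """
    return [
        "wget",
        "--recursive",             # follow links found in fetched pages
        f"--level={depth}",        # limit how deep the recursion goes
        "--no-parent",             # never climb above the start directory
        "--accept=pdf",            # keep only files whose names end in pdf
        f"--wait={wait_seconds}",  # pause between requests
        "--random-wait",           # vary the pause a little
        start_url,
    ]


if __name__ == "__main__":
    # Hypothetical URL, used only to show the resulting command line.
    cmd = build_wget_command("http://www.example.com/reports/")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed line can be pasted into a batch file, or the list can be handed to subprocess.run() if you would rather launch wget from Python.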
 
1 members found this post helpful.
05-06-2011, 03:11 AM   #3
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1978
Quote:
Originally Posted by markm0705
My guess is that the site is constructed in a manner to prevent bulk downloads. I'm wondering if there is a Unix tool that will let me find out the names of the PDFs in the web directory, so I can use a batch file with WGET to download the list.
Unfortunately not. Every attempt will be rejected if the remote server doesn't allow directory listing. If you succeed, you will have broken a security wall and yours will be considered a site attack. I think your best bet is to contact the webmaster and kindly ask for information about the content available for downloading.
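One route that does not depend on a directory listing: if the site publishes an ordinary page that links to each PDF (an assumption, not something checked for this site), the file names can be read from that page's HTML. Below is a rough sketch using only the Python standard library. The class name, the function name, and the example.com addresses are invented for illustration, and the HTML is a hard-coded sample rather than a real download.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a href="..."> links that end in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Match .pdf case-insensitively; skip anchors without an href value.
            if name == "href" and value and value.lower().endswith(".pdf"):
                url = urljoin(self.base_url, value)  # resolve relative links
                if url not in self.links:            # drop duplicates, keep order
                    self.links.append(url)


def extract_pdf_links(html_text, base_url):
    """Return the PDF URLs linked from html_text, resolved against base_url."""
    parser = PdfLinkParser(base_url)
    parser.feed(html_text)
    return parser.links


if __name__ == "__main__":
    # Hypothetical page content and address, for demonstration only.
    sample = '<a href="/docs/a.pdf">A</a> <a href="b.PDF">B</a> <a href="c.html">C</a>'
    for url in extract_pdf_links(sample, "http://www.example.com/reports/index.html"):
        print(url)
```

Saving the printed URLs to a text file, one per line, gives a list that wget can read with its -i switch, which is the batch approach asked about. Whether the site's terms allow this is still a question for the webmaster.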
 
1 members found this post helpful.
05-08-2011, 06:47 PM   #4
markm0705
LQ Newbie
 
Registered: May 2011
Location: Perth WA
Posts: 2

Original Poster
Rep: Reputation: 0
Dear plpl303a and colucix, thanks for your replies and comments. My aim is certainly to come up with something that is clever enough to avoid the evils of bulk downloading if at all possible. I shall let you know if I have any success with the --recursive or --mirror switches.
 
  


All times are GMT -5. The time now is 03:48 PM.
