LinuxQuestions.org
06-19-2013, 02:28 PM   #1
cstrieder
Member
 
Registered: Sep 2008
Location: Brazil
Distribution: Slackware
Posts: 58

Rep: Reputation: 15
Wget options


Hi all,

To get the PDF files under http://ctsgepc7.epfl.ch/, I tried the following command:

wget -r -A.pdf http://ctsgepc7.epfl.ch/

But it did not retrieve what I expected. For example, the file "S3-C-1-1-CDR Electrical_ICD.pdf", which is in "01 - Systems and mission documents/Interface Control Documents", was not downloaded.

Is it possible to retrieve this file, and the others in the same folder, using wget?

Thanks in advance.

Last edited by cstrieder; 06-19-2013 at 02:30 PM.
 
06-19-2013, 04:13 PM   #2
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1957
They appear to have a robots.txt in place to restrict bulk mirroring.
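
For background: during recursive (-r) retrieval, wget honors robots.txt by default, so links under disallowed paths are simply not followed. One way to confirm this is to read the file yourself. The sketch below is only illustrative: the helper name robots_url is made up for this example, and it assumes the site serves the file at the conventional /robots.txt location.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): map any URL on a site to the
# conventional location of that site's robots.txt.
robots_url() {
    # Keep only "scheme://host" from the argument, then append /robots.txt.
    base=$(printf '%s\n' "$1" | sed 's#^\([a-z]*://[^/]*\).*#\1#')
    printf '%s/robots.txt\n' "$base"
}

# Print the robots.txt location for the site from the question.
robots_url 'http://ctsgepc7.epfl.ch/'
```

To view the rules, the result could be handed to wget, for example wget -qO- "$(robots_url http://ctsgepc7.epfl.ch/)", where -q silences progress output and -O- writes the file to standard output.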
 
06-19-2013, 04:18 PM   #3
cstrieder
Member
 
Registered: Sep 2008
Location: Brazil
Distribution: Slackware
Posts: 58

Original Poster
Rep: Reputation: 15
So the only option is to download every file by hand?
 
06-19-2013, 04:33 PM   #4
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,260

Rep: Reputation: 1948
You can add "-e robots=off" to the wget command to make it ignore robots.txt.

Code:
wget -r -A.pdf -e robots=off http://ctsgepc7.epfl.ch/
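
Since ignoring robots.txt removes the site's own limit on bulk downloads, it may be worth slowing the crawl down. The following is a minimal sketch, not a definitive recipe: the function name fetch_pdfs, the DRY_RUN variable, and the one-second wait are assumptions added for illustration, while the wget flags themselves are standard. By default it only prints the command so you can inspect it first.

```shell
#!/bin/sh
# Sketch: the command from this thread plus two optional "polite" flags.
# fetch_pdfs and DRY_RUN are illustrative names, not wget features.
url='http://ctsgepc7.epfl.ch/'

fetch_pdfs() {
    # -r             follow links recursively
    # -A.pdf         keep only files whose names end in .pdf (HTML pages are
    #                still fetched to find links, then deleted)
    # -e robots=off  ignore robots.txt
    # --no-parent    never ascend above the starting directory (matters when
    #                the start URL is a subdirectory)
    # --wait=1       pause one second between requests (assumed value)
    set -- wget -r -A.pdf -e robots=off --no-parent --wait=1 "$1"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"    # dry run: show the command instead of running it
    else
        "$@"         # real run: start the download
    fi
}

fetch_pdfs "$url"
```

Running the script as is prints the full wget command; setting DRY_RUN=0 in the environment before running it would perform the actual download.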

Last edited by suicidaleggroll; 06-19-2013 at 04:34 PM.
 
1 members found this post helpful.
  


All times are GMT -5.
