Old 09-07-2010, 11:44 PM   #1
desmond33 (LQ Newbie)
curl Question


Hi. I purchased a book that comes with access to an online archive of images, and I want to download all of them. The website is only set up to download the images one at a time, though, so I want to use a program to automatically download them.

The website is http://www.taschen.com/pages/en/comm..._1/index.1.htm

I tried the command:

Code:
curl -u username -O http://www.taschen.com/media_archives/type1/downloads/_Q6Q5783.jpg.zip
but it downloaded the following text file:

Code:
Found
The document has moved here.

Apache/2.2.8 (Ubuntu) mod_python/3.3.1 Python/2.5.2 PHP/5.2.4-2ubuntu5.10 with Suhosin-Patch mod_ssl/2.2.8 OpenSSL/0.9.8g mod_perl/2.0.3 Perl/v5.8.8 Server at www.taschen.com Port 80
The "here" link redirects to http://www.taschen.com/type1

This is the first time I've downloaded something this way, so any help is greatly appreciated. I'm using curl, by the way, because I'm on Mac OS, which does not come with wget.
 
Old 09-08-2010, 12:50 AM   #2
14moose (Member)
Hi -

I tried to look, but Taschen is password-protected.

SUGGESTION:
It sounds like the "downloads" URL you tried is simply no longer valid.

Manually download one of the files through your browser and note the exact URL it comes from. Once you know the exact URL for one file, you might have a better chance of using curl for the rest of them.
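
One quick way to see what a given URL actually returns (a sketch; the zip URL is the one from the first post) is to ask curl for just the response headers:

Code:
curl -sI http://www.taschen.com/media_archives/type1/downloads/_Q6Q5783.jpg.zip
A 3xx status plus a Location: header means you are being redirected, most likely to the login page.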
 
Old 09-08-2010, 01:04 AM   #3
desmond33 (LQ Newbie, Original Poster)
That's the thing: the URL that I put in the curl command works fine when I type it into Firefox; the download starts like any other zip file. If I haven't already logged in to the website, though, it redirects me to a login page. So I imagine the issue is getting curl to look like an authenticated user.
 
Old 09-08-2010, 09:44 AM   #4
14moose (Member)
Hi, again -

Try this:
Quote:
http://ask.metafilter.com/18923/How-...okie-with-CURL

If you mean the username and password are entered in a form on a login page, then cURL can "submit" that form like this:

curl -d "username=miniape&password=SeCrEt" http://whatever.com/login

and if you want to store the cookie that comes back you do so by specifying a cookie file:

curl -c cookies.txt -d "username=miniape&password=SeCrEt" http://whatever.com/login

and to use those cookies in later requests you do:

curl -b cookies.txt http://whatever.com/some/protected/page

or do both if you want to both send and receive cookies:

curl -b cookies.txt -c cookies.txt -d "username=miniape&password=SeCrEt" http://whatever.com/login
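
Adapted to this thread, the whole flow might look like this (a sketch: the login URL and the form field names are hypothetical; check the site's actual login form in your browser and substitute the real ones):

Code:
# log in once and save the session cookie (URL and field names are guesses)
curl -c cookies.txt -d "username=YOU&password=SECRET" http://www.taschen.com/login

# reuse the saved cookie for the actual download
curl -b cookies.txt -O http://www.taschen.com/media_archives/type1/downloads/_Q6Q5783.jpg.zip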
 
Old 09-10-2010, 09:19 AM   #5
Valery Reznic (ELF Statifier author)
Quote:
Originally Posted by desmond33 View Post
[quote of post #1 trimmed: the curl -u command downloaded a "Found / The document has moved here" page instead of the zip file]
Looks like you have to specify the -L option. From the curl manpage:
Code:
       -L/--location
              (HTTP/HTTPS) If the server reports that the requested  page  has
              moved to a different location (indicated with a Location: header
              and a 3XX response code), this option will make  curl  redo  the
              request  on the new place.
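For example, adding -L to the command from the first post (a sketch; the cookie handling discussed above may still be needed on top of this):

Code:
curl -L -u username -O http://www.taschen.com/media_archives/type1/downloads/_Q6Q5783.jpg.zip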
 
Old 09-12-2010, 06:12 PM   #7
desmond33 (LQ Newbie, Original Poster)
Thanks for the help. I've figured out how to download a file now by using a Firefox add-on called "Live HTTP Headers" to look at the cookies used by Firefox, and then using a curl command like this:

Code:
curl -b "name1=value1; name2=value2" http://example.com/file.zip -O
Now I'm just trying to figure out how to download multiple files in the same directory automatically.
 
Old 09-13-2010, 01:25 AM   #8
desmond33 (LQ Newbie, Original Poster)
It turns out that curl cannot download recursively, which is a shame. It can download a range of sequentially numbered files, though. The files I wanted to download were not named sequentially, but the HTML index pages that link to them are, so I downloaded all of those pages with curl and dumped them into one text file with this command:

Code:
curl -b "name1=value1; name2=value2" http://www.example.com/index[1-128].html > html_dump.txt
Then I used grep and a text editor to make a file with just the filenames I wanted ("file_name.zip", for example, with each name on a separate line) and used a bash script to download them with curl:

Code:
#!/bin/bash

# read one filename per line from file_list.txt and fetch each one
# with the same session cookie
while read -r name
do
  curl -b "name1=value1; name2=value2" -O "http://www.example.com/$name"
done < file_list.txt

exit 0
Just in case anyone was curious.
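
For the curious, the grep step can be scripted as well. A sketch, assuming the index pages link to the archives as href="something.zip" (the pattern is a guess; adjust it to the actual markup):

Code:
# pull the zip filenames out of the dumped HTML, one per line
grep -o 'href="[^"]*\.zip"' html_dump.txt | sed 's/^href="//; s/"$//' > file_list.txt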

Last edited by desmond33; 09-13-2010 at 02:26 AM.
 
Old 09-13-2010, 01:57 AM   #9
Valery Reznic (ELF Statifier author)
Quote:
Originally Posted by desmond33 View Post
[quote of post #8 trimmed: curl can't recurse, so the filenames were grepped out of the index pages and fetched one by one with a bash loop]
Maybe wget is able to do what you need?
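
For example, something along these lines (a sketch: the start URL is a guess based on the paths in this thread, and --load-cookies expects a Netscape-format cookie file rather than the "name=value" strings used above):

Code:
wget -r -l1 -np -A zip --load-cookies cookies.txt "http://www.taschen.com/media_archives/type1/"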
 
Old 09-13-2010, 02:26 AM   #10
desmond33 (LQ Newbie, Original Poster)
Yeah, wget would probably be easier, but I didn't feel like compiling/installing it for Mac OS.
 
  

