Linux - Software: This forum is for software issues.
Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.
08-06-2008, 11:49 PM | #1
memo007 | Member | Registered: Feb 2005 | Distribution: Debian, Kanotix, Kubuntu | Posts: 117
wget: how to specify a numeric range and download images from a website?
I need images that are on a server, but I have to access them manually.
It would be, for example, www.xyz.com/collection/846_02_900_x.jpg
When I change it to /collection/847_02_900_x.jpg (just one number higher) there is a picture as well, and I need all the pictures in the collection folder. However, if I just go to www.xyz.com/collection there are no images; it just says there was an error... So, without having to sit for 3 hours and do it all manually, going from 846_02_900_x.jpg to 847_02_900_x.jpg to 848_02_900_x.jpg etc., I was wondering if wget can download the pictures somehow?
When I try it with wget -m -p -k http://www.xyz.com/collection all I get is: HTTP request sent, awaiting response... 404 Not Found
And of course when I do wget -m -p -k www.xyz.com/collection/846_02_900_x.jpg it downloads only that image. The question is whether it is possible to have wget increase a certain value automatically, such as 846, and download accordingly, without changing any other numbers?
Last edited by memo007; 08-07-2008 at 12:19 AM.
08-07-2008, 12:08 AM | #2
Mr. C. | Senior Member | Registered: Jun 2008 | Posts: 2,529
Wget doesn't give you the magic ability to scan a web site's directories; wget follows links. If a page has links to those images, wget can be told to follow the links and download the images.
Alternatively, you can create a file that contains those links and have wget iterate over it.
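For example, a minimal sketch of that second approach (the host here is just the thread's placeholder, and -i / --input-file is wget's option for reading URLs from a file):
Code:
# urls.txt holds one full image URL per line, e.g.
#   http://www.xyz.com/collection/846_02_900_x.jpg
#   http://www.xyz.com/collection/847_02_900_x.jpg
wget -i urls.txt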
08-07-2008, 12:17 AM | #3
memo007 | Member | Registered: Feb 2005 | Distribution: Debian, Kanotix, Kubuntu | Posts: 117 | Original Poster
I don't necessarily need to scan it; I only need to specify which images to download, from one number to another,
e.g. from 847_02_900_x.jpg to 10000_02_900_x.jpg, download all the images in that range.
To simply specify a numeric sequence or a range...
Quote:
Originally Posted by Mr. C.
Wget doesn't give you the magic ability to scan a web site's directories; wget follows links. If a page has links to those images, wget can be told to follow the links and download the images.
Alternatively, you can create a file that contains those links and have wget iterate over it.
08-07-2008, 12:34 AM | #4
Mr. C. | Senior Member | Registered: Jun 2008 | Posts: 2,529
Right, so you need a script that can generate the file names. You can take it from here:
Code:
for i in $(seq 847 10000); do
    echo ${i}_02_900_x.jpg
done
08-07-2008, 12:38 AM | #5
memo007 | Member | Registered: Feb 2005 | Distribution: Debian, Kanotix, Kubuntu | Posts: 117 | Original Poster
Thanks, but after I paste this, all I get is numbers?
How do I use this with wget?
Quote:
Originally Posted by Mr. C.
Right, so you need a script that can generate the file names. You can take it from here:
Code:
for i in $(seq 847 10000); do
    echo ${i}_02_900_x.jpg
done
08-07-2008, 12:47 AM | #6
Mr. C. | Senior Member | Registered: Jun 2008 | Posts: 2,529
You get file names...
847_02_900_x.jpg
848_02_900_x.jpg
849_02_900_x.jpg
850_02_900_x.jpg
...
Are these not the file names you want? Replace echo with wget, and prepend the missing part of the URI to each filename.
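A minimal sketch of that substitution (reusing the thread's placeholder host, which you would swap for the real server):
Code:
for i in $(seq 847 10000); do
    # prepend the base URI and fetch each image;
    # numbers that don't exist just come back as 404 and are skipped
    wget "http://www.xyz.com/collection/${i}_02_900_x.jpg"
done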
08-07-2008, 11:19 AM | #7
jiml8 | Senior Member | Registered: Sep 2003 | Posts: 3,171
Why are you trying to harvest a site's images anyway? Do you have permission to suck up their bandwidth this way?
If I catch you doing that on my site, I'll blacklist you... and my site watches for exactly that, so it will likely catch you.
08-08-2008, 02:34 PM | #8
memo007 | Member | Registered: Feb 2005 | Distribution: Debian, Kanotix, Kubuntu | Posts: 117 | Original Poster
Grow up man...
Quote:
Originally Posted by jiml8
Why are you trying to harvest a site's images anyway? Do you have permission to suck up their bandwidth this way?
If I catch you doing that on my site, I'll blacklist you...and my site watches for exactly that, so it likely will catch you.
08-08-2008, 02:37 PM | #9
Member | Registered: Dec 2004 | Location: Raleigh, NC | Distribution: CentOS 2.6.18-53.1.4.el5 | Posts: 770
Code:
wget -r -l5 --no-parent -A.jpg www.xyz.com/collection/
The -l5 sets the recursion depth, so this goes 5 levels down from /collection/.
08-08-2008, 02:50 PM | #10
LQ Guru | Registered: Oct 2005 | Location: Northeast Ohio | Distribution: linuxdebian | Posts: 7,249
Come on... scraping multiple images in this fashion is the fastest way to grow your Pr0n collection.
But seriously, all kidding aside, have you looked at httrack? It's another option you could try.
Quote:
httrack
Description: Copy websites to your computer (offline browser)
HTTrack is an offline browser utility, allowing you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer.
HTTrack arranges the original site's relative link structure.
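A minimal sketch of an httrack invocation for this case (assuming httrack is installed; the URL is the thread's placeholder, -O names the output directory, and the +*.jpg filter keeps the images):
Code:
httrack "http://www.xyz.com/collection/" -O ./collection-mirror "+*.jpg"
Like the recursive wget approach, though, this only finds images that some page actually links to.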
11-13-2008, 10:46 AM | #11
Member | Registered: Mar 2005 | Location: Cascade Mountains WA USA | Distribution: Linux From Scratch (LFS) | Posts: 149
Wget Stuff
If you know the range of URLs (image or otherwise), then using seq as above you get something like...
Code:
for i in $(seq 1 20)
do
    wget http://mybuddies.site.org/blarg/filename_$i.blarg
done
and of course, as seen on the command line:
Code:
for i in $(seq 1 20); do wget http://mybuddies.site.org/blarg/filename_$i.blarg; done
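A loop-free variant (a sketch assuming a shell with sequence brace expansion, such as bash 3.0 or later, and reusing the thread's placeholder URL):
Code:
# the shell expands this to 847_02_900_x.jpg through 10000_02_900_x.jpg;
# URLs that don't exist just produce 404s that wget reports and moves past
wget http://www.xyz.com/collection/{847..10000}_02_900_x.jpg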
And yeah. Seriously jiml8.
Your post was completely off topic.