brian0918 |
09-09-2007 12:27 AM |
How to mass-download from a site?
I have access to a pay-site hosting thousands of public domain images. Since the fee covers only access (after all, the images themselves are PD), I should be able to download them and distribute them freely.
Now it's just a simple matter of putting that into practice. Normally, to download an image you open a page, which fetches the image through JavaScript and some crafty document.write disguises. To save the actual file, you have to right-click the image and choose Save As (in Windows, of course).
So, as a first attempt, I worked out how the page URLs are generated: each ends with a number that increases by 1 for each consecutive page. The image URLs each contain a unique hash, so I couldn't generate those directly. I used a program called urlgen to generate the page URLs, then a regex editor to wrap HTML tags around them. Finally, I tried Firefox's DownThemAll extension to download each page, hoping it would also download the page's content. It didn't; it only downloaded the HTML of each page.
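For anyone wanting to reproduce the URL-generation step without urlgen and a regex editor, here is a rough Python sketch. It only builds the list of numbered page URLs and wraps each one in a link tag. The base URL, the function names, and the number range are all made up for illustration, since the real site's URL pattern isn't given here.

```python
from html import escape

# Hypothetical page-URL prefix (assumption): the real site's pattern is not
# stated in the post, only that each page URL ends with an increasing number.
BASE_URL = "http://example.com/viewer.php?id="


def page_urls(first, last, base=BASE_URL):
    """Return the page URLs for consecutive page numbers first..last (inclusive)."""
    return [f"{base}{n}" for n in range(first, last + 1)]


def link_page(urls):
    """Wrap each URL in an <a> tag so a link-based downloader can pick it up."""
    items = [f'<a href="{escape(u, quote=True)}">{escape(u)}</a>' for u in urls]
    return "<html><body>\n" + "<br>\n".join(items) + "\n</body></html>\n"


if __name__ == "__main__":
    # Print a small link page for pages 1 through 5; redirect to a file to save it.
    print(link_page(page_urls(1, 5)))
```

Saving the output as an .html file and opening it in Firefox gives DownThemAll a list of links to work from. This only covers the step described above; it still fetches just the HTML of each page, not the images behind the JavaScript.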
I know this would be a helluva lot easier to do in Linux, but in Windows, are there any suggestions for accomplishing this? I'm thinking next I'll try one of those keyboard/mouse button combination recorders, but was hoping someone had a simpler (Windows-based) solution.
Thanks!