grabbing linked .svg files from an HTML page with wget

silviolorusso · 10-20-2011, 03:33 PM

Hello,

I'm trying to download all the linked SVG files from the following URL with wget:

http://openclipart.org/api/search/?query=water

This is the line I tried without success:

Code:

wget -r -l1 -np -nd -p -A.svg 'http://openclipart.org/api/search/?query=water'

Any suggestions?

Thanks!

theNbomr · 10-21-2011, 07:07 PM

That URL seems to return an RSS feed, not HTML; your browser's 'View Source' will confirm it. Browsers understand RSS XML, but as far as I know wget's recursive mode only follows links it finds in HTML (and CSS) documents, so -r has nothing to follow here. I copied and pasted the RSS XML from the browser into a file (LQsilviolorusso.xml) and then used the following to grab the content:

Code:

# Print each .svg enclosure URL found in the saved feed, then fetch it.
for svg in $(perl -ne 'print "$1\n" if m/enclosure url="([^"]+\.svg)"/' LQsilviolorusso.xml); do
    wget "$svg"
done

This violates my own insistence not to parse XML without a proper XML parser, but as a one-off, I'll live with myself.
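The copy/paste step could probably be skipped by letting wget fetch the feed itself and piping the extracted URLs into a second wget. What follows is only a rough sketch under a few assumptions: the feed keeps using double-quoted enclosure url="..." attributes, your grep supports -o (GNU and BSD grep do), and the function names extract_svg_urls and fetch_svgs are made up for illustration. It is still pattern matching rather than real XML parsing.

```shell
#!/bin/sh
# Sketch: read RSS XML on stdin and print every enclosure URL ending in .svg,
# one per line. Assumes double-quoted url="..." attributes.
extract_svg_urls() {
    grep -o 'enclosure url="[^"]*\.svg"' | sed 's/^enclosure url="//; s/"$//'
}

# Hypothetical helper: write the feed to stdout (-O -), extract the SVG
# links, and let a second wget read the URL list from stdin (-i -).
fetch_svgs() {
    wget -q -O - "$1" | extract_svg_urls | wget -i -
}
```

With those two functions defined in the current shell, something like fetch_svgs 'http://openclipart.org/api/search/?query=water' should download the files into the current directory. Quote the URL so the shell leaves the ? alone.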

--- rod.

silviolorusso · 10-29-2011, 07:27 AM

Great, thanks!