wget with regular expressions.
Hi,
I would like to download some HTML files using wget. The files I would like to download are: page1.html page2.html page3.html page4.html page5.html. I was expecting to download them with the command: wget http://localhost/page[1-5].html but this doesn't work. Does anyone know a way of using regular expressions with wget for this case? Thanks, |
In general you cannot use wildcards with wget over HTTP, because HTTP servers do not provide a way of getting a list of files.
Wildcards are supported for FTP (though you need to quote the URL, otherwise the shell will try to expand the wildcard characters before wget sees them). Some wget options also accept wildcards, such as the accept and reject lists (-A and -R), but these only help with a recursive wget, e.g. if there is a parent or index page with links to all the pages that interest you. For example: Code:
wget -r -A 'page*.html' www.kidsolr.com |
Hi pedrosacosta,
try this: Code:
for i in 1 2 3 4 5; do wget "http://localhost/page$i.html"; done
Markus |
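If the range is longer than a handful of pages, the same loop can count instead of listing every number. This is only a sketch: the host and page names are the ones from the question, `base`, `first` and `last` are placeholder variables made up for the example, and the `echo` makes it a dry run that prints the wget commands rather than running them.

```shell
#!/bin/sh
# Plain POSIX sh, so it does not depend on bash-only features.
base="http://localhost"   # host from the question (assumed)
first=1                   # placeholder: first page number
last=5                    # placeholder: last page number

# Collect the URLs by counting from first to last.
urls=""
i=$first
while [ "$i" -le "$last" ]; do
    urls="$urls $base/page$i.html"
    i=$((i + 1))
done

# Dry run: print one wget command per URL.
# Remove "echo" to actually download the files.
for u in $urls; do
    echo wget "$u"
done
```

Since the URLs contain no spaces, leaving `$urls` unquoted in the `for` line is what splits the list into one URL per iteration.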
Try bash brace expansion or use curl instead of wget.
curl has an inbuilt ability to do this sort of thing. Code:
wget http://localhost/page{1..5}.html # bash brace expansion |
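For the curl route mentioned above, curl expands a bracket range such as [1-5] itself, so the syntax from the original question works there as long as the URL is quoted to keep the shell away from the brackets. A hedged sketch, assuming the same host and file names as in the question; `DRY_RUN` is a switch invented for this example so that nothing is downloaded unless you ask for it.

```shell
#!/bin/sh
# Single quotes stop the shell from interpreting the brackets;
# curl then expands [1-5] into page1.html ... page5.html on its own.
url='http://localhost/page[1-5].html'   # host and names from the question (assumed)

# DRY_RUN is a made-up variable for this sketch; it defaults to 1 (print only).
if [ "${DRY_RUN:-1}" -eq 1 ]; then
    # Show the command that would be run.
    cmd="curl -O '$url'"
    echo "$cmd"
else
    # -O saves each file under its remote name.
    curl -O "$url"
fi
```

Run it with DRY_RUN=0 in the environment to perform the download. The difference from the brace expansion version is where the expansion happens: bash rewrites {1..5} into five arguments before wget starts, whereas here the program receives one URL and generates the five requests itself.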