Linux - Newbie This Linux forum is for members who are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!


Old 11-24-2010, 06:11 PM   #1
Registered: Sep 2006
Posts: 374

Rep: Reputation: 16
wget with regular expressions.


I would like to download some HTML files using wget.
The files I would like to download are page1.html through page5.html.

I was expecting to download these files using the command:
wget http://localhost/page[1-5].html

but this option doesn't work.

Does anyone know a way of using regular expressions with wget for this case?

Old 11-24-2010, 06:26 PM   #2
Senior Member
Registered: Jan 2005
Location: Melbourne, Australia
Distribution: Debian Buster (Fluxbox WM)
Posts: 1,390
Blog Entries: 52

Rep: Reputation: 359
In general you cannot use wildcards with wget, because HTTP servers do not provide a way of getting a list of files.

Wildcards are supported for FTP (though you would need to quote your URL, otherwise the shell will attempt to expand the wildcard characters before wget sees them).

There are some specific arguments to wget that do support wildcards (such as the accept and reject lists), but these only help if you are doing a recursive wget (e.g., if there is a parent or index page with links to all the pages that interest you), for example:
wget -r -A 'page*.html' http://localhost/
(though it will have to recurse through all the files in order to find the ones named 'page*.html', which can waste bandwidth)

Last edited by neonsignal; 11-24-2010 at 06:36 PM.
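To illustrate the quoting point above: the shell expands an unquoted glob against local filenames before wget ever runs. A minimal demonstration in a throwaway directory, with echo standing in for wget:

```shell
# Work in a scratch directory so the glob has something to match locally
cd "$(mktemp -d)"
touch page1.html page3.html

# Unquoted: the shell replaces the pattern with matching local file names
echo page*.html      # prints: page1.html page3.html

# Quoted: the pattern reaches the command untouched, as wget needs for FTP
echo 'page*.html'    # prints: page*.html
```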
Old 11-24-2010, 06:31 PM   #3
Senior Member
Registered: Apr 2007
Location: Germany
Distribution: Slackware
Posts: 3,979

Rep: Reputation: Disabled
Hi pedrosacosta,

for i in 1 2 3 4 5; do wget "http://localhost/page$i.html"; done
This will work for you.
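The same loop can be written with seq, which is more convenient for larger ranges. Sketch below with echo in place of wget so it runs offline; the range 1 to 5 matches the question:

```shell
# Generate the five URLs; replace echo with wget to actually download
for i in $(seq 1 5); do
    echo "http://localhost/page$i.html"
done
```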

Old 11-24-2010, 06:52 PM   #4
Registered: Mar 2008
Location: N. W. England
Distribution: Mandriva
Posts: 356

Rep: Reputation: 166
Try bash brace expansion, or use curl instead of wget.
curl has a built-in ability to do this sort of thing.
wget http://localhost/page{1..5}.html      # bash brace expansion

curl -o 'page#1.html' 'http://localhost/page[1-5].html'
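Note that {1..5} is bash/zsh brace expansion, not a wget feature: the shell rewrites the word into five separate URLs before wget even starts. You can see exactly what wget would receive by substituting echo:

```shell
# bash expands the braces first, so this prints five space-separated URLs
bash -c 'echo http://localhost/page{1..5}.html'
```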

Last edited by Kenhelm; 11-24-2010 at 07:24 PM. Reason: Added "-o 'page#1.html' " to create output file names
1 member found this post helpful.

