#1 - 11-01-2005, 03:45 PM - dehuszar (Member)
using curl to DL files from HTTP sites with wildcard


I've been trying to come up with a good way of pushing virus updates to other machines on my network. We have Linux servers and Windows workstations, and they only talk to each other via an intermediary server. The Windows workstations are otherwise not connected to the Internet.

My goal is to say "wget http://my.antiviruscorp.com/updates/tables*.zip"

I know wget won't support wildcards on HTTP and my AV corp doesn't have an FTP site.

I've been told that curl/libcurl could do what I want, but I'm having the damnedest time figuring out how the hell to get it to grab the file I want. I've tried curling the direct link, but that downloads it and dumps the binary to my screen instead of saving it to a file.
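(For reference, curl prints the response body to stdout unless told otherwise; the -O flag saves it under the remote file name. A quick sketch, using a hypothetical file name on the URL above:)

Code:
# curl writes to stdout by default; -O saves under the remote file name
# (the exact file name here is hypothetical)
curl -O http://my.antiviruscorp.com/updates/tables123.zip
# -o picks an explicit local name instead
curl -o latest.zip http://my.antiviruscorp.com/updates/tables123.zip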

Can anyone give me a heads-up on where to begin? The man pages don't seem to illustrate this particular use, nor do the FAQs on the main curl site.

Otherwise, is there a way to grep a webpage so I can find out the full name of the tables file I need to DL, and then just push that name onto the end of the wget script as a variable? That would actually be my preferred route, because then I can hack the script to only DL exactly what I want.
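A minimal sketch of that grep-the-page idea, assuming the updates directory serves an index page that mentions the file names (the URL and the 'tables' prefix come from the example above; the page layout is an assumption):

Code:
# fetch the index page, pull out the newest tables*.zip name, then fetch it;
# assumes the numeric suffix is zero-padded so the lexically last name is newest
base='http://my.antiviruscorp.com/updates/'
file=$(curl -s "$base" | grep -o 'tables[0-9]*\.zip' | sort -u | tail -n 1)
[ -n "$file" ] && wget "$base$file"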

Thanks in advance,
Sam
 
#2 - 11-01-2005, 05:02 PM - dd12 (LQ Newbie)
Sam:

Do they have a file named 'current' that is a link to the current file to download?

You can do:

Code:
wget -O - http://cool.list.com/ | wget --force-html -i -
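The first wget dumps the page to stdout, and the second parses it as HTML and fetches every link it finds. If the page uses relative links, the second wget also needs --base to resolve them (a sketch; cool.list.com is the placeholder from above):

Code:
# --force-html treats stdin as HTML; --base resolves relative hrefs
wget -qO - http://cool.list.com/ | wget --force-html --base=http://cool.list.com/ -i -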
 
#3 - 11-04-2005, 01:13 PM - dehuszar (Member, Original Poster)
Sadly, no. It has a three-letter prefix which stays constant, and then a numeric suffix which changes as they release new tables. I want to be able to write a script which downloads any file beginning with xxx*.zip. I can't seem to make your command download any of the files, just the HTML content. Is there something else that needs to be there? Some other variable besides the web URL?

Thanks in advance,
Sam

 
#4 - 11-04-2005, 01:48 PM - dd12 (LQ Newbie)
This should do it, although it will download all files of type zip.

You want to download all the zips from a directory on an HTTP server. You tried wget http://www.server.com/dir/*.zip, but that didn't work because HTTP retrieval does not support globbing. In that case, use:

Code:
wget -r -l1 --no-parent -A.zip http://www.server.com/dir/

More verbose, but the effect is the same. -r -l1 means to retrieve recursively, with a maximum depth of 1. --no-parent means that references to the parent directory are ignored, and -A.zip means to download only the zip files. -A "*.zip" would have worked too.

The best solution would be:

Code:
wget -r -l1 --no-parent -A "XXX*.zip" http://www.server.com/dir/

The results will end up in a local directory on your box.
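If you'd rather not have wget recreate the server's directory tree locally, -nd flattens the output, and -N skips files you already have (a variant of the command above, same placeholder path):

Code:
# -nd: don't mirror the www.server.com/dir/ hierarchy locally
# -N:  timestamping, only re-fetch files newer than the local copy
wget -r -l1 -nd -N --no-parent -A "XXX*.zip" http://www.server.com/dir/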
 
#5 - 11-04-2005, 02:30 PM - dehuszar (Member, Original Poster)
Tried your snippet; here's what I get:

Code:
wget -r -l1 --no-parent -A.zip http://www.MyAVCorp.com/ftp/products/pattern/
--14:15:51--  http://www.MyAVCorp.com/ftp/products/pattern/
           => `www.MyAVCorp.com/ftp/products/pattern/index.html'
Resolving www.MyAVCorp.com... 68.22.73.177, 68.22.73.201
Connecting to www.MyAVCorp.com|68.22.73.177|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
14:15:51 ERROR 404: Not Found.

Removing www.MyAVCorp.com/ftp/products/pattern/index.html since it should be rejected.
unlink: No such file or directory

FINISHED --14:15:51--
Downloaded: 0 bytes in 0 files
Now if I manually go to the directory with the update zip (/ftp/products/pattern/), it appears to be empty, yet when I fill in the explicit zip filename, it downloads. There is no index.html; it appears to be an empty folder. Despite the /ftp/ in the address, I cannot seem to get it to talk to the site using the FTP protocol, and their support staff insists that there is no FTP site.
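When the server won't produce a directory listing, recursive wget has nothing to crawl, so the fallback is to probe candidate names directly. A rough sketch, assuming the three-letter-prefix-plus-number pattern described earlier (the 'lpt' prefix and the suffix range are guesses):

Code:
# probe candidate file names; curl -f exits non-zero on a 404 instead of
# saving the error page, and -O keeps the remote name on a hit
for n in $(seq 1 999); do
    curl -sfO "http://www.MyAVCorp.com/ftp/products/pattern/lpt$n.zip" && break
done

Counting down from a high suffix instead would find the newest table first.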

I'd appreciate any further insight.

Thanks in advance,
Sam
 
#6 - 11-07-2005, 10:32 AM - dd12 (LQ Newbie)
Sam, I can't resolve the address you provided, so I can't debug this. I tried the IPs listed and they don't work either. If you modified the IPs or the name, you can contact me directly. At this point, I don't know what else to try. If the site requires you to log in, then you may need to enter that info into your wget options.
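For HTTP authentication, that would look something like this (the credentials and URL are placeholders):

Code:
# send HTTP basic-auth credentials along with the request
wget --http-user=myname --http-password=mypass http://www.MyAVCorp.com/ftp/products/pattern/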
 
#7 - 08-04-2006, 04:36 PM - dehuszar (Member, Original Poster)
I was doing a little work on the system and needed to refer back to this thread, and realized I never posted the working solution. This will manually DL antivirus updates from Trend Micro.

Code:
for file in $(lynx -dump 'http://www.trendmicro.com/download/product.asp?productid=32' \
              | grep 'lpt' | awk '{print $2}'); do
    wget "$file"
done
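Since the thread started out asking about curl, the same thing without lynx might look like this (a sketch; it assumes the page contains absolute URLs with 'lpt' in them, which is what the lynx version matches on):

Code:
# pull absolute URLs containing 'lpt' out of the page and fetch each one
curl -s 'http://www.trendmicro.com/download/product.asp?productid=32' \
  | grep -oE 'http[^"<> ]*lpt[^"<> ]*' \
  | sort -u \
  | while read -r url; do
        curl -sfO "$url"
    done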
Hope it is helpful to someone else.

Sam
 
#8 - 09-03-2009, 04:52 AM - massoo (LQ Newbie)
Works great

Works great, thanks dehuszar.

shann
 
  

