Old 05-22-2009, 03:19 PM   #1
powah
Member
 
Registered: Mar 2005
Distribution: FC, Gentoo
Posts: 276

Rep: Reputation: 30
download all the files of Version A3(1.0) on the web page


How do I write a script to download all of the files listed under Version A3(1.0) on the web page
ftp://ftp-sj.cisco.com/pub/mibs/supp...portlist.html?
 
Old 05-24-2009, 01:36 AM   #2
Alien_Hominid
Senior Member
 
Registered: Oct 2005
Location: Lithuania
Distribution: Hybrid
Posts: 2,247

Rep: Reputation: 53
Code:
lynx -dump "ftp://ftp-sj.cisco.com/pub/mibs/supportlists/ace-appliance/ace-appliance-supportlist.html?" | grep -o "*.my" >file.txt
That should get you all the file names; then you can fetch each one with wget.
 
Old 05-24-2009, 03:07 AM   #3
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
If you have Python:
Code:
#!/usr/bin/env python
import urllib2
url="ftp://ftp-sj.cisco.com/pub/mibs/supportlists/ace-appliance/ace-appliance-supportlist.html"
page=urllib2.urlopen(url)
f=0
links=[]
for item in data:
    if "</table>" in item: f=0
    if "Version" in item and "A3" in item and "1.0" in item: f=1
    if f and "href" in item:        
        item=item.replace('href="',"").strip()
        ind=item.index('">')
        links.append(item[:ind]) #grab all ftp links 
# download all links
for link in links:
    filename=link.split("/")[-1]
    print "downloading ... " + filename
    u=urllib2.urlopen(link)
    p=u.read()
    open(filename,"w").write(p)
 
Old 05-25-2009, 09:54 AM   #4
powah
Member
 
Registered: Mar 2005
Distribution: FC, Gentoo
Posts: 276

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by Alien_Hominid
Code:
lynx -dump "ftp://ftp-sj.cisco.com/pub/mibs/supportlists/ace-appliance/ace-appliance-supportlist.html?" | grep -o "*.my" >file.txt
That should get you all the file names; then you can fetch each one with wget.

grep -o "*.my"
creates an empty file.txt, so I did this instead:
lynx -dump "ftp://ftp-sj.cisco.com/pub/mibs/supportlists/ace-appliance/ace-appliance-supportlist.html?" | grep ".my" >file.txt

file.txt then contains:
href="ftp://ftp.cisco.com/pub/mibs/v2/CISCO-AAA-SERVER-EXT-MIB.my">CISCO-
class=SpellE>MIB.my</SPAN><BR></A><A
href="ftp://ftp.cisco.com/pub/mibs/v2/CISCO-AAA-SERVER-MIB.my">CISCO-AAA-
class=SpellE>MIB.my</SPAN></A><BR><A
href="ftp://ftp.cisco.com/pub/mibs/v2/CISCO-ENHANCED-SLB-MIB.my">CISCO-EN
class=SpellE>MIB.my</SPAN></A><BR><A
href="ftp://ftp.cisco.com/pub/mibs/v2/CISCO-ENTITY-VENDORTYPE-OID-MIB.my"
class=SpellE>MIB.my</SPAN></A><BR><A
href="ftp://ftp.cisco.com/pub/mibs/v2/CISCO-IF-EXTENSION-MIB.my">CISCO-IF
class=SpellE>MIB.my</SPAN></A><BR><A
href="ftp://ftp.cisco.com/pub/mibs/v2/CISCO-IP-PROTOCOL-FILTER-MIB.my">CI
class=SpellE>MIB.my</SPAN></A><BR><A
...
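
Since those grep lines keep HTML markup around each URL, pulling the links out with a regular expression leaves only the URL itself. A minimal Python 2 sketch of that idea (using re with the same urllib2 fetch as the Python script above; the pattern assumes each link runs from ftp:// to the first .my inside a quoted href):

Code:
#!/usr/bin/env python
# sketch: extract clean ftp:// links to .my files with a regex
import re, urllib2

url="ftp://ftp-sj.cisco.com/pub/mibs/supportlists/ace-appliance/ace-appliance-supportlist.html"
data=urllib2.urlopen(url).read()
# match from ftp:// up to the first .my, without crossing a closing quote
links=re.findall(r'ftp://[^"]*?\.my', data)
for link in links:
    print link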
 
Old 05-25-2009, 10:03 AM   #5
powah
Member
 
Registered: Mar 2005
Distribution: FC, Gentoo
Posts: 276

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by ghostdog74
If you have Python:
Code:
#!/usr/bin/env python
import urllib2
url="ftp://ftp-sj.cisco.com/pub/mibs/supportlists/ace-appliance/ace-appliance-supportlist.html"
page=urllib2.urlopen(url)
f=0
links=[]
for item in data:
    if "</table>" in item: f=0
    if "Version" in item and "A3" in item and "1.0" in item: f=1
    if f and "href" in item:        
        item=item.replace('href="',"").strip()
        ind=item.index('">')
        links.append(item[:ind]) #grab all ftp links 
# download all links
for link in links:
    filename=link.split("/")[-1]
    print "downloading ... " + filename
    u=urllib2.urlopen(link)
    p=u.read()
    open(filename,"w").write(p)
On my FC6 Linux computer:
$ python
Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
[GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

I put your script in a file and ran it:
$ ~/python/downloadFile.py
Traceback (most recent call last):
  File "/home/powah/python/downloadFile.py", line 7, in ?
    for item in data:
NameError: name 'data' is not defined
 
Old 05-25-2009, 07:13 PM   #6
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by powah
On my FC6 Linux computer:
$ python
Python 2.4.4 (#1, Oct 23 2006, 13:58:00)
[GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

I put your script in a file and ran it:
$ ~/python/downloadFile.py
Traceback (most recent call last):
  File "/home/powah/python/downloadFile.py", line 7, in ?
    for item in data:
NameError: name 'data' is not defined
Code:
.....
links=[]
data=page.read().split("\n")   # <<---- insert this line
for item in data:
........
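
For reference, here is the full script from above with that line in place (Python 2, otherwise unchanged; the flag f limits the harvest to the Version A3(1.0) table):

Code:
#!/usr/bin/env python
import urllib2
url="ftp://ftp-sj.cisco.com/pub/mibs/supportlists/ace-appliance/ace-appliance-supportlist.html"
page=urllib2.urlopen(url)
f=0                            # flag: 1 while inside the Version A3(1.0) table
links=[]
data=page.read().split("\n")   # the missing line: read the page into a list of lines
for item in data:
    if "</table>" in item: f=0
    if "Version" in item and "A3" in item and "1.0" in item: f=1
    if f and "href" in item:
        item=item.replace('href="',"").strip()
        ind=item.index('">')
        links.append(item[:ind]) #grab all ftp links
# download all links
for link in links:
    filename=link.split("/")[-1]
    print "downloading ... " + filename
    u=urllib2.urlopen(link)
    p=u.read()
    open(filename,"w").write(p)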
 
Old 05-25-2009, 10:49 PM   #7
powah
Member
 
Registered: Mar 2005
Distribution: FC, Gentoo
Posts: 276

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by ghostdog74
Code:
.....
links=[]
data=page.read().split("\n")   # <<---- insert this line
for item in data:
........
It works.
Thanks!
 
Old 05-25-2009, 10:55 PM   #8
powah
Member
 
Registered: Mar 2005
Distribution: FC, Gentoo
Posts: 276

Original Poster
Rep: Reputation: 30
download all files from the web page

I want to download all the files from the web page
ftp://ftp-sj.cisco.com/pub/mibs/supp...pportlist.html.

I modified the script:
#!/usr/bin/env python
import urllib2
url="ftp://ftp-sj.cisco.com/pub/mibs/supportlists/vpn3000/vpn3000-supportlist.html"
page=urllib2.urlopen(url)
f=0
links=[]
data=page.read().split("\n")
for item in data:
if "href" in item:
item=item.replace('href="',"").strip()
ind=item.index('">')
links.append(item[:ind]) #grab all ftp links
# download all links
for link in links:
filename=link.split("/")[-1]
print "downloading ... " + filename
u=urllib2.urlopen(link)
p=u.read()
open(filename,"w").write(p)

Running the script gives the following error. Please help. Thanks.
$ ~/python/downloadFile2.py
downloading ... v2
downloading ... ADMIN-AUTH-STATS-MIB.my
downloading ... ALTIGA-ADDRESS-STATS-MIB.my
downloading ... ALTIGA-BMGT-STATS-MIB.my
downloading ... ALTIGA-CAP.my
Traceback (most recent call last):
  File "/home/powah/python/downloadFile2.py", line 20, in ?
    u=urllib2.urlopen(link)
  File "/usr/lib/python2.4/urllib2.py", line 130, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.4/urllib2.py", line 358, in open
    response = self._open(req, data)
  File "/usr/lib/python2.4/urllib2.py", line 381, in _open
    'unknown_open', req)
  File "/usr/lib/python2.4/urllib2.py", line 337, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.4/urllib2.py", line 1053, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib2.URLError: <urlopen error unknown url type: <dd><a ftp>

Last edited by powah; 05-25-2009 at 10:59 PM.
 
Old 05-25-2009, 11:59 PM   #9
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by powah
I want to download all the files from the web page
ftp://ftp-sj.cisco.com/pub/mibs/supp...pportlist.html.

I modified the script:
Code:
#!/usr/bin/env python
import urllib2
url="ftp://ftp-sj.cisco.com/pub/mibs/supportlists/vpn3000/vpn3000-supportlist.html"
page=urllib2.urlopen(url)
f=0
links=[]
data=page.read().split("\n")
for item in data:
    if "href" in item:     
        item=item.replace('href="',"").strip()
        ind=item.index('">')
        links.append(item[:ind]) #grab all ftp links 
# download all links
for link in links:
    filename=link.split("/")[-1]
    print "downloading ... " + filename
    u=urllib2.urlopen(link)
    p=u.read()
    open(filename,"w").write(p)
Put your code in [code] tags next time.

Code:
import urllib2,os,urlparse
url="ftp://ftp-sj.cisco.com/pub/mibs/supportlists/vpn3000/vpn3000-supportlist.html"
page=urllib2.urlopen(url)
f=0
links=[]
data=page.read().split("\n")
for item in data:
    if "href" in item:        
        ftpind=item.index("ftp://")
        item=item[ftpind:]
        ind=item.index('">')
        links.append(item[:ind]) #grab all links 
# download all links
for link in links:
    filename=link.split("/")[-1]
    print "downloading ... " + filename
    u=urllib2.urlopen(link)
    p=u.read()
    open(filename,"w").write(p)
To troubleshoot your code, add print statements. Also, please read up on Python if you want to use it; see my sig.
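
The extraction above is anchored at "ftp://" because lines like <dd><a href="ftp://... kept their leading markup after replace('href="',""), which is why urllib2 reported unknown url type: <dd><a ftp. One way to apply the print-statement advice is a debug variant of the extraction loop, a sketch that assumes data and links are defined as in the script above:

Code:
for item in data:
    if "href" in item and "ftp://" in item:   # guard against href lines with no ftp link
        print "raw line: %r" % item           # see exactly what is being parsed
        ftpind=item.index("ftp://")           # skip any markup before the URL
        link=item[ftpind:item.index('">')]    # stop at the closing quote
        print "extracted: %r" % link          # confirm the scheme really is ftp://
        links.append(link)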
 